How to remove __pycache__ from a repository

Delete Python’s __pycache__ directories

First things first: I don't actually want to delete them. I want to keep them out of my workflow for good. We will get to that later.

This is a comprehensive guide to dealing with Python’s __pycache__ directories and .pyc files.

__pycache__ is a directory that contains compiled Python 3 bytecode, ready to be executed.

From the Modules chapter of the official Python tutorial:

To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.6 the compiled version of spam.py would be cached as __pycache__/spam.cpython-36.pyc.

When a module is imported for the first time (or when the source file has changed since the current compiled file was created) a .pyc file containing the compiled code should be created in a __pycache__ subdirectory of the directory containing the .py file. The .pyc file will have a filename that starts with the same name as the .py file, and ends with .pyc, with a middle component that depends on the particular python binary that created it.

So they serve a purpose and usually should not be deleted.

Solution — Default

There are several parts to “removing” __pycache__ directories and .pyc files.

First and foremost, we need to ignore them in our project’s Git tree by adding them to our .gitignore file(s), locally and/or globally.
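A minimal sketch of both variants, assuming a POSIX shell. The function names and the file name ~/.gitignore_global are hypothetical choices for illustration; the patterns are the ones commonly used for Python bytecode.

```shell
# Append ignore rules for Python bytecode to the .gitignore in the current directory.
ignore_python_bytecode() {
    printf '%s\n' '__pycache__/' '*.py[cod]' >> .gitignore
}

# Same rules for every repository of the current user.
# ~/.gitignore_global is an arbitrary file name, not a Git default.
ignore_python_bytecode_globally() {
    printf '%s\n' '__pycache__/' '*.py[cod]' >> "$HOME/.gitignore_global"
    git config --global core.excludesFile "$HOME/.gitignore_global"
}
```

Run ignore_python_bytecode once in the project root. Ignoring only affects untracked files, so anything already committed has to be untracked separately.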

Editor

However, this is not enough: even after adding them to our .gitignore file(s), we still see them in our IDEs and file explorers. To prevent that, I use the configurations below for each editor.

VSCode

Neovim

Vim

Docker

There is an environment variable, PYTHONDONTWRITEBYTECODE, that disables writing .pyc files.

PYTHONDONTWRITEBYTECODE

If this is set to a non-empty string, Python won’t try to write .pyc files on the import of source modules. This is equivalent to specifying the -B option.

Setting this variable is usually not the best choice outside containers. In a container you typically run a single Python process that does not spawn other Python processes during its lifetime, so there is no disadvantage in doing that.
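A minimal sketch of setting the variable, assuming a POSIX shell. The value 1 is an arbitrary choice, since any non-empty string has the same effect; inside an image the same variable would typically be set with an ENV instruction, shown here only as a comment.

```shell
# Any non-empty value disables writing .pyc files; 1 is just a convention.
export PYTHONDONTWRITEBYTECODE=1

# Dockerfile equivalent (not shell, shown for reference only):
#   ENV PYTHONDONTWRITEBYTECODE=1
```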

Delete

Even after all of the above, there may still be cases where we need to delete all __pycache__ directories. Here is the shell command to remove them:

In the root of your project run:

Before running the delete command above, test it without the xargs rm -rf part (the part that removes). This simply lists the matches:
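A minimal sketch of such a list-then-delete pair, assuming a find, grep and xargs pipeline as the text suggests. The function names are hypothetical, and the exact pattern is an assumption rather than the author's original command.

```shell
# List every __pycache__ directory and every .pyc/.pyo file below the current directory.
list_pycache() {
    find . | grep -E '(/__pycache__$|\.pyc$|\.pyo$)'
}

# Delete everything that list_pycache reports.
# Caveat: xargs splits on whitespace, so paths containing spaces are not handled.
remove_pycache() {
    list_pycache | xargs rm -rf
}
```

Running list_pycache first and reading its output is the dry run; remove_pycache appends the xargs rm -rf step to the same listing.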

You can add this command as an alias for easy access:

You may also want to add it as a make target in your project’s Makefile for CI/CD pipelines.

Lastly, there is a Python library just for this purpose: pyclean. In my opinion it’s overkill, but you may take a look.

Solution — New

This is my current solution to the issue.

Starting with Python 3.8, you can use the PYTHONPYCACHEPREFIX environment variable to set a global cache directory so that they are not created under your projects:

With this environment variable set, Python won’t create any __pycache__ directories in your projects; instead it will put all of them under

PYTHONPYCACHEPREFIX

If this is set, Python will write .pyc files in a mirror directory tree at this path, instead of in __pycache__ directories within the source tree. This is equivalent to specifying the -X pycache_prefix=PATH option.

New in version 3.8.

You can add this environment variable to one of your shell configuration files: .zshrc , .zshenv , .bashrc , .bash_profile , .profile , .zprofile etc.
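A minimal sketch of the line to put in one of those files. The directory $HOME/.cache/cpython is an assumption chosen for illustration; any path you can write to should do.

```shell
# Hypothetical cache location; pick any directory you can write to.
export PYTHONPYCACHEPREFIX="$HOME/.cache/cpython"
```

In a new shell, python3 -c 'import sys; print(sys.pycache_prefix)' should then print the configured path on Python 3.8 or later.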

Removing __pycache__ from git repository

How can I remove all __pycache__ subdirectories in my repository using .gitignore ?

You cannot remove files from existing commits: those commits are frozen for all time. You can make sure you do not add new files to future commits, though. Simply remove the files now, with git rm -r --cached __pycache__ , and list __pycache__ or __pycache__/ in your .gitignore (creating this .gitignore file if needed). Do this for each __pycache__ directory; use your OS’s facilities to find these (e.g., find . -name __pycache__ -type d ). Then git add .gitignore and git commit to commit the removal.

Note that any time anyone moves from any commit that has the files (where they’ll be checked out) to a commit that lacks the files, they will have their entire __pycache__ directory removed if Git is able to do that; at the least, any cached files that were committed and can be removed will be removed. So the --cached in the git rm -r --cached above only speeds things up for you by avoiding the removal of the cached compiled files this time. Others will have to rebuild their cache.
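The per-directory removal described above can be combined with find in one step. This is a minimal sketch, assuming it is run from the repository root; the function name is hypothetical.

```shell
# Stop tracking every __pycache__ directory below the current directory.
# --cached leaves the files on disk; --ignore-unmatch skips directories
# that Git never tracked instead of failing on them.
untrack_pycache() {
    find . -type d -name __pycache__ \
        -exec git rm -r -q --cached --ignore-unmatch -- {} +
}
```

After running it, add __pycache__/ to .gitignore, then git add .gitignore and git commit as described.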

To make a new and different repository in which the __pycache__ files were never accidentally committed in the first place, use git filter-branch (which is now deprecated) or the newfangled git filter-repo (which is not yet distributed with Git). Or, see any of these various existing questions and their answers, which you should already have found before you asked this:

Python3 project remove __pycache__ folders and .pyc files

What is the BEST way to clear out all the __pycache__ folders and .pyc/.pyo files from a python3 project? I have seen multiple users suggest the pyclean script bundled with Debian, but this does not remove the folders. I want a simple way to clean up the project before pushing the files to my DVS.

Solution 1: [1]

I found the answer myself when I mistyped pyclean as pycclean:

Running py3clean . cleaned it up very nicely.

Solution 2: [2]

You can do it manually with the next command:

This will remove all .pyc and .pyo files as well as __pycache__ directories recursively starting from the current directory.

Solution 3: [3]

macOS & Linux

BSD’s find implementation on macOS differs from GNU find; this version is compatible with both. It starts with a globbing implementation, using -name and -o (for "or"). Put this function in your .bashrc file:
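A minimal sketch of such a function, matching the description rather than reproducing the answer's exact code. It assumes only options that both BSD and GNU find provide (-type, -name, -o, -delete).

```shell
# Remove .pyc/.pyo files and then the emptied __pycache__ directories.
# -delete implies depth-first traversal, so a directory's files are
# removed before find attempts to remove the directory itself.
pyclean() {
    find . -type f -name '*.py[co]' -delete -o -type d -name __pycache__ -delete
}
```

A __pycache__ directory that still contains other files is left in place, because -delete only removes empty directories.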

Then cd to the directory you want to recursively clean, and type pyclean .

GNU find-only

This is a GNU find-only (i.e. Linux) solution, but I feel it’s a little nicer with the regex:

Any platform, using Python 3

On Windows, you probably don’t even have find . You do, however, probably have Python 3, which starting in 3.4 has the convenient pathlib module:
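A minimal sketch of the pathlib approach, assuming the interpreter is available as python3 (on Windows it may be python or py instead). The wrapper function and its name are hypothetical and only there for reuse; the two python3 lines can also be run on their own.

```shell
pyclean_portable() {
    # Delete the compiled files first, then the cache directories they lived in.
    # list(...) collects all matches before anything is removed.
    python3 -Bc "import pathlib; [p.unlink() for p in list(pathlib.Path('.').rglob('*.py[co]'))]"
    python3 -Bc "import pathlib; [p.rmdir() for p in list(pathlib.Path('.').rglob('__pycache__'))]"
}
```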

The -B flag tells Python not to write .pyc files. (See also the PYTHONDONTWRITEBYTECODE environment variable.)

The above abuses list comprehensions for looping, but when using python -c , style is rather a secondary concern. Alternatively we could abuse (for example) __import__ :

Critique of an answer

The top answer used to say:

This would seem to be less efficient because it uses three processes. find takes a regular expression, so we don’t need a separate invocation of grep . Similarly, it has -delete , so we don’t need a separate invocation of rm —and contrary to a comment here, it will delete non-empty directories so long as they get emptied by virtue of the regular expression match.

From the xargs man page:

Find files named core in or below the directory /tmp and delete them, but more efficiently than in the previous example (because we avoid the need to use fork(2) and exec(2) to launch rm and we don’t need the extra xargs process).

Solution 4: [4]

Since this is a Python 3 project, you only need to delete __pycache__ directories — all .pyc / .pyo files are inside them.

or its simpler form,

which didn’t work for me for some reason (files were deleted but directories weren’t), so I’m including both for the sake of completeness.

Alternatively, if you’re doing this in a directory that’s under revision control, you can tell the RCS to ignore __pycache__ folders recursively. Then, at the required moment, just clean up all the ignored files. This will likely be more convenient because there’ll probably be more to clean up than just __pycache__ .
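A minimal sketch of that workflow, assuming the revision control system is Git and that __pycache__/ is already listed in .gitignore. The function names are hypothetical.

```shell
# Preview which ignored files and directories would be deleted (-n: dry run).
preview_ignored() {
    git clean -X -d -n
}

# Delete them for real (-X: ignored files only, -d: include directories, -f: required to delete).
clean_ignored() {
    git clean -X -d -f -q
}
```

Because -X touches only ignored paths, untracked files that are not ignored are left alone.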

Solution 5: [5]

If you need a permanent solution for keeping Python cache files out of your project directories:

Starting with Python 3.8 you can use the environment variable PYTHONPYCACHEPREFIX to define a cache directory for Python.

From the Python docs:

If this is set, Python will write .pyc files in a mirror directory tree at this path, instead of in __pycache__ directories within the source tree. This is equivalent to specifying the -X pycache_prefix=PATH option.

Example

If you add the following line to your ~/.profile on Linux:

Python won’t create the annoying __pycache__ directories in your project directory, instead it will put all of them under

Solution 6: [6]

The command I’ve used:

find . -type d -name "__pycache__" -exec rm -r {} +

First, this finds all __pycache__ folders under the current directory.

Then it executes rm -r {} + to delete the folders found in the previous step ( {} is a placeholder for the found paths, and + ends the command).

Edited 1:

I’m using Linux, to reuse the command I’ve added the line below to the

Edited 2: If you’re using VS Code, you don’t need to remove __pycache__ manually. You can add the snippet below to settings.json file. After that, VS Code will hide all __pycache__ folders for you

Solution 7: [7]

This is my alias; it works with both Python 2 and Python 3 and recursively removes all .pyc and .pyo files as well as __pycache__ directories.

Solution 8: [8]

I’m running Python3 and pip3 on a Mac. For me, the solution was as follows (In the root directory of my project):

I’d like to emphasize, since I see many answers which involve bash scripting, it is best practice in software to favour tested solutions to problems (which is exactly what an established python package represents) over a hand-rolled approach.

Solution 9: [9]

Using PyCharm

To remove Python compiled files

In the Project Tool Window , right-click a project or directory, where Python compiled files should be deleted from.

On the context menu, choose Clean Python compiled files .

The .pyc files residing in the selected directory are silently deleted.

Solution 10: [10]

From the project directory type the following:

Deleting all .pyc files

find . -path "*/*.pyc" -delete

Deleting all .pyo files:

find . -path "*/*.pyo" -delete

Finally, to delete all ‘__pycache__’, type:

find . -path "*/__pycache__" -type d -exec rm -r {} ';'

If you encounter a permission denied error, add sudo at the beginning of each of the commands above.

Solution 11: [11]

There is a nice pip package.

Solution 12: [12]

Thanks a lot for the other answers, based on them this is what I used for my Debian package’s prerm file:

Solution 13: [13]

Empty the directories first and then remove them:

Solution 14: [14]

Why not just use rm -rf __pycache__ ? Run git add -A afterwards to remove them from your repository and add __pycache__/ to your .gitignore file.

Solution 15: [15]

As simple as it can be

This is what I use for my projects. You can give it a try.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
