By Andre Perunicic | August 24, 2017
This post explains how to automatically clear Jupyter or IPython notebook output cells every time you commit or switch branches in a particular git repo. Enabling this in a repository makes collaboration on Jupyter notebooks easier and keeps the repository’s size from ballooning due to embedded plots and data printouts!
-
Clone the script (into, say, ~/GitHub) from https://github.com/toobaz/ipynb_output_filter via
git clone https://github.com/toobaz/ipynb_output_filter.git
-
Make the script executable
chmod +x ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
-
Create a ~/.gitattributes file
touch ~/.gitattributes
and add the following filter attribute, which routes every *.ipynb file through the dropoutput_ipynb filter:
echo "*.ipynb filter=dropoutput_ipynb" >> ~/.gitattributes
-
Configure git to find your ~/.gitattributes file:
git config --global core.attributesfile ~/.gitattributes
-
Add the dropoutput_ipynb section (which triggers the script above) to your repo’s git config by pasting the following into path/to/repo/.git/config:
[filter "dropoutput_ipynb"]
clean = ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
smudge = cat
From then on, committing any *.ipynb file should lead to the output being stripped.
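To make concrete what a clean filter does, here is a minimal Python sketch. It is not the actual ipynb_output_filter.py, which may differ and handle more cases. Git pipes the notebook to the filter on standard input and stores whatever the filter writes to standard output. The names strip_output and run_filter and the sample notebook are hypothetical, and the sketch assumes the nbformat 4 layout (a top-level "cells" list).

```python
import io
import json


def strip_output(notebook):
    """Clear outputs and execution counts from code cells (nbformat 4 layout assumed)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def run_filter(in_stream, out_stream):
    """Read notebook JSON from in_stream and write the stripped notebook to out_stream."""
    notebook = json.load(in_stream)
    json.dump(strip_output(notebook), out_stream, indent=1, sort_keys=True)
    out_stream.write("\n")


# Hypothetical two-cell notebook, used only to exercise the sketch.
sample = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["42\n"]}],
            "source": ["print(42)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

# In-memory streams stand in for the stdin/stdout pipes git would provide.
result = io.StringIO()
run_filter(io.StringIO(json.dumps(sample)), result)
cleaned = json.loads(result.getvalue())
```

Used as a real filter, a script like this would call run_filter(sys.stdin, sys.stdout), and the clean entry in the git config would point at it. Because smudge is set to cat, checking a notebook out leaves it exactly as it was committed.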
If you found this post from Google, I hope this solved your problem. Don’t hesitate to get in touch with us if you need help setting up a data sourcing, analysis or reporting infrastructure.