By Andre Perunicic | August 24, 2017
This post explains how to automatically clear Jupyter or IPython notebook output cells every time you commit or switch branches in a particular git repo. Enabling this in a repository allows for easier collaboration on Jupyter notebooks and prevents the repository’s size from ballooning due to embedded plots and data printouts!
- Clone the script from https://github.com/toobaz/ipynb_output_filter (into, say, ~/GitHub) by running the following from that directory:
    git clone https://github.com/toobaz/ipynb_output_filter.git
- Make the script executable:
    chmod +x ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
- Create a ~/.gitattributes file:
    touch ~/.gitattributes
  and add the following filter rule to it:
    echo "*.ipynb filter=dropoutput_ipynb" >> ~/.gitattributes
- Configure git to find your ~/.gitattributes file:
    git config --global core.attributesfile ~/.gitattributes
- Add the dropoutput_ipynb section (which will trigger the script above) to your repo’s git config by pasting the following into path/to/repo/.git/config:
    [filter "dropoutput_ipynb"]
        clean = ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
        smudge = cat
From then on, staging or committing any *.ipynb file in that repo should strip the output from the version git stores.
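If the clean step feels like a black box, here is a minimal Python sketch of what a filter of this kind does. It is not the toobaz script itself, just an illustration that assumes the nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count" keys). The names strip_outputs and main are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a git clean filter for notebooks (illustrative, not the toobaz script)."""
import json
import sys


def strip_outputs(notebook):
    """Clear outputs and execution counters from every code cell.

    Assumes the nbformat 4 layout; other cell types are left untouched.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def main():
    # git feeds the working-tree file to the filter on stdin and stores
    # whatever the filter writes to stdout.
    raw = sys.stdin.read()
    if not raw.strip():
        # Empty input: pass it through unchanged.
        sys.stdout.write(raw)
        return
    notebook = json.loads(raw)
    json.dump(strip_outputs(notebook), sys.stdout, indent=1, ensure_ascii=False)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as strip_outputs.py and made executable, a script like this could stand in for the path on the clean line of the filter section. The smudge = cat line means nothing is done in the other direction, so files are checked out exactly as stored.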
If you found this post from Google, I hope this solved your problem. Don’t hesitate to get in touch with us if you need help setting up a data sourcing, analysis or reporting infrastructure.