By Andre Perunicic | August 24, 2017
This post explains how to automatically clear Jupyter or IPython notebook output cells every time you commit or switch branches in a particular git repo. Enabling this in a repository makes it easier to collaborate on Jupyter notebooks and keeps the repository’s size from ballooning due to embedded plots and data printouts!
Clone the script from https://github.com/toobaz/ipynb_output_filter (into, say, ~/GitHub) by running the following from inside that directory:
git clone https://github.com/toobaz/ipynb_output_filter.git
Make the script executable
chmod +x ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
and register the filter for every notebook by appending this rule to ~/.gitattributes:
echo "*.ipynb filter=dropoutput_ipynb" >> ~/.gitattributes
Configure git to find your ~/.gitattributes file:
git config --global core.attributesfile ~/.gitattributes
Add the dropoutput_ipynb section (that will trigger the script above) to your repo’s git config by pasting the following into the repository's .git/config file:
[filter "dropoutput_ipynb"]
    clean = ~/GitHub/ipynb_output_filter/ipynb_output_filter.py
    smudge = cat
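If you would rather not edit the config file by hand, the same filter section can be written with `git config`. The sketch below is only an illustration: it runs in a throwaway repository created with `mktemp`, and `FILTER_SCRIPT` is a hypothetical variable that assumes the script was cloned into ~/GitHub as described above.

```shell
# Throwaway repository so the example is safe to run anywhere (demo assumption).
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# Hypothetical variable: assumes the script was cloned into ~/GitHub.
FILTER_SCRIPT="$HOME/GitHub/ipynb_output_filter/ipynb_output_filter.py"

# Write the [filter "dropoutput_ipynb"] section into this repo's .git/config.
git config filter.dropoutput_ipynb.clean "$FILTER_SCRIPT"
git config filter.dropoutput_ipynb.smudge cat

# Print the resulting filter settings for inspection.
git config --get-regexp '^filter\.dropoutput_ipynb\.'
```

In a real project you would skip the `mktemp`, `git init` and `cd` lines and run the two `git config` commands from inside your own repository.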
From then on, committing any *.ipynb file should lead to the output being stripped.
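One way to check that git actually associates notebooks with the filter is `git check-attr`. The sketch below is a minimal illustration, not part of the original setup: it uses a throwaway repository and, as an assumption made for the demo, a repo-local `.gitattributes` in place of the global ~/.gitattributes so that it does not touch your real configuration. The file name `example.ipynb` is a placeholder and does not need to exist.

```shell
# Throwaway repository so the check does not touch real work (demo assumption).
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# Repo-local stand-in for the rule that the post appends to ~/.gitattributes.
echo "*.ipynb filter=dropoutput_ipynb" > .gitattributes

# Ask git which filter it would apply to a notebook path.
attr_report="$(git check-attr filter -- example.ipynb)"
echo "$attr_report"
```

If the rule is picked up, the report names `dropoutput_ipynb` as the filter for that path. In your own repository, running just the `git check-attr` line against one of your notebooks should show the same filter name once the global attributes file is configured.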
If you found this post from Google, I hope this solved your problem. Don’t hesitate to get in touch with us if you need help setting up a data sourcing, analysis or reporting infrastructure.