By Andre Perunicic | January 27, 2018
If you use Selenium for automated testing or web scraping, you may have discovered that there is no built-in utility for clearing browser resources like cookies, cached scripts, and objects in local storage. This is not particularly surprising given that the WebDriver specification that Selenium uses behind the scenes has no provision for clearing the cache. However, lingering cached resources can cause your tests to pass when they shouldn’t, prevent your scrapers from quickly starting clean sessions on demand, and cause all sorts of undesirable behavior besides. Fortunately, there’s still a way out! In this article I’ll describe how to clear the Firefox browser cache with Selenium. The code will be written in Python, but you should be able to adapt it to other languages without much difficulty.
If all you’re interested in is the end-result, take a look at the finished utility in Intoli’s article code repository. The rest of the article will describe what this utility actually does. The technique is quite similar to the one used to clear the Chrome browser cache, also published on our blog, so head on over there if you prefer using Chrome with Selenium.
Clearing the Cache
We’ll clear the cache by emulating how a human would accomplish the task: by visiting Firefox’s preferences page and going through the appropriate UI interactions.
The options we care about are in the Privacy & Security section of the Preferences page, which you can access by visiting about:preferences#privacy in Firefox.
The controls we need are under “Cached Web Content” and “Site Data.”
Our script needs to click the “Clear Now” and “Clear All Data” buttons that are highlighted in red in the screenshot above. The latter launches a standard alert dialog asking for confirmation of the data clearing action, which the script also has to accept.
The interface of the Preferences page is coded in Mozilla’s XML-based XUL interface-building language.
Right-clicking is not available on this page, so we cannot inspect the target buttons directly, but it’s straightforward to pull up Firefox’s developer tools and locate the buttons we need in the markup.
We see that the two buttons we need to press have the ids clearCacheButton and clearSiteDataButton.
The script itself is a standard Selenium affair.
In your project’s directory, create a file named clear_cache.py with the following contents:
from selenium.webdriver.common.alert import Alert
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def get_clear_cache_button(driver):
    # find_element(By..., ...) works in both older and newer Selenium releases
    return driver.find_element(By.CSS_SELECTOR, '#clearCacheButton')


def get_clear_site_data_button(driver):
    return driver.find_element(By.CSS_SELECTOR, '#clearSiteDataButton')


def clear_firefox_cache(driver, timeout=10):
    driver.get('about:preferences#privacy')
    wait = WebDriverWait(driver, timeout)

    # Click the "Clear Now" button under "Cached Web Content"
    wait.until(get_clear_cache_button)
    get_clear_cache_button(driver).click()

    # Click the "Clear All Data" button under "Site Data" and accept the alert
    wait.until(get_clear_site_data_button)
    get_clear_site_data_button(driver).click()
    wait.until(EC.alert_is_present())
    alert = Alert(driver)
    alert.accept()
The wait.until method repeatedly calls its argument function with driver until that function returns a truthy value or the timeout expires, so it’s convenient to get at the buttons using functions that accept driver as their only argument.
This makes it easy to guarantee that each button is available before we click it, which in this case happens almost immediately anyway.
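To make the polling behavior concrete, here is a simplified, Selenium-free sketch of what wait.until does. The poll_until function and the FakeDriver class are hypothetical stand-ins written for illustration, not Selenium’s actual implementation. The real WebDriverWait ignores NoSuchElementException by default while it polls; the sketch mimics that with LookupError.

```python
import time


def poll_until(driver, method, timeout=10, poll_frequency=0.5):
    """Call method(driver) repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = method(driver)
            if value:
                return value
        except LookupError:
            # Stand-in for NoSuchElementException: the element is not there yet.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %s seconds' % timeout)
        time.sleep(poll_frequency)


class FakeDriver:
    """Hypothetical driver whose button only appears on the third lookup."""

    def __init__(self):
        self.lookups = 0

    def find_button(self):
        self.lookups += 1
        if self.lookups < 3:
            raise LookupError('button not rendered yet')
        return 'clearCacheButton'


def get_button(driver):
    # Same shape as get_clear_cache_button: takes the driver, returns the element.
    return driver.find_button()


driver = FakeDriver()
button = poll_until(driver, get_button, timeout=2, poll_frequency=0.01)
print(button, driver.lookups)  # prints: clearCacheButton 3
```

Because the condition is just a function of the driver, the same helper can be passed to the wait and then called again to fetch the element for clicking.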
Test-Driving the Utility
Let’s verify that this works as expected by visiting a site which uses a liberal amount of caching, and then clearing the cache via the method above.
The script in this section visits overstock.com, goes to Firefox’s Privacy & Security page, where it lets you examine the settings for 10 seconds, then clears the cache and waits another 10 seconds before quitting.
Installing Selenium and geckodriver
To run this script, you need to have Selenium installed and an appropriate version of geckodriver running. You can install geckodriver by downloading the binaries from its releases page or, if you’re on Linux, by using your distribution’s package manager.
To install Selenium, you can just use pip.
I like to work in a virtualenv, which you can create and activate with
mkdir clear-firefox-cache
cd clear-firefox-cache
virtualenv env
. env/bin/activate
Install Selenium with pip install selenium and start geckodriver in a new terminal.
In the same directory as clear_cache.py, create a script named evaluate-clear-cache.py with the following contents.
from time import sleep

from selenium import webdriver

from clear_cache import clear_firefox_cache

# Start a firefox driver (make sure that geckodriver is running first)
driver = webdriver.Firefox()

# Visit a website that places data in local storage
driver.get('https://overstock.com')

# Stay at preferences page for 10 seconds to see the state
driver.get('about:preferences#privacy')
sleep(10)

# Clear the cache and hang around some more
clear_firefox_cache(driver)
sleep(10)

driver.quit()
As long as geckodriver is running and you have Selenium installed, you can run the script with python evaluate-clear-cache.py.
A Selenium-driven Firefox window will pop up, and you should be able to see the size of the cached resources going to 0 after they are cleared. Note that this clears all cookies, as you can see by clicking the “remove individual cookies” link in the History section.
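If you fold the utility into a test suite or scraper, a timed-out wait would otherwise leave the browser window open. The sketch below shows one way to guarantee cleanup. The run_in_fresh_browser helper is a hypothetical name introduced here for illustration; it relies only on the driver’s quit method.

```python
def run_in_fresh_browser(make_driver, task):
    """Start a driver, run task(driver), and always close the browser.

    make_driver is any zero-argument callable returning a driver, and
    task is any callable taking that driver. Both are assumptions of
    this sketch rather than part of Selenium.
    """
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # Runs even if task raises, e.g. when a button never appears.
        driver.quit()


# Possible usage with the utility from clear_cache.py:
#
#     from selenium import webdriver
#     from clear_cache import clear_firefox_cache
#
#     run_in_fresh_browser(webdriver.Firefox, clear_firefox_cache)
```

Passing the driver factory instead of a driver instance keeps the start and quit calls paired in one place.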
Turning Off the Cache Completely
As a side note, it’s possible to turn off caching completely from the get-go by providing Selenium with a customized Firefox profile.
You can get the list of relevant preferences by browsing about:config in the version of Firefox you intend to use.
Then, you can manually construct a profile with the desired customizations.
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium import webdriver

profile = FirefoxProfile()
profile.set_preference('browser.cache.disk.enable', False)
profile.set_preference('browser.cache.memory.enable', False)
profile.set_preference('browser.cache.offline.enable', False)
profile.set_preference('network.cookie.cookieBehavior', 2)

driver = webdriver.Firefox(firefox_profile=profile)
A value of 2 for network.cookie.cookieBehavior is equivalent to never accepting cookies.
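The same preferences can also be kept in one place and applied to whichever configuration object your Selenium version expects. The following is a minimal sketch, assuming a Selenium release where selenium.webdriver.firefox.options.Options provides set_preference and webdriver.Firefox accepts an options argument. The names NO_CACHE_PREFERENCES, apply_no_cache_preferences, and build_no_cache_driver are hypothetical and exist only for this example.

```python
# Preference names and values taken from the profile example above.
NO_CACHE_PREFERENCES = {
    'browser.cache.disk.enable': False,
    'browser.cache.memory.enable': False,
    'browser.cache.offline.enable': False,
    'network.cookie.cookieBehavior': 2,  # 2: never accept cookies
}


def apply_no_cache_preferences(target):
    """Copy the preferences onto any object exposing set_preference()."""
    for name, value in NO_CACHE_PREFERENCES.items():
        target.set_preference(name, value)
    return target


def build_no_cache_driver():
    """Start Firefox with caching and cookies disabled (assumes Options support)."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = apply_no_cache_preferences(Options())
    return webdriver.Firefox(options=options)
```

Since apply_no_cache_preferences only relies on set_preference, it should also accept a FirefoxProfile instance like the one constructed above.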
Hopefully this post helped you to clear the Firefox cache and cookies when using Selenium. We specialize in automation here at Intoli, and offer consulting services, so don’t hesitate to get in touch. We often write about interesting technical topics, so check out our blog and consider subscribing to our mailing list, where we share our latest and greatest.