By Andre Perunicic | June 22, 2017
Using Selenium with Headless Firefox (on Windows)
Selenium uses the WebDriver protocol to remotely control browsers. Chrome, Firefox, Safari, and other major browsers already work with this API, and support for headless browsing has been improving ever since Google implemented it in Chrome back in April. As of today, Firefox expanded its headless mode from Linux to its nightly Windows builds. macOS support is lagging behind a bit, but should be coming pretty soon too.
This is obviously pretty cool from the automated testing and web scraping perspectives, so in this article I will describe how to connect Selenium WebDriver to Firefox’s new headless mode. I will primarily explain things for users of Windows, but you should be able to follow along on other operating systems with some minor modifications.
Let’s get started by installing all the requirements.
First, download and install Firefox Nightly from Mozilla’s website.
You will also need geckodriver, the layer used for connecting Selenium and Firefox, which has download links included on its GitHub releases page.
Once downloaded, extract the package and place it somewhere in your Path. For example, if you place it in C:\bin\, you can ensure it is in your user’s Path by running
[Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\bin\", "User")
from PowerShell.
While you’re at it, make sure that Python is also in your Path:
[Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\Python27\;C:\Python27\Scripts\", "User")
Alternatively, you can perform these steps through the GUI by searching for “Path” from the start menu and navigating through the “Edit environment variables for your account” settings panel.
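To double-check that the Path changes took effect, a short Python check along the following lines can help. This is only a sketch: find_on_path is a hypothetical helper (not part of any library), and the tool names in the loop are assumptions based on the steps above. Run it from a freshly opened prompt, since a prompt that was already open may not see the new value.

```python
import os


def find_on_path(name, path=None, extensions=None):
    """Return the first matching file for name on the search path, or None."""
    path = os.environ.get("PATH", "") if path is None else path
    if extensions is None:
        # On Windows, PATHEXT lists suffixes such as .EXE; elsewhere it is usually unset.
        extensions = os.environ.get("PATHEXT", "").split(os.pathsep)
    candidates = [name] + [name + ext for ext in extensions if ext]
    for directory in path.split(os.pathsep):
        for candidate in candidates:
            full = os.path.join(directory, candidate)
            if directory and os.path.isfile(full):
                return full
    return None


# Assumed tool names; adjust them to whatever you actually installed.
for tool in ("geckodriver", "python", "pip"):
    print("%s -> %s" % (tool, find_on_path(tool) or "NOT FOUND"))
```

Any tool reported as NOT FOUND points at a directory that still needs to be added to the Path.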
The required binaries should now be visible, so start a command prompt with cmd and install virtualenv:
pip install virtualenv
Then, create a new project and install selenium:
mkdir selenium-firefox
cd selenium-firefox
virtualenv env
env\Scripts\activate
pip install selenium
The final step before actually using selenium for driving headless Firefox is to start geckodriver in the background.
Open a new cmd window and run geckodriver there.
Connecting Selenium to Headless Firefox
Before we perform any tests, let’s make sure we can connect to headless Firefox in the first place.
Create a new script called
The first task is to instruct selenium to use the correct Firefox binary in headless mode.
Since we are using the nightly build, that can be done as follows:
import sys

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

# Point Selenium at the Nightly build and send Firefox's log output to the console.
binary = FirefoxBinary('C:\\Program Files\\Nightly\\firefox.exe', log_file=sys.stdout)
driver = webdriver.Firefox(firefox_binary=binary)
To make sure this works, let’s visit the Email Spy: A New Open Source Browser Extension for Lead Generation article page on this website and pull out text from its heading:
# Visit a website.
driver.get("https://intoli.com/blog/email-spy/")

# Grab the heading element.
heading_element = driver.find_element_by_xpath('//*[@id="heading-breadcrumbs"]')
This gives us the element, so let’s read its textContent property and strip the surrounding whitespace to get the text:
if heading_element:
    print(heading_element.get_property('textContent').strip())
else:
    print("Heading element not found!")
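Note that strip() only trims the two ends of the string. If the heading markup contains line breaks or runs of spaces in the middle, a small helper like the one below can collapse them. This is a sketch only: clean_text is a hypothetical name (it is not part of Selenium), and the sample string is made up for illustration.

```python
def clean_text(raw):
    """Collapse every run of whitespace into one space and trim both ends."""
    return " ".join(raw.split())


# Made-up sample resembling what textContent might return for a multi-line heading.
sample = "\n    Email Spy:\n      A new open source browser extension\n"
print(clean_text(sample))
```

In the script above, this would wrap the value returned by get_property('textContent') in place of the call to strip().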
Before running, close up the connection:
Finally, test out the connection by running the script. Typically we would run Firefox in headless mode through something like
but that still seems to have no effect.
Instead, we need to set the MOZ_HEADLESS environment variable before executing the Python script.
set MOZ_HEADLESS=1
python test-intoli.com
After a while we’ll see that the script ran successfully:
Email Spy: A new open source browser extension for lead generation
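The variable can also be set from inside the script, as long as that happens before the browser is launched, since a child process normally inherits the environment of the process that starts it. The sketch below assumes that this inheritance is how Firefox picks up the variable when Selenium launches it; enable_headless is a hypothetical helper name.

```python
import os


def enable_headless(env=None):
    """Set MOZ_HEADLESS=1 in the given environment mapping (os.environ by default)."""
    env = os.environ if env is None else env
    env["MOZ_HEADLESS"] = "1"
    return env


# Call this before constructing FirefoxBinary or webdriver.Firefox.
enable_headless()
```

Setting the variable this way keeps the script self-contained, at the cost of making it harder to switch back to a visible browser window for debugging.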
One curiosity is that omitting the log_file parameter in FirefoxBinary can get Firefox stuck!
If you have floating Firefox processes, you can kill them all easily with
taskkill /im firefox.exe /f
Driving a Unit Test with Selenium and Headless Firefox
Let’s round out the tutorial by creating a non-trivial unit test for this very site. In particular, we will test the mailing list subscription box that is shown to first-time readers of our blog.
We try to be respectful towards our readers by only displaying this advertisement once. That is, if you sign up for the mailing list or dismiss the invitation box, you shouldn’t see it again on any other post. To make sure that all of this works as expected, our test will execute the following steps:
- Scroll far enough down a blog post for the subscription box to show up and try to grab it from the page.
- Visit a different article and make sure that the ad is not displayed at all this time.
We’ll be making use of the standard library’s unittest module to actually implement these steps.
Start by creating a placeholder test-intoli.py:
import sys
import unittest

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary


class MailingListTest(unittest.TestCase):

    def setUp(self):
        # Start headless Firefox Nightly before each test.
        binary = FirefoxBinary('C:\\Program Files\\Nightly\\firefox.exe', log_file=sys.stdout)
        self.driver = webdriver.Firefox(firefox_binary=binary)

    def test_two_visits(self):
        self.fail("There is nothing here!")

    def tearDown(self):
        # Close the browser window after each test.
        self.driver.close()


if __name__ == '__main__':
    unittest.main()
You can run this test at any time with python test-intoli.py, thanks to the last two lines in the file.
The setUp and tearDown methods are self-explanatory, and deal with managing the connection to the browser.
The meat should be within the test_two_visits method, so let’s build it up.
All the code below should live within that method.
Start by giving self.driver a convenient local reference.
driver = self.driver
Then clear cookies to ensure a clean start and head over to the Email Spy post.
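A minimal sketch of this step follows. It is not necessarily the exact code from the complete script: open_fresh is a hypothetical helper, the URL is the Email Spy address used earlier, and the scroll fraction of 0.8 mirrors the second visit shown later in this article. It relies only on the WebDriver methods delete_all_cookies, get, and execute_script.

```python
EMAIL_SPY_URL = "https://intoli.com/blog/email-spy/"


def open_fresh(driver, url=EMAIL_SPY_URL, fraction=0.8):
    """Drop existing cookies, load the page, and scroll part of the way down."""
    driver.delete_all_cookies()
    driver.get(url)
    # Scroll far enough for the subscription box to be triggered (assumed fraction).
    driver.execute_script(
        "window.scrollTo(0, document.body.scrollHeight*%s);" % fraction
    )
```

Inside test_two_visits this would be called as open_fresh(driver).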
At this point we poll for the element for at most 10 seconds. This is done by instructing the driver to “implicitly wait.” If the element is not found within 10 seconds, the driver raises an exception and the test fails.
driver.implicitly_wait(10)
try:
    driver.find_element_by_id("PopupSignupForm_0")
except Exception:
    # The lookup raised, so the box never appeared within the wait.
    self.fail("Could not find element for 10s the first time. :(")
else:
    print("Found element the first time! :)")
Running the test at this point should show that we grabbed the element successfully.
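Conceptually, an implicit wait behaves like a retry loop with a deadline. The sketch below illustrates that idea with the standard library only. It is not how Selenium or geckodriver implement it, and wait_for, its interval parameter, and the fake lookup are all hypothetical.

```python
import time


def wait_for(lookup, timeout=10.0, interval=0.1):
    """Call lookup() repeatedly until it returns a truthy value or the timeout passes.

    Returns the truthy value, or None if the deadline passed first.
    """
    deadline = time.time() + timeout
    while True:
        found = lookup()
        if found:
            return found
        if time.time() >= deadline:
            return None
        time.sleep(interval)


# Fake lookup that only succeeds on the third attempt.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


print(wait_for(fake_lookup, timeout=2.0, interval=0.01))
```

The practical consequence for the test is the same as with the real implicit wait: a lookup that is expected to fail still costs the full timeout before it gives up.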
Next we visit a different post (you can actually visit the same one again) and only fail when there is no exception from find_element_by_id:
driver.get("https://intoli.com/blog/running-selenium-with-headless-chrome/")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight*0.8);")
try:
    driver.find_element_by_id("PopupSignupForm_0")
except Exception:
    # No element this time, which is exactly what we want.
    print("Did *not* find element the second time! :)")
else:
    self.fail("Found element the second time! :(")
Running the test with python test-intoli.py finally produces the desired result:
> python test-intoli.py
INFO:MailingListTest.test_two_visits:Found element the first time! :)
INFO:MailingListTest.test_two_visits:Did *not* find element the second time! :)
.
----------------------------------------------------------------------
Ran 1 test in 26.063s

OK
Check out the complete example script.
This short guide on getting started using Selenium and headless Firefox just scratches the surface of what’s possible. Stay tuned for other cool content from our blog, and if you need us to do advanced website testing or headless scraping, don’t hesitate to get in touch.