Intoli Blog


Intoli Joins the NVIDIA Inception Program

Intoli is Joining the NVIDIA Family
We’re very pleased to announce today that Intoli will officially be joining the NVIDIA Inception Program for exceptional technology startups that are revolutionizing their industries with advances in artificial intelligence (AI) and data science. NVIDIA has been instrumental in the resurgence of neural networks in machine learning over the last several years. The rise of GPU-accelerated neural network training has allowed for major advances in the field of deep learning, and NVIDIA’s GPU lines, Deep Learning SDK, and investment in AI startups have all played an immense role in that.

Continue reading

Running Selenium with Headless Firefox

Update: This article is regularly updated to reflect improvements in Firefox’s headless browsing capabilities.
Note: Check out Running Selenium with Headless Chrome if you’d rather use Google’s browser.
Using Selenium with Headless Firefox (on Windows)
Ever since Chrome implemented headless browsing support back in April, the other major browsers have started following suit. In particular, Mozilla has since expanded support for Firefox’s headless mode from Linux to its Windows and macOS builds, and fixed a number of bugs that might have been in the way of early adopters.

Continue reading

Finding Pareto Optimal Blogs on Hacker News

Introduction
I’ve been doing a lot of technical writing recently and, with that experience, I’ve grown to more deeply appreciate the writing of others. It’s easy to take the effort behind an article for granted when you’ve grown accustomed to there being new high-quality content posted every day on Hacker News and Twitter. The truth is that a really good article can take days or more to put together and it isn’t easy to write even one article that really takes off, let alone a steady stream of them.

Continue reading

Why I still don't use Yarn

But Isn’t Yarn the Best Node Package Manager?
If you’re only comparing it to npm, then the answer is unequivocally yes. Yarn is generally much faster than npm and gives you deterministic builds by default, built-in integrity checking, license management tools, and a host of other goodies. Despite all of that, I still usually don’t use yarn. I avoid yarn for one simple reason: disk space usage. I feel like a bit of a curmudgeon here, but I find it a little absurd that it can easily take 100 MB, or more, to store a project consisting of a couple hundred lines of JavaScript if you want to use modern tooling (e.

Continue reading

The tech videos that have most impacted me as a developer

Introduction
Over the years, I’ve collected a handful of videos that I deeply enjoy and that have had a significant impact on me as a developer. These are videos that I love introducing people to and I’m happy to have the chance to share them with you here. I find them all inspirational in their own ways and they serve as a continuous reminder for me to keep an open mind and to take creative approaches to problems.

Continue reading

Predicting Hacker News article success with neural networks and TensorFlow

Hacker News Title Tool
Enter a potential title for a Hacker News submission below to see how likely it is to succeed or to be flagged dead. Once you play around a bit you can read on to learn how exactly these predictions are made.
Background
Submitting an article to Hacker News can be a little stressful if you’ve invested a lot of time in writing it. An article’s success really hinges upon getting the initial four or five votes that will push it on to the front page where it can reach a broader audience.

Continue reading

Email Spy: A new open source browser extension for lead generation

Introduction
Lead generation is a top priority for most successful companies and helping businesses find potential clients is a big part of what we do here at Intoli. Today, we’re pleased to announce a new open source marketing tool that makes it possible to find contact emails for any web domain with a single click. It’s called Email Spy and you can get the source on GitHub or install it directly as a Chrome extension or a Firefox addon.

Continue reading

Running Selenium with Headless Chrome

UPDATE: This article is updated regularly to reflect the latest information and versions. If you’re looking for instructions then skip ahead to see Setup Instructions.
NOTE: Be sure to check out Running Selenium with Headless Chrome in Ruby if you’re interested in using Selenium in Ruby instead of Python.
Background
It has long been rumored that Google uses a headless variant of Chrome for their web crawls. Over the last two years or so it had started looking more and more like this functionality would eventually make it into the public releases and, as of this week, that has finally happened.

Continue reading

How to Create a Public Slack Community with Open Invites

We recently created a public Slack community dedicated to web scraping in order to provide a general forum for people to discuss topics related to browser automation, headless browsers, scraping frameworks, data pipelining, or anything else along those lines. We wanted it to be open to anyone who wanted to join, but Slack unfortunately doesn’t provide any sort of open-access communities or channels. If you want to make your Slack community open to anybody, then your options are to either send invitations to anyone who expresses interest, or to generate shared invite URLs which expire after four weeks.

Continue reading

Why Python's for-else Clause Makes Perfect Sense, but You Still Shouldn't Use It

An interesting (and somewhat obscure) feature of Python is being able to attach an else block to a loop. The basic idea is that the code in the else block runs only if the loop completes without encountering a break statement. Here’s a trivial example in the form of a password guessing game:

    for i in range(3):
        password = input('Enter password: ')
        if password == 'secret':
            print('You guessed the password!
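
The excerpt's example is cut off, so here is a minimal, self-contained sketch of the same for-else behavior. It is not the article's code: the function name `check_guesses`, its parameters, and the returned messages are hypothetical, and the guesses are passed in as a list rather than read with `input()` so the function can be called without a terminal.

```python
def check_guesses(guesses, secret="secret"):
    """Return a message saying whether any guess matched the secret.

    Hypothetical helper for illustration; not taken from the article.
    """
    for attempt, guess in enumerate(guesses, start=1):
        if guess == secret:
            result = f"Guessed the password on attempt {attempt}."
            break  # leaving the loop via break skips the else block
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case where guesses is empty.
        result = "Out of guesses."
    return result
```

Calling `check_guesses(["hunter2", "secret"])` takes the `break` path, while `check_guesses(["a", "b", "c"])` exhausts the loop and falls through to the `else` block.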

Continue reading