Intoli Blog

Check If A Website or URL Has Been Submitted to StumbleUpon

It can sometimes be a bit difficult to figure out whether a specific URL has been submitted to StumbleUpon because they don’t provide an easy way to search their indexed sites. If you’re trying to find out whether your website, or a specific web page, has been submitted to StumbleUpon, simply enter the URL into the widget below to fetch the latest information from StumbleUpon’s index.

Continue reading

Fantasy Football for Hackers

There’s a First Time for Everything
Like some 75 million other Americans, I am playing fantasy football this year. Unlike most of them, I know virtually nothing about football. I would estimate that I’ve watched somewhere around five games total in my life, most of them Super Bowls. I don’t know the rules beyond the very basics and I can’t name a single NFL player off the top of my head.

Continue reading

Installing Google Chrome On CentOS, Amazon Linux, or RHEL

The easiest way to install the latest Chrome version on RHEL, CentOS, and Amazon Linux versions 6.X and 7.X.

    # This installs Chrome on any RHEL/CentOS/Amazon Linux variant.
    curl https://intoli.com/install-google-chrome.sh | bash

A Universal Installation Script for Google Chrome on Amazon Linux and CentOS 6
CentOS, Amazon Linux AMI, and Red Hat Enterprise Linux are three closely related GNU/Linux distributions which are all popular choices for server installations. They offer excellent performance and stability, but package availability can often be lacking.

Continue reading

How Are Principal Component Analysis and Singular Value Decomposition Related?

Introduction
Principal Component Analysis, or PCA, is a well-known and widely used technique with applications such as dimensionality reduction, data compression, feature extraction, and visualization. The basic idea is to project a dataset from many correlated coordinates onto fewer uncorrelated coordinates called principal components while still retaining most of the variability present in the data. Singular Value Decomposition, or SVD, is a computational method often employed to calculate principal components for a dataset.
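The relationship described in this excerpt can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the post: the toy dataset, the variable names, and the choice of NumPy are all assumptions. It computes principal components two ways, from the eigen-decomposition of the covariance matrix and from the SVD of the centered data, so the two results can be compared.

```python
import numpy as np

# Hypothetical toy data: 200 samples of 3 strongly correlated features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.1 * rng.normal(size=(200, 3))

# PCA operates on centered data.
Xc = X - X.mean(axis=0)
n = len(Xc)

# Route 1: eigen-decomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
order = np.argsort(eigvals)[::-1]            # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vars = S**2 / (n - 1)                    # singular values -> variances

# Rows of Vt are the principal directions; project onto the first one.
scores = Xc @ Vt[0]
explained = svd_vars[0] / svd_vars.sum()

print("variances from covariance:", eigvals)
print("variances from SVD:       ", svd_vars)
print("share of variance in first component:", explained)
```

The two routes agree up to the sign of each direction, which is arbitrary in both decompositions.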

Continue reading

Scraping User-Submitted Reviews from the Steam Store

This article was originally published as a guest post on ScrapingHub’s blog. ScrapingHub is the company that wrote Scrapy, which this article is about, so read on to see why they liked it!
Introduction
The Steam game store is home to more than ten thousand games and just shy of four million user-submitted reviews. While all kinds of Steam data are available either through official APIs or other bulk-downloadable data dumps, I could not find a way to download the full review dataset.

Continue reading

Making Chrome Headless Undetectable

Detecting Headless Chrome
A short article titled Detecting Chrome Headless popped up on Hacker News over the weekend and it has since been making the rounds. Most of the discussion on Hacker News focused on the author’s somewhat dubious assertion that web scraping is a “malicious task” that belongs in the same category as advertising fraud and hacking websites. That’s always a fun debate to get into, but what I really took issue with was that the article implicitly promoted the idea of blocking users based on browser fingerprinting.

Continue reading

Markov's and Chebyshev's Inequalities Explained

Confidence Values
If you’ve ever learned any basic statistics or probability then you’ve probably encountered the 68-95-99.7 rule at some point. This rule is simply the statement that, for a normally distributed variable, roughly 68% of values will fall within one standard deviation of the mean, 95% of values within two standard deviations, and 99.7% within three standard deviations. These confidence values are quite useful to memorize because values that are computed from data are often approximately normally distributed due to the central limit theorem.
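The 68-95-99.7 figures can be checked directly from the normal distribution. The sketch below uses only the Python standard library, and the function names are made up for illustration. For comparison it also prints the distribution-free lower bound 1 - 1/k² given by Chebyshev's inequality, which the post title refers to.

```python
import math

def normal_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

def chebyshev_lower_bound(k):
    """Chebyshev: P(|X - mu| < k * sigma) >= 1 - 1/k^2 for any distribution
    with finite variance."""
    return 1 - 1 / k**2

for k in (1, 2, 3):
    print(f"k={k}: normal {normal_within(k):.4f}, "
          f"Chebyshev bound {chebyshev_lower_bound(k):.4f}")
# k=1: normal 0.6827, Chebyshev bound 0.0000
# k=2: normal 0.9545, Chebyshev bound 0.7500
# k=3: normal 0.9973, Chebyshev bound 0.8889
```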

Continue reading

Patching a Linux Kernel Module

A Bug on Linux? Why, I never!
I’ve been using GNU/Linux for about fifteen years and, I’ve got to admit, it used to be pretty rough around the edges (to put it lightly). A lot can change over fifteen years though; most of the things that were once major problem areas haven’t required a second thought in years. Laptop suspension, Wi-Fi, advanced function keys, sound, and pretty much everything else all typically “just work” these days, and this has been the case for quite a while.

Continue reading

Understanding Neural Network Weight Initialization

Choosing Weights: Small Changes, Big Differences
There are a number of important, and sometimes subtle, choices that need to be made when building and training a neural network. You have to decide which loss function to use, how many layers to have, what stride and kernel size to use for each convolution layer, which optimization algorithm is best suited for the network, etc. With so many things that need to be decided, the choice of initial weights may, at first glance, seem like just another relatively minor pre-training detail, but weight initialization can actually have a profound impact on both the convergence rate and final quality of a network.
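One way to see why the initial weight scale matters is to push random inputs through a stack of layers and watch what happens to the activations. The sketch below is an illustration under assumptions that do not come from the post: a 10-layer, 256-unit fully connected tanh network, a fixed standard deviation of 0.01 as the "naive" choice, and a 1/sqrt(fan_in) scale as the alternative.

```python
import numpy as np

def final_activation_std(weight_std, n_layers=10, width=256, seed=0):
    """Forward random inputs through tanh layers whose weights are drawn
    from N(0, weight_std(width)^2); return the std of the last activations."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(512, width))
    for _ in range(n_layers):
        W = rng.normal(scale=weight_std(width), size=(width, width))
        x = np.tanh(x @ W)
    return x.std()

# Hypothetical comparison: fixed tiny scale vs. scale tied to layer width.
tiny = final_activation_std(lambda fan_in: 0.01)
scaled = final_activation_std(lambda fan_in: 1.0 / np.sqrt(fan_in))

print("std of final activations, fixed 0.01 scale:  ", tiny)
print("std of final activations, 1/sqrt(fan_in):    ", scaled)
```

With the fixed small scale the signal shrinks at every layer and is effectively zero by the last one, while the width-dependent scale keeps the activations at a usable magnitude. Vanishing activations tend to mean vanishing gradients, which is one route by which initialization can affect convergence.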

Continue reading

Intoli Joins the NVIDIA Inception Program

Intoli is Joining the NVIDIA Family
We’re very pleased to announce today that Intoli will officially be joining the NVIDIA Inception Program for exceptional technology startups that are revolutionizing their industries with advances in artificial intelligence (AI) and data science. NVIDIA has been instrumental in the resurgence of neural networks in machine learning over the last several years. The rise of GPU-accelerated neural network training has allowed for major advances in the field of deep learning, and NVIDIA’s GPU lines, Deep Learning SDK, and investment in AI startups have all undoubtedly played an immense role in that.

Continue reading