Check If A Website or URL Has Been Submitted to StumbleUpon

By Evan Sangaline | September 14, 2017

It can be difficult to figure out whether a specific URL has been submitted to StumbleUpon yet because StumbleUpon doesn’t provide an easy way to search through its indexed sites. If you’re trying to figure out whether your website, or a specific web page, has been submitted to StumbleUpon, then simply enter the URL into the widget below to fetch the latest information from StumbleUpon’s index.

StumbleUpon URL Checker

Submitted Views Title
     

How it Works

You don’t have to worry about the implementation details if you’re only interested in a quick way to check the status of a URL in StumbleUpon’s index. The above tool should serve that purpose whether or not you know how it works under the hood. That said, I’ll briefly go over the details here for anyone who is interested.

If you’re familiar with StumbleUpon, then chances are that you’ve noticed their share buttons around the web.

StumbleUpon Share Button

These are often the official share badges provided by StumbleUpon, but StumbleUpon also provides an API for making your own custom badges. You can query the API by constructing URLs of the form http://www.stumbleupon.com/services/1.01/badge.getinfo?url=https://www.github.com, where the URL after url= is the web page that you would like to query. Their API will return a JSON response, similar to the following.

{
  "result": {
    "url": "https://github.com/",
    "in_index": true,
    "publicid": "2T2aNf",
    "views": 301,
    "title": "GitHub · Social Coding",
    "thumbnail": "http://cdn.stumble-upon.com/mthumb/837/17983837.jpg",
    "thumbnail_b": "http://cdn.stumble-upon.com/bthumb/837/17983837.jpg",
    "submit_link": "http://www.stumbleupon.com/badge/?url=https://github.com/",
    "badge_link": "http://www.stumbleupon.com/badge/?url=https://github.com/",
    "info_link": "http://www.stumbleupon.com/url/https%253A//github.com/"
  },
  "timestamp": 1505390337,
  "success": true
}

The result.in_index entry can be used to see whether or not the page has been submitted to StumbleUpon, result.views reveals the number of views, and result.title gives the page’s title on StumbleUpon. These are the three pieces of information that are used to populate the table in the StumbleUpon URL Checker tool at the top of this page.
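As a rough illustration of how those three fields might be pulled out of a parsed response, here is a minimal sketch. The summarizeBadgeInfo helper and its fallback values are my own assumptions for the sake of the example; they are not part of StumbleUpon's API or necessarily how the checker tool handles missing fields.

```javascript
// Hypothetical helper (an assumption, not part of the StumbleUpon API):
// pick out the three fields that are displayed in the checker's table.
function summarizeBadgeInfo(response) {
  // Fall back to an empty object if `result` is missing from the response.
  const result = (response && response.result) || {};
  return {
    // Treat anything other than an explicit `true` as "not submitted".
    submitted: result.in_index === true,
    // Assumed defaults for fields that are absent from the response.
    views: typeof result.views === 'number' ? result.views : 0,
    title: typeof result.title === 'string' ? result.title : '',
  };
}

// Usage with an abbreviated copy of the sample response shown above.
const sample = {
  result: {
    url: 'https://github.com/',
    in_index: true,
    views: 301,
    title: 'GitHub · Social Coding',
  },
  timestamp: 1505390337,
  success: true,
};
console.log(summarizeBadgeInfo(sample));
```

Keeping this step separate from the network request makes it easy to check against a saved response without touching the API.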

The StumbleUpon Badge API makes it fairly easy to query the desired information, but there’s an additional complication if you would like to make the API request from a browser context. StumbleUpon has chosen not to enable Cross-Origin Resource Sharing (CORS) on their API and this means that requests made from web pages will result in the following error.

XMLHttpRequest cannot load https://www.stumbleupon.com/services/1.01/badge.getinfo?url=https://intoli.com.
No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'https://intoli.com' is therefore not allowed access.

This is an interesting choice on StumbleUpon’s part considering that the API is specifically designed to be used for share buttons on web pages. My guess would be that the purpose is to force people to implement the badge rendering server-side, where it will likely be cached, to avoid hammering their servers with every page view. In any case, it does make it harder to build a browser-based tool like the StumbleUpon URL Checker.

To get around the above error, it is necessary to proxy the request through a server that will add the Access-Control-Allow-Origin header to each response. This is a fairly common situation, so there are luckily a handful of such proxies around that are easy to use. One good option, which is free and open source, is CORS-Anywhere. The CORS-Anywhere proxy can be used by simply prepending https://cors-anywhere.herokuapp.com/ to the URL of the desired resource.

Finally, we can put this all together into a few lines of JavaScript to query the StumbleUpon Badge API from a browser context, which would otherwise block the requests.

// the URL of the site that we would like to check
const siteUrl = 'https://intoli.com';
// combine it with the StumbleUpon Badge API URL prefix
const stumbleUponUrl = `https://www.stumbleupon.com/services/1.01/badge.getinfo?url=${siteUrl}`;
// prepend the CORS-Anywhere URL to add the CORS headers
const url = `https://cors-anywhere.herokuapp.com/${stumbleUponUrl}`;

// setup the HTTP request and its response handler
const xhr = new XMLHttpRequest();
xhr.addEventListener('load', function() {
  // parse the JSON response
  const response = JSON.parse(xhr.responseText);
  // do something with the response...
  // ...
});

// make the request
xhr.open('GET', url);
xhr.send();

There’s obviously a bit more that goes into handling errors, debouncing requests, and interacting with the UI in the StumbleUpon URL Checker, but that’s the basic gist of it!
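For anyone curious what the error handling and debouncing might look like, here is a minimal sketch. Both helpers (debounce and parseBadgeResponse) are hypothetical names of my own, and treating anything other than "success": true as a failure is an assumption based on the sample response above, not documented API behavior. This is not the actual code behind the tool.

```javascript
// Hypothetical helper: delay calls to `fn` until `waitMs` milliseconds have
// passed without a new call, so that typing a URL triggers a single request.
function debounce(fn, waitMs) {
  let timer = null;
  return function (...args) {
    // Cancel the pending call, if any, and schedule a fresh one.
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Hypothetical helper: parse the response body without throwing.
function parseBadgeResponse(responseText) {
  let data;
  try {
    data = JSON.parse(responseText);
  } catch (error) {
    return { ok: false, error: 'Response was not valid JSON' };
  }
  // Assumption: a usable response has `success: true` and a `result` object.
  if (!data || data.success !== true || !data.result) {
    return { ok: false, error: 'API did not report success' };
  }
  return { ok: true, result: data.result };
}
```

In the browser, the request code above could be wrapped in a function and passed through debounce before being attached to the input's event handler, with parseBadgeResponse(xhr.responseText) called inside the load listener. Network failures can be caught separately by listening for the error event on the XMLHttpRequest.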

Conclusion

I hope that the StumbleUpon URL Checker was useful for you if you were trying to figure out whether a specific web page had been submitted to StumbleUpon. Or perhaps the technical details on proxying the API requests through CORS-Anywhere might be helpful for somebody trying to write their own client-side share buttons. Either way, thanks for reading and be sure to check out some other interesting articles on the Intoli blog!

Also, please do get in touch if you’re ever looking to work with top-tier consultants in the data science space. We would be happy to set up a free consultation and hear about what cool things you’ve been working on.
