Monday, November 20, 2017

URLs on all 'top million sites' lists

Alexa 1 million, Statvoo 1 million, OpenDNS 1 million, Majestic 1 million, Quantcast 1 million:
The "top million websites" lists all have slightly different formats, but each one ranks a large number of domains by traffic or a similar popularity measure; only the way the domains are selected varies.
I filtered them down to a list of unique URLs, then added http://www. at the beginning of each and checked whether the request returns a 200 OK.
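A rough sketch of that filtering in Python, not the script actually used for this post. The function names (normalize, on_all_lists, candidate_url, returns_200) and the normalization rules (lowercasing, stripping a leading "www.") are my assumptions, and so is the 10-second timeout. Each input list is assumed to be already parsed into plain domain strings.

```python
import http.client
import urllib.error
import urllib.request


def normalize(domain):
    """Lowercase a domain and strip a leading 'www.' (assumed normalization)."""
    domain = domain.strip().lower().rstrip(".")
    if domain.startswith("www."):
        domain = domain[4:]
    return domain


def on_all_lists(lists):
    """Return the set of domains that appear on every one of the given lists."""
    sets = [{normalize(d) for d in lst if d.strip()} for lst in lists]
    if not sets:
        return set()
    return set.intersection(*sets)


def candidate_url(domain):
    """Build the URL that gets checked: http://www. plus the domain."""
    return "http://www." + domain


def returns_200(url, timeout=10):
    """True if fetching the URL ends in a 200 response.

    urlopen follows redirects, so this reports the status of the final
    response. Timeouts, DNS failures and HTTP errors all count as False.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return False


def check_common_domains(lists):
    """Yield the candidate URLs for domains on all lists that return 200."""
    for domain in sorted(on_all_lists(lists)):
        url = candidate_url(domain)
        if returns_200(url):
            yield url
```

Calling check_common_domains with the five parsed lists would give the surviving URLs one at a time. For a few million requests this sequential version would be slow, so in practice the checks would need to run concurrently.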

Starting with over 4 million URLs, only 34,000 are on all lists (when checked as above):

Here's the list for download: no warranty, no promises, absolutely at your own risk. Re-running this might yield different results due to changes in the original lists, timeouts, etc.

I'll use this list for a while to run a bunch more queries.

