Tuesday, December 17, 2013

Keyword check: ranking 100 URLs in a search engine

Do you ever need to check which pages rank on Google for a keyword or two, and find the results hard to read, and especially cumbersome to copy for further use? At Dell we use large-scale tools like seoClarity and get tons of high-quality data, but I still sometimes have these one-off requests where I need a small tool, NOW.

The answer is a small Bash script for Linux (which should also run, with slight modifications, under Cygwin on Windows).

I call the script with two parameters: first the search engine host, then the search term, as in
# . script.sh www.searchengine.com "searchterm"
Search terms with spaces work too; just replace each space with a '+'.
Those parameters are used to build the URL, which curl then uses to pull the results from Google. Xidel is a small command-line tool that makes it very easy to filter content with XPath.
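
If you prefer not to type the '+' yourself, a Bash parameter expansion can do the replacement. This is just a sketch; the variable names are illustrative and not part of the script below:

# turn a multi-word term into the '+'-joined form the URL expects
term="dell latitude reviews"
query="${term// /+}"   # replaces every space with '+', giving dell+latitude+reviews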

# $1 is the search engine host (e.g. www.searchengine.com), $2 is the search term;
# skipping the parameter check for brevity
url="http://${1}/search?q=${2}&sourceid=chrome&ie=UTF-8&pws=0&gl=us&num=100"

# fetch the result page with a browser user agent, extract the result URLs via XPath, clean up
curl -s -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -o temp.file "$url"
xidel temp.file -e "//cite" > urls-from-curlgoogle.csv
rm temp.file
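
For a quick end-to-end run, assuming the script is saved as rank-check.sh (the file name is mine, pick your own) and Xidel is on the PATH, the call plus a simple rank numbering could look like this:

# assumed file name; two-word term joined with '+'
. rank-check.sh www.google.com "dell+latitude"
# prefix each extracted URL with its position 1..100
nl -ba urls-from-curlgoogle.csv > ranked-urls.csv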

Thanks to Benito for Xidel, and to +Norbert Varzariu and +Alexander Skwar for helpful input fixing my variable assignment.

Again, this is not meant to replace any of the excellent tools out there, free or paid, but to handle small tasks 'quick and dirty', with reasonable accuracy and minimal handling. And please use it responsibly: do not spam any search engine and do not disregard any terms of use. I checked when I tried this, and the current ToS did not seem to prevent it, but I am no lawyer and might be mistaken. Use at your own risk.
