Tuesday, July 23, 2013

Google Plus links count as backlinks

There is an ongoing discussion about social media and SEO. Since I am currently promoting the synergies between the two at Dell, it is no wonder I have gathered some insights.

One of the common questions is:

Does social media have an influence on natural search rankings? Answer: yes.

And there are several ways I can 'prove' that. So the question many still ask, correlation or causation, can be answered: both!

First things first, let me show you one screen which proves the connection. Do you use Google Webmaster Tools? For SEO folks, that's a standard, and the tool includes backlink reports. As backlinks are considered to have one of the strongest influences on rankings, it is standard practice for SEOs to look at these.

This is how you get there:
In Google Webmaster Tools, open the report of links from others to your site and select 'Download latest links'.
Then search the results for plus.google.com as the referring URL:


Voilà! Clear proof that Google sees these links just like regular backlinks and tracks them in GWT.
As with many other links, it is not possible to see HOW MUCH influence a single link has - and not for lack of trying - but I would consider this proof enough that they count for search engine results page rankings.
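If you prefer the command line, you can run the same check on the exported file. A minimal sketch (I am assuming here that the export was saved as latest_links.csv; adjust the filename to whatever you downloaded):

# count how many of the exported backlinks come from Google Plus
grep -ci "plus\.google\.com" latest_links.csv
# or list a few of them
grep -i "plus\.google\.com" latest_links.csv | head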

Google has been showing these for roughly a year now, I would say. My hope would be that Google easily and quickly identifies Google Plus link spammers and discredits their links, but I doubt they are there yet.

As shown in my profile, I work for Dell, and we have a rather large site with a correspondingly large number of backlinks from Google Plus and many other sites.

Would you count this as proof that social influences search rankings?

Monday, July 8, 2013

Pull URLs from a site for a sitemap with wget

Working for a large company, we can use a lot of different tools to do our job. One thing I wanted to do was build a sitemap for a site where the content management system does not provide this feature.

So I started to check various tools, like Screaming Frog and Sitebuilder. Xenu was not reliable the last time I tried it, and these two tools did not work as hoped either, since the site is relatively large. And while Screaming Frog is great and fast, it slows down considerably after a few thousand URLs.

Using Linux at home, I quickly started my first trials with cURL and wget. cURL was ruled out quickly, so I focused on wget and tried a few things.

First, I just started with the root URL (example.com stands in for the actual site here) and then waited:

wget --spider --recursive --no-verbose --no-parent -t 3 -4 --save-headers --output-file=wgetlog.txt http://www.example.com/ &

--spider for only getting the URLs, --recursive together with --no-parent for the whole directory but nothing above it, -t 3 for three tries per URL, and --output-file for sending the URLs to the log file.
Slowly but surely the list kept building. I added -4 after some research, as forcing IPv4 requests was said to help speed things up.
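The log still has to be turned into a plain URL list afterwards. Something along these lines should do it (just a sketch; the exact log line format differs a bit between wget versions, so this simply pulls out anything that looks like a URL and de-duplicates it):

# extract all URLs from the wget log and de-duplicate them for the sitemap
grep -Eo "https?://[^ ]+" wgetlog.txt | sort -u > sitemap-urls.txt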

Still very slow, so I tried to run this with xargs:
xargs -n 1 -P 10 wget --spider --recursive --no-verbose --no-parent -t 2 -4 --save-headers --output-file=wgetlog.txt < url-list.txt &

I did not really see an improvement - just a plain 'feeling' of time, no measurement - but it was definitely still too slow to get through 10,000+ URLs in a day.
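One likely problem with running it this way: all ten parallel wget processes write to the same --output-file and truncate it when they start. A variant with --append-output would at least keep all the output (just a sketch, not what I actually ran):

# append instead of truncate; log lines from the ten parallel runs will be interleaved
xargs -n 1 -P 10 wget --spider --recursive --no-verbose --no-parent -t 2 -4 --append-output=wgetlog.txt < url-list.txt &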

After some research I came up with this solution, and it seems to work well enough:
I split the site into several sections and gathered the top ~10 URLs in a text file, which I used as input for a loop in a bash script (the # echo lines I use for testing the script; I am a pretty bloody beginner and this helps):
#!/bin/bash
i=0
while read -r line; do
  # echo "$line"
  i=$((i + 1))   # counter so each URL gets its own log file
  wget --spider --recursive --no-verbose --no-parent -t 3 -4 --save-headers --output-file="wgetlog$i.txt" "$line"
  # echo "wgetlog$i.txt"
done < urls.txt
In the wget line, $line is the URL read from the input file; wget happily takes variables. Works well. I get a bunch of wgetlog files with different names containing all the URLs, and it sure seemed faster than xargs, although I read that xargs is better at distributing load.
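To finish the sitemap, the per-section logs still need to be merged and wrapped in sitemap XML. A rough sketch of how that could look (URLs containing characters like & would still need to be XML-escaped):

#!/bin/bash
# merge all wget logs, keep the unique URLs, and wrap them in a minimal sitemap.xml
grep -hEo "https?://[^ ]+" wgetlog*.txt | sort -u > all-urls.txt
{
  echo '<?xml version="1.0" encoding="UTF-8"?>'
  echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  while read -r url; do
    echo "  <url><loc>$url</loc></url>"
  done < all-urls.txt
  echo '</urlset>'
} > sitemap.xml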
