How common are tags like Open Graph, schema.org and rel="publisher"?
These tags are interesting, perhaps important, features of a website, for SEO and beyond. What better way to find out how common they are than to take a larger number of sites and check whether they use them?
This is the output of a little script that tests a list of URLs for these three tags (schema.org, Open Graph, and rel="publisher" for Google+).
First the script generates a randomized filename and writes the header row into it. The while loop iterates over the list of URLs and pulls each page into a variable: the script needs to check for several items, and this avoids sending three requests per URL. I added the timeout parameters to wget because several domains I tested did not send ANY response when requested without the subdomain, and the script hung.
Next come the three filters for og:title, rel="publisher" and schema (itemtype). Each result is stored in a variable as "yes" or "no", and the three values are then written to the output file on one line together with the URL. Done.
#!/bin/bash
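# Output file; $RANDOM makes a name collision unlikely, not impossible.
filename="topresults-$RANDOM.txt"
# Header row, tab-separated, with one column per field of the data rows.
printf 'url\tog:title\trel_publisher\tschema\n' > "$filename"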
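# Read one URL per line from the file given as the first argument.
while read -r line; do
  # Fetch the page once (one try, 10-second timeouts) and reuse it for all checks.
  file=$(wget -qO- -t 1 -T 10 --dns-timeout=10 --connect-timeout=10 --read-timeout=10 "${line}")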
  # Open Graph: does the page mention og:title anywhere?
  if grep -q 'og:title' <<< "$file"; then
    title="yes"
  else
    title="no"
  fi
  # Google+ publisher link: rel="publisher" (double-quoted form only).
  if grep -q 'rel="publisher"' <<< "$file"; then
    publisher="yes"
  else
    publisher="no"
  fi
  # schema.org microdata: any itemtype="..." attribute.
  if grep -q 'itemtype="' <<< "$file"; then
    schema="yes"
  else
    schema="no"
  fi
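  # Append one tab-separated result row; printf leaves backslashes in the URL alone.
  printf '%s\t%s\t%s\t%s\n' "$line" "$title" "$publisher" "$schema" >> "$filename"
done < "$1"
# Line count of the results file (header row included).
wc -l "$filename"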
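To answer the question in the title, the results file still has to be tallied. The following is a minimal sketch of how that could be done, not part of the original script: `summarize_results` is a hypothetical helper name, and it assumes the tab-separated layout written above, with a header row followed by one row per URL holding the URL and three yes/no columns.

```shell
# Hypothetical helper (an assumption, not from the original script):
# count the "yes" entries per tag column in a results file.
summarize_results() {
    awk -F'\t' '
        NR == 1 { next }              # skip the header row
        {
            total++
            if ($2 == "yes") og++     # column 2: og:title
            if ($3 == "yes") pub++    # column 3: rel_publisher
            if ($4 == "yes") schema++ # column 4: schema
        }
        END {
            printf "sites\t%d\n", total
            printf "og:title\t%d\n", og
            printf "rel_publisher\t%d\n", pub
            printf "schema\t%d\n", schema
        }
    ' "$1"
}
```

It would be called with the name of a results file, for example `summarize_results topresults-12345.txt` (the number is a placeholder for whatever $RANDOM produced). Sites that timed out are counted as "no" in all three columns, since an empty response matches none of the filters.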