Tuesday, February 3, 2015

Page size by position in the site structure

Large sites sometimes have specific requirements, for example when optimizing for site speed.
One factor that influences page speed, usability, and how easily search engines can crawl a site is total page size. With millions of pages, how can we focus our work beyond just finding individual examples of what works and what does not?

We ran an internal crawler to collect the page size of each page. The second step was visualization, to see whether there is a pattern that lets us focus on high-impact areas.



For a small part of the site (!), this is the plot of page weight (made with R) against folder depth. Folder depth is the number of “/” in the URL minus 3, which adjusts for the three slashes in the http://www.dell.com/ prefix. As a result, http://www.dell.com/index.aspx is level 0, and http://www.dell.com/support/home is level 1.
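The depth rule is simple enough to sketch in a few lines. The plot itself was made with R; the snippet below is only an illustration in Python, and the function name folder_depth is made up for this example. It assumes every URL starts with a scheme and host (so the first three slashes are always the prefix) and that no extra slashes appear in a query string.

```python
def folder_depth(url: str) -> int:
    """Return the folder depth of a URL: the number of "/" minus the 3
    slashes that the "http://host/" prefix always contributes.

    Assumption: the URL has the form scheme://host/path with no slashes
    in a query string; a trailing slash would count as one extra level.
    """
    return url.count("/") - 3


# The two examples from the text.
print(folder_depth("http://www.dell.com/index.aspx"))    # 0
print(folder_depth("http://www.dell.com/support/home"))  # 1
```

Counting characters keeps the rule identical to the one described above. If the crawl data contains query strings or trailing slashes, it may be worth normalizing the URLs first.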

Based on the graph, we were able to clearly identify where the large pages sit and narrow them down to one type of page. This gave us the information we needed to work with Dev and Design to improve the site significantly.

We also computed a range of descriptive statistics and identified the 'outliers', which we could fix immediately. As many will have guessed, unoptimized pictures were the issue, and they were easily remedied.
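For readers who want to try something similar, here is a rough sketch of per-depth statistics and outlier flagging. It is not the analysis we ran: the function names are hypothetical, the input is assumed to be (depth, size in bytes) pairs from a crawl, and the 1.5 × IQR rule is just one common way to define an outlier, chosen here for illustration.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_by_depth(pages):
    """Group (depth, size_bytes) pairs by depth and return
    {depth: {"count", "median", "max"}} with depths in ascending order."""
    by_depth = defaultdict(list)
    for depth, size in pages:
        by_depth[depth].append(size)
    return {
        depth: {"count": len(sizes), "median": median(sizes), "max": max(sizes)}
        for depth, sizes in sorted(by_depth.items())
    }


def iqr_outliers(sizes, k=1.5):
    """Return the sizes above Q3 + k * IQR (assumed outlier rule).
    Needs at least two values, because statistics.quantiles does."""
    q1, _, q3 = quantiles(sizes, n=4)
    upper_fence = q3 + k * (q3 - q1)
    return [s for s in sizes if s > upper_fence]


# Made-up sizes for illustration only, not data from the crawl.
sample_sizes = [100, 110, 120, 130, 140, 150, 2000]
print(iqr_outliers(sample_sizes))  # [2000]
```

Running iqr_outliers on the sizes of one depth level at a time gives a short list of pages to inspect by hand, which is how a few oversized images would stand out.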
