Webpages Are Getting Larger Every Year, and Here’s Why It Matters

Webpage size matters because it correlates with how fast users get to your content. People today have grown to expect good performance from the web. In fact, Google published data showing that 53% of mobile site visits are abandoned if a page takes longer than three seconds to load. And the more data your webpage needs to download, the longer it will take—particularly on slow mobile connections.

Balancing a rich experience with page performance is a difficult tradeoff for many publishers. We gathered statistics from the top 1000 websites in the world to see how large their pages are. We’ll look at what’s driving the growth in page size and how you can track the size of your own company’s site.

Recent trends

According to the HTTP Archive, the average page size across more than a million top sites worldwide is currently around 1,400 KB, and it has increased steadily over the years. This is based on measuring transferSize, which is the weight of the HTML payload plus all of its linked resources (favicon, CSS files, images) once the page is fully loaded (i.e., at the window.onload event).

Fig. 1. Graph of mean Kilobyte totals (Nov 2010 to Aug 2018) by The HTTP Archive

With broadband speeds increasing every year, publishers have added richer content to their webpages. This includes larger media such as images and video. It also includes increasingly sophisticated JavaScript behavior using frameworks like React and Angular.

The complete access-speed equation should also consider the average internet speed in the countries where your servers and users are located. If they’re in China or Brazil (where, according to Fastmetrics, internet speeds in 2017 were over three times slower than in the USA), keeping all your webpage sizes under the global average is more of a necessity than just a good practice.
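To make the size-versus-speed relationship concrete, here is a minimal sketch of the arithmetic. It assumes an idealized transfer (no latency, no protocol overhead) and 1 KB = 1000 bytes. The two connection speeds are hypothetical values chosen only to illustrate a threefold gap; they are not Fastmetrics figures.

```python
def transfer_seconds(size_kb: float, speed_mbps: float) -> float:
    """Idealized download time: payload size divided by bandwidth."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    # Assumption: 1 KB = 1000 bytes, 1 Mbps = 1,000,000 bits per second.
    size_megabits = size_kb * 1000 * 8 / 1_000_000
    return size_megabits / speed_mbps


# Hypothetical speeds, for illustration only.
for label, mbps in [("faster link", 18.0), ("3x slower link", 6.0)]:
    seconds = transfer_seconds(1400, mbps)
    print(f"{label} ({mbps} Mbps): {seconds:.2f} s for a 1,400 KB page")
```

Real load times will be longer, since round trips, TLS handshakes, and rendering are not modeled here.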

Fig. 2. Average Internet Speeds [Mbps] by Country in 2017 by Fastmetrics. © 2018 Fastmetrics, Inc. All rights reserved.

Top 1000 websites

Alexa Top Sites is a web service hosted on Amazon Web Services (AWS) that allows users to fetch the most visited websites (per country or globally) from Alexa’s database. Here’s a public preview of the same data, from which you can manually compile the top 50 websites worldwide. We used their data from September 2018 to compile the top thousand sites for use in the remainder of this article. For example, here are the top 10:

  1. google.com
  2. youtube.com
  3. facebook.com
  4. baidu.com
  5. wikipedia.org
  6. yahoo.com
  7. qq.com
  8. taobao.com
  9. tmall.com
  10. twitter.com

Actual webpage size

With Google Chrome, it’s possible to manually check a webpage’s total size by starting an Incognito window (or clearing the cache), opening the DevTools Network tab, and loading the webpage. (Refer to figure 3 in the following section.)

To automate this task for 1000 sites, we wrote a Python scraper program (code on GitHub) that uses Selenium and Headless Chrome to calculate the actual total webpage sizes (including dynamic content loaded by JavaScript before the user starts interacting).

Headless Chromium is a feature of Google’s browser starting with version 59. To use it, the chrome executable is run from the command line with the --headless option. We operate it programmatically with a special WebDriver for Selenium (Python-flavored in our case). We also use the Chrome DevTools Protocol to access the Network.loadingFinished events using the RemoteWebDriver. For this, ChromeDriver runs standalone and listens on port 9515 by default, which we connect to using Selenium. Additionally, performance logging is enabled in our code.

All this has been provided in our sample code at github.com/jorgeorpinel/site-page-size-scraper.
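Independently of the repository code, the core idea of totaling Network.loadingFinished events can be sketched as follows. This is an illustration, not the scraper’s actual implementation. It assumes the log entries have the shape Selenium returns from driver.get_log("performance") when performance logging is enabled: each entry is a dict whose "message" field is a JSON string wrapping a DevTools event. The sample entries below are hypothetical.

```python
import json


def total_encoded_bytes(log_entries):
    """Sum encodedDataLength over all Network.loadingFinished events."""
    total = 0
    for entry in log_entries:
        try:
            message = json.loads(entry["message"])["message"]
        except (KeyError, TypeError, ValueError):
            continue  # skip entries that are not well-formed DevTools events
        if not isinstance(message, dict):
            continue
        if message.get("method") == "Network.loadingFinished":
            total += message.get("params", {}).get("encodedDataLength", 0)
    return total


def _event(method, params):
    # Helper that builds a hypothetical log entry in the assumed shape.
    return {"message": json.dumps({"message": {"method": method, "params": params}})}


sample_entries = [
    _event("Network.loadingFinished", {"requestId": "1", "encodedDataLength": 1200}),
    _event("Network.responseReceived", {"requestId": "2"}),
    _event("Network.loadingFinished", {"requestId": "2", "encodedDataLength": 300}),
]
print(total_encoded_bytes(sample_entries), "bytes")
```

Only loadingFinished events contribute to the total, so requests that never complete before the timeout are not counted.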

Context and limitations

Some of the 1000 websites may be skipped by our tool, given the following rules:

  • 10-second total page loading timeout;
  • 10-second script execution timeout;
  • Ignored when the response is empty;
  • Scraper tool ran from the USA. (Some websites are not available or present different content when loaded from different locations.)
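The timeout and empty-response rules above could be applied roughly as in the sketch below. The load_or_skip helper is hypothetical and is not taken from the scraper. It assumes a Selenium-style driver object exposing set_page_load_timeout, set_script_timeout, get, and page_source. Treating an empty page_source as an "empty response" is our own simplification.

```python
PAGE_LOAD_TIMEOUT_S = 10  # 10-second total page loading timeout
SCRIPT_TIMEOUT_S = 10     # 10-second script execution timeout


def load_or_skip(driver, url):
    """Return True if the page loaded with content, False if it should be skipped."""
    driver.set_page_load_timeout(PAGE_LOAD_TIMEOUT_S)
    driver.set_script_timeout(SCRIPT_TIMEOUT_S)
    try:
        driver.get(url)
    except Exception:
        # Assumption: any failure here (including a timeout) means "skip".
        # Real code would catch the driver's specific timeout exception.
        return False
    # Assumption: an empty page source stands in for an empty response.
    return bool(driver.page_source)
```

A caller would loop over the list of URLs and record sizes only for those where load_or_skip returns True.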

Note: Some top webpages from other countries (e.g., China) didn’t load in the USA or redirected to global landing pages. The correct way to measure them would be to load each from inside its own country, but that goes beyond the scope of this article.

Gathering the statistics

We ran the tool providing a list of websites as its only argument:

$ ./from_list.py 2018-09-17-alexa-topsites-1000.txt

Loaded list of 1000 URLs:
Loading http://google.com... loadingFinished: 395332B, 1.96s
Loading http://youtube.com... loadingFinished: 1874222B, 3.16s
Loading http://facebook.com... loadingFinished: 1387049B, 1.21s

The average webpage size is 2.07MB from 892 processed websites...
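As a rough sketch of how such output can be turned into an average, the snippet below parses lines in the format shown above and computes the mean size. The regular expression and the 1 MB = 1,000,000 bytes conversion are assumptions made for this example. The tool may convert units differently.

```python
import re

# Matches lines like: "Loading http://google.com... loadingFinished: 395332B, 1.96s"
LINE_RE = re.compile(r"^Loading (\S+?)\.\.\. loadingFinished: (\d+)B, ([\d.]+)s$")


def parse_results(lines):
    """Extract (url, size_in_bytes, seconds) from each matching output line."""
    results = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            url, size, seconds = match.groups()
            results.append((url, int(size), float(seconds)))
    return results


def average_mb(results):
    """Mean page size in MB, assuming 1 MB = 1,000,000 bytes."""
    if not results:
        raise ValueError("no processed websites")
    return sum(size for _, size, _ in results) / len(results) / 1_000_000


sample_output = [
    "Loaded list of 1000 URLs:",
    "Loading http://google.com... loadingFinished: 395332B, 1.96s",
    "Loading http://youtube.com... loadingFinished: 1874222B, 3.16s",
    "Loading http://facebook.com... loadingFinished: 1387049B, 1.21s",
]
results = parse_results(sample_output)
print(f"{average_mb(results):.2f} MB from {len(results)} processed websites")
# prints "1.22 MB from 3 processed websites"
```

Lines that don’t match the pattern, such as the header line or skipped sites, are simply ignored, which mirrors how the reported average covers only processed websites.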

 

To confirm these figures, we compared them against a manual load of google.com with the DevTools Network tab open:

Fig. 3. The page size of https://www.google.com/ is around 390 KB, as seen on the bottom status bar. © 2018 Google, Inc. All rights reserved.

Our results and statistics

We collected data on the size of Alexa’s top 1000 sites on September 15, 2018 and found an average of 2.07 MB per website (see results data sheet). Running the tool on only the top 50 websites gave us a lower average of around 1.2 MB. This makes sense, since the most popular websites tend to be lighter. It’s also worth noting that our 2.07 MB average is notably higher than HTTP Archive’s 1.4 MB (even though theirs covers over a million websites). This is consistent with our definition of a webpage’s “total size,” described previously, which includes dynamic content loaded by JavaScript.

Percentile stats (webpage size and loading time + distributions)

Fig. 4. Percentile groups for top 1000 web page total size in Megabytes

As shown in the graph above, most of the websites analyzed are below the average; in fact, they’re under 2 MB. This implies that a few of them are extremely heavy and pull the average up, such as hqq.tv with over 16 MB, or blizzard.com with more than 13.
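The effect of a few heavy pages on the average can be illustrated with Python’s statistics module. The page sizes below are hypothetical values invented for this example, not our measured data. They only mimic the shape of a distribution with a couple of large outliers.

```python
import statistics

# Hypothetical page sizes in MB: mostly small, with two heavy outliers.
sizes_mb = [0.4, 0.8, 1.1, 1.3, 1.5, 1.7, 1.9, 2.4, 13.6, 16.9]


def summarize(sizes):
    """Return the mean, median, decile cut points, and count below the mean."""
    mean = statistics.mean(sizes)
    return {
        "mean": mean,
        "median": statistics.median(sizes),
        "deciles": statistics.quantiles(sizes, n=10),  # 9 cut points
        "below_mean": sum(1 for s in sizes if s < mean),
    }


summary = summarize(sizes_mb)
print(f"mean:   {summary['mean']:.2f} MB")
print(f"median: {summary['median']:.2f} MB")
print(f"{summary['below_mean']} of {len(sizes_mb)} pages are below the mean")
```

In this made-up sample the median sits well below the mean, which is the same pattern a percentile chart reveals: the typical page is lighter than the average suggests.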

Top 5 smallest domains          Top 5 largest domains
t.co             0.002 MB       gfycat.com       29.25 MB
t66y.com         0.005 MB       hdd.tv           16.93 MB
exhentai.org     0.01 MB        [adult site]     16.52 MB
gstatic.com      0.01 MB        abs-cbn.com      14.33 MB
golnk1.com       0.02 MB        blizzard.com     13.63 MB

Fig. 5. Top 5 smallest and largest websites in our test, by their home/landing page’s total size.

 

The smallest domains aren’t presenting content directly. For example, t.co is just a URL shortening service that redirects users to the full URL. Also, gstatic is mainly used to deliver assets used by other webpages. If we removed these from the results, the average size would be even higher.

Why are some page sizes so large? Judging from the names of these top examples, they load heavy graphics or interactive content for all users as soon as they land on the home page.

Fig. 6. Size and number of requests classified by content type for Blizzard’s website via Pingdom.

Checking total size of your webpage with SolarWinds Pingdom

A quick and easy way to check your total page size is to use the SolarWinds® Pingdom® Website Speed Test. This free tool uses real web browsers on dedicated servers in different global locations to load and analyze website performance. It also offers significant insight into the composition of different aspects of a website’s performance.

Fig. 7. Basic performance analysis of google.com via Pingdom.

The Pingdom online tool also separates page size by content type (images, scripts, etc.) and by domain (to differentiate resources coming from the same website, CDNs, third parties, etc.). In fact, this is how we generated figure 6 in the previous section.

Pingdom is an easy-to-use website performance and availability monitoring service that helps keep your websites fast and reliable. Signing up is free and registered users can enjoy a myriad of tools such as page speed monitoring, real user tracking, root cause analysis, website uptime monitoring, nice looking pre-configured reports, and a full REST API.

To mention just one great feature among all of the Pingdom solution’s offerings, Real User Monitoring (RUM) is leveraged automatically to create greater insight on the regional performance of your website. Now you can get insight into how real users experience the performance of your site around the world.

Fig. 8. Experience Monitoring / Visitor Insights (RUM) map in Pingdom.

Conclusion

You already know your audience. Knowing your website’s page sizes will allow you to better control the performance and availability of your content and applications. Everyone loves a fast website!

Sign up for a free trial of SolarWinds Pingdom to monitor your users’ digital experience with features such as uptime monitoring, visitor insights, page speed monitoring, and immediate alerts.

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.
