Royal Pingdom

Gmail could be unavailable for more than 21 hours in a day, and Google could still tell you that according to their SLA, the service has had 100% uptime.

It sounds impossible, but it’s a direct consequence of how Google has written its SLA for Google Apps (which includes Gmail, Google Docs, Google Calendar and more). We will explain this in detail further down, but let’s first look at what the SLA actually says.

What the Google Apps SLA says

Here are the key parts, quoted from the Google Apps SLA, emphasis added by us:

“Downtime Period” means, for a domain, a period of ten consecutive minutes of Downtime. Intermittent Downtime for a period of less than ten minutes will not be counted towards any Downtime Periods.

[...]

“Monthly Uptime Percentage” means total number of minutes in a calendar month minus the number of minutes of Downtime suffered from all Downtime Periods in a calendar month, divided by the total number of minutes in a calendar month.

So, “downtime periods” are what’s ultimately used for counting the uptime percentage for Google Apps, and these downtime periods ignore all downtime that lasts less than 10 minutes.
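To make the two quoted definitions concrete, here is a minimal Python sketch of how such a calculation could work. It is not Google's actual implementation. The function names are our own, and we assume that an outage of ten or more consecutive minutes is counted in full while anything shorter is dropped.

```python
def counted_downtime(outages, threshold=10):
    """Sum only the outages (in minutes) lasting at least `threshold` minutes.

    Assumption: an outage that reaches the threshold counts in full.
    """
    return sum(minutes for minutes in outages if minutes >= threshold)


def monthly_uptime_percentage(outages, days_in_month=30, threshold=10):
    """(total minutes - counted downtime) / total minutes, as a percentage."""
    total_minutes = days_in_month * 24 * 60
    counted = counted_downtime(outages, threshold)
    return 100.0 * (total_minutes - counted) / total_minutes


# Hypothetical month with one 9-minute and one 30-minute outage:
# only the 30-minute outage lowers the reported percentage.
print(monthly_uptime_percentage([9, 30]))
```

With these assumptions, a month containing nothing but outages shorter than ten minutes would report exactly 100%.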

A worst-case scenario

Now back to our initial statement. How does the SLA make it possible to have more than 21 hours of downtime in a day and yet Google would call it 100% uptime?

Here is the problem: What if Google Apps were down for 9 minutes, up for 1 minute, down for 9 minutes, and so on? That would mean 54 minutes of downtime each hour, but Google still wouldn’t count it because none of the individual downtimes lasted 10 minutes or more.

Over a day (24 hours), that’s 21 hours and 36 minutes of downtime that Google would simply ignore when calculating the final uptime percentage.


Above: Red is downtime, green is uptime. Note that none of the downtime periods above last 10 minutes or more and thus are not counted according to the Google Apps SLA.
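The worst-case pattern can be checked with a small minute-by-minute simulation. This is only a sketch: the `downtime_runs` helper and the per-minute timeline are our own illustration, not anything taken from the SLA.

```python
def downtime_runs(timeline):
    """Return the length of each consecutive run of down minutes (True = down)."""
    runs, current = [], 0
    for down in timeline:
        if down:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


# One day, minute by minute: 9 minutes down, then 1 minute up, repeated.
timeline = [(minute % 10) < 9 for minute in range(24 * 60)]

runs = downtime_runs(timeline)
actual = sum(runs)                             # all downtime, in minutes
counted = sum(r for r in runs if r >= 10)      # only runs of 10+ minutes

print(divmod(actual, 60))  # (21, 36): 21 hours and 36 minutes
print(counted)             # 0
```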

It’s an extreme and very unlikely worst-case scenario, but we wanted to illustrate the consequence of how Google’s SLA sums up its downtime and calculates its uptime percentage.

A more likely scenario

Now let’s take a much more likely example of intermittent problems:

3m, 8m, 12m, 5m, 9m, 14m, 4m = 55 minutes of actual downtime

But Google would only count this as 26 minutes of downtime, including only the downtime periods lasting 12 and 14 minutes.


Above: In this scenario, only the downtime periods lasting 12 and 14 minutes (marked with yellow) would be counted according to the Google Apps SLA.
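The same split can be reproduced by summing the listed durations directly. A minimal sketch, again assuming that outages of ten minutes or more count in full (the variable names are ours):

```python
# Outage durations in minutes, as listed in the example above.
outages = [3, 8, 12, 5, 9, 14, 4]

actual = sum(outages)                               # every minute of downtime
counted = sum(m for m in outages if m >= 10)        # what the SLA would count
ignored = actual - counted                          # what the SLA would drop

print(actual, counted, ignored)  # 55 26 29
```

In this example, more than half of the real downtime never reaches the uptime calculation.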

Short outages are common in the real world

The problem with the Google Apps SLA is that short outages, less than 10 minutes in length, are actually very common in the real world.

As you may know, we here at Pingdom run an uptime monitoring service, and we know from our own experience (and a LOT of data from thousands of sites) that it’s much more common for sites to have multiple short intervals of downtime instead of a few long ones.

The 99.9% monthly uptime guarantee in the Google Apps SLA allows for 43 minutes of downtime in a 30-day month, but ignoring problems that last less than 10 minutes at a time will definitely make it much easier for Google to honor its SLA.
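The 43-minute figure follows from simple arithmetic: 0.1% of the minutes in a month. A short sketch (the function name is our own) shows how the allowance changes with the length of the month:

```python
def allowed_downtime_minutes(days_in_month, guarantee_percent=99.9):
    """Downtime (in minutes) that still satisfies the uptime guarantee."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - guarantee_percent) / 100.0


for days in (28, 30, 31):
    print(days, round(allowed_downtime_minutes(days), 1))
# 28 40.3
# 30 43.2
# 31 44.6
```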


11 Comments

Actually it is even more interesting, if they monitor by the second, which I’m sure they do.

9 minutes and 59 seconds would not count as downtime, so you could have 9:59 down and 1 second up, over and over, which would leave you with only about 2.4 minutes of uptime per day.



Major Google App Engine hiccup reveals weaknesses

Google’s App Engine suffered from increased data access latency and errors yesterday, including problems serving applications. According to TechCrunch, the problems lasted for approximately six hours.

From the App Engine status page:

On July 2nd, all applications experienced increased error rate and latency with read and write Datastore and memcache operations, as well as some serving errors. Datastore access and serving have been fully restored as of 12:25 PM PDT.

What happened yesterday exposed a couple of interesting weaknesses for App Engine.


Pingdom adds FREE website monitoring

We have exciting news to share. As you may have noticed, we made some changes to the Pingdom website yesterday, and the main thing we added was a new account type that many of you are going to love: Pingdom Free.

Now, for the first time ever, you can use Pingdom for free. We’re not talking about a free trial, but a completely free account that you can use for as long as you like, no strings attached.

In other words, you are getting a professional uptime monitoring service for free. With the Pingdom service, you’ll be the first to know when your site goes down.


A gallery of geeky galleries

If you’ve been following this blog for a while, you’ll know that we love everything geeky, and we have often put together themed galleries that appeal to tech geeks like ourselves.

Here is a collection of some of the geekiest galleries that have come and gone on this blog.


Wordpress.com set to grow past 10 million blogs in 2009

Wordpress.com, the popular blogging service from Automattic, has some interesting growth statistics posted on its website. Among other things, there is a graph showing how many new blogs are created on the service each day.

Based on the graphs that Automattic provides us with, it’s actually not that difficult to estimate how much Wordpress.com will grow in 2009. Which, of course, was a temptation we couldn’t resist!


The triumph of Linux as a supercomputer OS

Operating systems on supercomputers used to be custom-made affairs, but this has changed. These days, Linux has become a popular choice for supercomputers. But how popular? You may be surprised.

Top500.org maintains a list of the fastest supercomputers in the world. A new list was published yesterday (it happens twice a year), so we took the opportunity to go through the list and find out what OS the top 20 supercomputers are using.

It took some work, but the results are interesting.
