Royal Pingdom

How to stop an outage from becoming an outrage

Sooner or later, every site or application will fail. However, the consequences depend not only on how the failure is managed but also on how it is communicated. Recently, the web hosting company Media Temple and even Google have illustrated how hard it is for modern, connected organizations to respond quickly enough to system outages. Here is a suggested crisis checklist, along with notes on why it is difficult to follow every time.

On Saturday, February 28, a storage cluster at Media Temple failed, depriving thousands of customers of their service until the following Monday morning. During the outage, the company did not mass e-mail its customers, and it did not seem to promptly update anything other than its system status account on Twitter. Only later did the company attempt to send private messages to the accounts of some irritated customers. This quickly led to outrage on blogs and in online communities.

In an incident covered here as well as elsewhere, Google faced a similar crisis four days earlier when its Gmail service stopped functioning globally for two to four hours. While millions of users and companies were unable to use their e-mail, the company communicated only very briefly on its official blog. Very quickly, the big media, the blogosphere and online communities were on fire with messages about “Gfail”.

These examples show that modern organizations need to excel at following the deceptively simple rules of crisis communication. Always try to reserve the capacity for the following:

I. Preparations:
  • Define your main stakeholders – customers, investors, partners, suppliers etc.
  • Keep an eye on big real-time forums where they may communicate.
  • Define what a serious error is and how to notice when one has happened.
II. Urgent actions:
  • Define what has happened as far as possible – be careful to separate facts from guesses.
  • Define what to do about it – recovery, calling in extra resources etc.
  • Define which stakeholders are affected.
  • Define how to communicate with these groups – favor continuous updates over speculation and optimistic promises, so that you address the information vacuum and user frustration.
  • Start communicating.
III. Follow-up:
  • Respond quickly to further questions from key stakeholders – always stick to the facts/message as agreed above and avoid speculation.
  • If an error has been committed, offer apologies and compensation (which both of the companies mentioned have since done).
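
As a rough illustration of the preparation step "define what a serious error is and how to notice when one has happened", the following Python sketch treats a run of consecutive failed checks as a serious error. The class name, the threshold of three and the sample check results are all assumptions made for illustration; they do not come from either incident described here.

```python
from dataclasses import dataclass


@dataclass
class OutageDetector:
    """Flags a serious error after a run of consecutive failed checks.

    The threshold is a hypothetical policy choice, not a value from the article.
    """

    failure_threshold: int = 3
    consecutive_failures: int = 0

    def record(self, check_ok: bool) -> bool:
        """Record one check result; return True while the outage counts as serious."""
        if check_ok:
            # A single successful check resets the run of failures.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold


# Illustrative check results: True means the site responded, False means it did not.
detector = OutageDetector(failure_threshold=3)
results = [True, False, False, True, False, False, False]
alerts = [detector.record(ok) for ok in results]
print(alerts)
```

Requiring several failures in a row is one way to avoid raising the alarm over a single blip; the right threshold depends on how often checks run and how much downtime your stakeholders tolerate.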

With hindsight, Media Temple reacted as quickly as possible, throwing all its resources at solving the issue, but forgot to communicate actively with its customers, generating anger and accusations that might have been avoided. Google, for its part, aggravated the error by initially giving its users erroneous information – the failure was hardly “limited to a small subset of users”.

Both companies were pilloried on Twitter, underlining the need for organizations to monitor real-time communities like this one, which can improve or aggravate the situation by instantly spreading information – if any is available. Media Temple later claimed that it lacked the staff resources to handle the thousands of micro conversations.

In this kind of situation, the best course for a company may be to define a single message, mass-communicate it, update it actively, and avoid speculation or individual replies. This is when it is beneficial to have one single source of information that all customers can be referred to for status updates, for example an externally hosted status blog.
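
To make the idea of a single source of information concrete, here is a minimal Python sketch of an append-only status log that renders the newest update first. The `StatusLog` class, the incident name, the timestamps and the messages are hypothetical examples, not a description of how any company mentioned here publishes status updates.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StatusLog:
    """Append-only list of status updates for one incident (hypothetical structure)."""

    incident: str
    updates: list = field(default_factory=list)

    def post(self, message: str, when: datetime) -> None:
        """Add one update; earlier updates are never edited or removed."""
        self.updates.append((when, message))

    def render(self) -> str:
        """Return the page text, newest update first."""
        lines = [f"Incident: {self.incident}"]
        newest_first = sorted(self.updates, key=lambda u: u[0], reverse=True)
        for when, message in newest_first:
            lines.append(f"{when:%Y-%m-%d %H:%M} UTC  {message}")
        return "\n".join(lines)


# Illustrative usage with made-up times and messages.
log = StatusLog("Storage cluster outage (example)")
log.post("Confirmed: one storage cluster is offline.",
         datetime(2009, 1, 1, 10, 0, tzinfo=timezone.utc))
log.post("Confirmed: recovery is in progress. Next update in 30 minutes.",
         datetime(2009, 1, 1, 10, 30, tzinfo=timezone.utc))
print(log.render())
```

Keeping every update in one dated list means support staff can point all customers to the same page instead of answering each conversation individually.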

So, are we saying that by following the above rules, communications mishaps could never happen? Of course not. Crisis management is never easy – otherwise it wouldn’t be a crisis.

Do you have any examples of superb crisis communications – or the opposite?

Please don’t hesitate to share them with us in the comments.


3 Comments

Easy to say – hard to do
…but the list is one starting point



Study: Males vs. females in social networks

Have you ever wondered how many of Twitter’s users are women? Or men? What about Facebook, MySpace, Digg, LinkedIn, and other sites in the social media sphere?

We have tracked down this information for a number of social network sites (19 of them). All the major ones have been included, like Facebook, MySpace and Twitter, as well as some of the most popular social news sites: Digg, Reddit and Slashdot.


10 of the most popular (and useful) WordPress plugins

WordPress has risen to be a powerhouse on the Internet that now dominates the blogosphere. It was started by the (now) 25-year-old Matt Mullenweg. Last week he was on This Week in Startups with Jason Calacanis. On the show Matt revealed that WordPress has such a strong presence on the Internet that at least one in three Americans online has visited a WordPress blog in the last month.

WordPress lets you use thousands of powerful plugins that complement and extend the platform in a variety of ways. I have scoured the WordPress Plugin Directory to find the very best plugins to share with you in this post.


How Google’s Chrome OS is pushing us to the clouds

Last week Google finally unveiled their much-talked-about Chrome OS, and subsequently worked the tech community into a frenzy. The operating system certainly lived up to Google’s initial promises of being browser-centric – it is basically just the Chrome web browser atop a custom Linux kernel.

Chrome OS is a momentous step towards making the fuzzy concepts of cloud computing more of a distinct reality. What follows are a few reasons why I think it matters, and how it will change the computing landscape by bringing us closer to the cloud than ever before.


Is the new wave of low-cost ultraportables a threat to netbooks?

It used to be that you’d pay a significant price premium for a slim ultraportable laptop – machines that were smaller and lighter than typical 5-6lb laptops. In the days before netbooks, they were really your only option for getting a thin and light laptop. But now that netbooks have carved out a segment of cheap and portable computers in the $200-$500 range, the ultraportables needed to adapt as well.


A look at the fastest supercomputer in the world

Twice a year, the world’s top 500 supercomputers are announced. The most recent winner is the Jaguar, which pretty much wiped the floor with the competition, managing a performance benchmark 69% above the IBM Roadrunner, which came in second.

Let’s take a closer look at the Jaguar, the fastest supercomputer in the world today.
