Online launch troubles and how to avoid them

It’s a common scenario: A new website launches after having built up a lot of hype around its service or product, only to almost immediately crash due to overwhelming traffic. These launch troubles are almost always scalability-related.

We see this happening a lot. It may sound like a luxury problem (wow, too many users!), but think about it: If you’ve created something special and spent lots of effort building up expectations and buzz around your product, you don’t want anything to stand in the way of people finally trying it out, do you?

Here are some real-world launch troubles from 2008:

  • Cuil – Before its launch, Cuil was hailed as a potential Google killer search engine. Once it launched, the site crashed due to excessive traffic. Ultimately, the product didn’t really deliver as promised, but their initial downtime certainly didn’t help matters.
  • Europeana – This is an EU initiative to make a huge repository of documents and artwork from major European museums available online for anyone to view. When it launched, demand proved overwhelming (apparently reaching 10 million hits per hour), so the site was taken down and the re-launch was delayed by over a month (it is now up in a limited-capacity beta).
  • MS Photosynth – This 3D-from-2D-photos web app turned out to be more popular than Microsoft expected. Microsoft stated that “traffic has far exceeded even our most optimistic expectations” and started adding more hardware so their servers would stop failing.
  • Apple MobileMe – MobileMe suffered from significant problems during its launch, and its reputation hasn’t fully recovered yet.
  • New versions of Firefox, Ubuntu, OpenOffice.org – Mozilla had site issues due to the massive demand for the new Firefox 3, Ubuntu has had site issues several times in connection with launching new versions of Ubuntu, and when the new OpenOffice.org suite launched recently the website was so swamped with traffic that it had to keep a simple replacement page up for days.

Before we delve into how to deal with these kinds of website launch issues (and hopefully avoid them), let’s categorize the different kinds of product launches there are, since website downtime will affect them differently.

Service vs. downloadable, new vs. existing products

We have mentioned two kinds of products above:

  • Online service/website: For these products it’s clear-cut. If the website doesn’t work, your product doesn’t work, so it’s essential that this doesn’t happen. Broken website = broken product.
  • Downloadable product: These products will in many cases be relying on a website to start the download from. You need to keep this website operational to keep downloads going, or risk turning people away (some of them may never come back).

There is, aside from the division between downloadable software products and online services, another fundamental difference to keep in mind, which affects how easy it is to estimate future traffic demands:

  • New versions of existing products: A website with an existing product will have an easier time estimating what kind of traffic levels it is likely to reach when a new version is launched. The product has already been live for some time, so past traffic levels are known and can be used to estimate future ones (and there have been other product launches to learn from).
  • Completely new products: A website with a completely new product will have a much harder time estimating the initial demand. You can have an idea from the buzz surrounding your product before it launches (in forums, on blogs, etc), but it will be a guesstimate at best. Incorrectly estimating the demand can very easily happen, especially if your product ends up being featured on some popular websites that you hadn’t counted on.
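For an existing product, the estimate described above can be roughed out in a few lines. This is only a sketch under stated assumptions: the `PastLaunch` record, the `estimate_launch_peak` function, the traffic figures and the 1.5 safety factor are all made up for illustration. It takes the worst launch-to-normal traffic ratio you have seen before and applies it to today's baseline.

```python
from dataclasses import dataclass


@dataclass
class PastLaunch:
    """Traffic seen around one earlier launch (hypothetical record)."""
    baseline_peak_rps: float  # typical peak requests/second before the launch
    launch_peak_rps: float    # highest requests/second seen during the launch


def estimate_launch_peak(current_baseline_rps, past_launches, safety_factor=1.5):
    """Estimate the launch-day peak in requests/second.

    Uses the worst launch/baseline ratio from earlier launches and adds
    a safety margin on top. The margin is a guess, not a rule.
    """
    if not past_launches:
        # A completely new product has no history to extrapolate from.
        raise ValueError("no launch history to extrapolate from")
    worst_ratio = max(
        launch.launch_peak_rps / launch.baseline_peak_rps
        for launch in past_launches
    )
    return current_baseline_rps * worst_ratio * safety_factor


if __name__ == "__main__":
    # Made-up numbers: two earlier launches multiplied traffic by 8x and 6x.
    history = [PastLaunch(50, 400), PastLaunch(80, 480)]
    print(estimate_launch_peak(120, history))  # 1440.0
```

For a completely new product there is no history to feed in, which is exactly why the guesstimate is so much harder.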

And now just one more thing before we head on to the potential solutions to the launch crash dilemma.

Playing the Devil’s advocate – Are launch crashes good for you?

Some of you may be thinking: sites that go down at launch often end up getting press. The more the product has been hyped beforehand, the more people write about how it crashed on launch day due to “overwhelming demand”. Perhaps this is a good thing and will give the product some extra attention?

While it’s likely that this has been used as a marketing trick on some occasions, we believe it’s short-sighted and should be avoided. Uptime and reliability are important issues for all kinds of SaaS products, and you don’t want to put that blemish on your reputation the first thing you do.

So don’t do it. Make a professional first impression.

Some ways around the launch scalability problem

Aside from scaling up your site permanently to always be able to handle the traffic spikes you may experience during the launch (which would be ideal, of course, but not everyone can afford this), here are some alternative solutions to handle launch traffic spikes:

  • Trickle launch, i.e. use invites over time to restrict usage and traffic (examples of sites that are using or have used this approach are Spotify and Gmail). This can also be combined with marketing, making it feel “exclusive”. This is a good approach for online services (SaaS) since they can control their growth this way.
  • Use mirrors. For downloadable software product launches, provide download mirrors to offload your main website.
  • Simplified homepage. You can expect that most people will just type in your regular URL first, so on launch day you may want to have a special, light version of the homepage ready (preferably a static one). This may not be possible for web services, but for sites offering a software download it can be helpful.
  • Temporary extra capacity. Temporarily lease extra capacity, for example via a CDN, or have some extra servers (and bandwidth) ready.
  • Identify your bottlenecks. While not a solution in itself, try doing as much load testing as you can beforehand to get an idea what your current system can handle. How much load is your database handling? What is the load on your web server(s)? Which are the main bottlenecks in your system? Identifying factors such as these will help you add capacity (or optimize the system) in an efficient manner.
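To make the trickle launch idea from the list above a bit more concrete, here is a minimal sketch of an invite waitlist. Everything in it is hypothetical (the `InviteWaitlist` class, its method names, the capacity numbers), and it is not how Spotify or Gmail actually handle invites. Sign-ups queue up, and invites go out in waves that never push the number of active users past what you believe the system can handle.

```python
from collections import deque


class InviteWaitlist:
    """Hypothetical trickle-launch waitlist: invite people in waves."""

    def __init__(self, max_active_users):
        self.max_active_users = max_active_users  # what the system can handle today
        self.active = set()     # users who already received an invite
        self.waiting = deque()  # sign-ups in first-come, first-served order

    def sign_up(self, email):
        """Queue a sign-up, ignoring duplicates."""
        if email not in self.active and email not in self.waiting:
            self.waiting.append(email)

    def raise_capacity(self, new_max):
        """Call this after adding servers or fixing a bottleneck."""
        self.max_active_users = new_max

    def send_invites(self, wave_size):
        """Invite up to wave_size people without exceeding capacity."""
        free_slots = max(0, self.max_active_users - len(self.active))
        invited = []
        for _ in range(min(wave_size, free_slots, len(self.waiting))):
            email = self.waiting.popleft()
            self.active.add(email)
            invited.append(email)
        return invited


if __name__ == "__main__":
    waitlist = InviteWaitlist(max_active_users=3)
    for name in ["a", "b", "c", "d", "e"]:
        waitlist.sign_up(name + "@example.com")
    print(waitlist.send_invites(2))  # first wave of two
    print(waitlist.send_invites(2))  # only one slot left
    waitlist.raise_capacity(5)
    print(waitlist.send_invites(5))  # the rest get in after scaling up
```

The point of the design is that growth is tied to a capacity number you control, ideally one you got from load testing, rather than to whatever traffic happens to show up on launch day.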

Ideally, we would all be using automagically scaling cloud hosting solutions, but we’re not quite there yet.

Do you have additional tips to share? What do you do to handle extreme traffic spikes to your site? Let us know in the comments!

Image from 2theadvocate.com.



6 Comments

There’s one thing to do: leverage virtualization services, and get an architecture that can auto-scale and add capacity. Ideally, you should work with a provider that will bill you per usage.

Really good advice.

I have some other thoughts (if the platform domain which the site uses can logically allow for this kind of behavior)… First, consider using sub domains to reduce and distribute the network stress and load.

By grouping functions which can logically break over sub domains, you can use 1 or 1000 computers in 1 or more networks to support your site, and you can scale up as you see fit simply by updating your DNS records.

I note, looking at my address bar right now, that Pingdom itself seems to use this very technique. A technique that, if implemented properly, can be a really useful tool in making sure a site stays (or at least parts of a site stay) online through proper load management.

So, how could you make that work in an example? Let’s say you had a domain like example.com to launch your next great site. Say that your site had a signup process which required some additional horsepower from a system to do whatever you needed to make the signup work for your site, and that similar resources or load might be needed in a login scenario. You could set up a Join.Example.Com for registration and a Login.Example.Com for logging into your site.

While I won’t bore you with details, it is conceptually pretty straightforward to do this on multiple machines. If you plan for this at the outset of your site launch, it may save you headaches as your site scales and you discover that you need to distribute function load processing over multiple machines or networks.
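As a rough illustration of the idea, the sketch below models each sub domain as its own pool of machines and hands out addresses in turn, much like round-robin DNS would. It is a toy model, not real DNS configuration: the `POOLS` table, the `RoundRobinResolver` class and the 192.0.2.x addresses (a range reserved for documentation) are all assumptions made for the example.

```python
from itertools import cycle

# Hypothetical layout: each function of the site gets its own sub domain
# and its own pool of machines.
POOLS = {
    "join.example.com": ["192.0.2.10", "192.0.2.11"],
    "login.example.com": ["192.0.2.20"],
    "www.example.com": ["192.0.2.30", "192.0.2.31", "192.0.2.32"],
}


class RoundRobinResolver:
    """Toy stand-in for round-robin DNS over per-function server pools."""

    def __init__(self, pools):
        # One endless iterator per sub domain; host names are case-insensitive.
        self._cycles = {host.lower(): cycle(addrs) for host, addrs in pools.items()}

    def resolve(self, hostname):
        """Return the next address in the pool for this sub domain."""
        try:
            return next(self._cycles[hostname.lower()])
        except KeyError:
            raise LookupError(f"no pool configured for {hostname!r}") from None


if __name__ == "__main__":
    resolver = RoundRobinResolver(POOLS)
    for _ in range(3):
        print(resolver.resolve("Join.Example.Com"))
```

Adding capacity for signups then means adding an address to the join pool, without touching the machines that handle logins.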

Next of course, you will also need to pay attention to the bandwidth if your site is heavily network I/O intensive (e.g., download site). I think more often than not, new sites (which are not static and who does those these days?) will face CPU, memory and disk I/O challenges (e.g., database) before they hit network bottlenecks.

One final (and I think a rather important) thing. Be sure to explain to your users that you are using sub domains, and why, so that the different addresses don’t confuse them. Be very sure to point out that a genuine sub domain of your site is never the same thing as someone else’s domain that merely starts with your domain name, which is the trick so-called “phishing” sites use when they try to impersonate your site. In other words, make it very clear to your users that:

this.example.com ≠ this.example.com.badguys.tld
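The same distinction is easy to get wrong in code, for example when a site checks whether a link points back to itself. Here is a small sketch of a safer check. The helper names `belongs_to_domain` and `url_is_on_domain` are made up for illustration. The key is to compare the end of the host name, including the dot, rather than just looking for your domain somewhere inside it.

```python
from urllib.parse import urlsplit


def belongs_to_domain(hostname, trusted_domain):
    """True if hostname is trusted_domain itself or one of its sub domains."""
    host = hostname.lower().rstrip(".")
    domain = trusted_domain.lower().rstrip(".")
    # Matching on "." + domain rules out look-alikes such as "badexample.com".
    return host == domain or host.endswith("." + domain)


def url_is_on_domain(url, trusted_domain):
    """Same check for a full URL; URLs without a host are rejected."""
    hostname = urlsplit(url).hostname
    return hostname is not None and belongs_to_domain(hostname, trusted_domain)


if __name__ == "__main__":
    print(belongs_to_domain("this.example.com", "example.com"))              # True
    print(belongs_to_domain("this.example.com.badguys.tld", "example.com"))  # False
    print(url_is_on_domain("http://this.example.com.badguys.tld/login", "example.com"))  # False
```

A naive test like `"example.com" in hostname` would accept the phishing address, since the trusted name appears inside it.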

Really finally, much good luck with your new site launch – when you bring something useful online, hopefully so many others will want to participate in your new project that you will really need this advice!

What are you guys’ thoughts on cloud computing? Seems like it could help avoid a LOT of this sort of stuff…