
Sweden’s Internet broken by DNS mistake

Last night, routine maintenance of Sweden’s top-level domain .se went seriously wrong, introducing an error that made DNS lookups for all .se domain names start failing. The entire Swedish Internet effectively stopped working at this point. Swedish (.se) websites could not be reached, email to Swedish domain names stopped working, and for many users these problems still persist.

According to sources we have inside the Swedish web hosting industry, the .se zone, the central record for the .se top-level domain, broke at 21:45 local time and was not returned to normal until 22:43 local time.

However, since DNS lookups are cached externally by Internet service providers (ISPs) and web hosting companies, the problems remained even after that. It wasn’t until around 23:30 local time last night that the major Swedish ISPs had flushed their own DNS caches, meaning that they cleared away the broken results so that new DNS lookups could start working properly again. If they had not done this the problem would have remained for a full 24 hours.

There are still a large number of smaller ISPs that have not yet fixed the problem. It is also likely that ISPs outside of Sweden are not aware of the incident, so the effects of the problem may remain there as well.

We (Pingdom) are based in Sweden, so we have witnessed the massive effects of this incident firsthand and also the widespread frustration from end users. The incident is also receiving a significant amount of media attention.

What exactly happened?

The problem happened during planned maintenance of the .se domain. The .SE registry used an incorrectly configured script to update the .se zone, which introduced an error into every single .se domain name.

According to a number of industry insiders we have spoken to, the script did not add a terminating “.” to the DNS records in the .se zone when it updated the data. That trailing dot is necessary in the settings for DNS to understand that “.se” is the top-level domain. It is a seemingly small detail, but without it, the whole DNS lookup chain broke down.
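To illustrate why the dot matters: in the standard zone file format, a name that does not end in a dot is treated as relative, and the zone's origin is appended to it. The sketch below mimics that rule in Python. It is not the .SE registry's script; the function name `qualify` and the host name `ns1.example.se` are made up for illustration, and we are assuming the affected records pointed at name servers.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name into a fully qualified domain name.

    Mirrors the zone-file convention: a trailing dot marks the name as
    absolute; anything else is relative to the zone origin.
    """
    if name == "@":              # "@" is shorthand for the origin itself
        return origin
    if name.endswith("."):       # already fully qualified, leave untouched
        return name
    return f"{name}.{origin}"    # relative name: origin gets appended


ORIGIN = "se."  # origin of the .se zone

# Hypothetical name server record target, with and without the trailing dot.
print(qualify("ns1.example.se.", ORIGIN))  # ns1.example.se.
print(qualify("ns1.example.se", ORIGIN))   # ns1.example.se.se.
```

Without the dot, the target ends up under a non-existent "se.se." name, so a resolver following the record has nowhere to go.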

The problems were made worse by the fact that DNS lookups are cached externally. Since DNS lookups are cached for a certain time and the .se zone has a 24-hour time-to-live (the length of time information is cached by external DNS servers), the problem could last for up to 24 hours for some users.

Once the problem had been corrected, the solution was to “flush” the cache of external DNS servers, i.e. empty their cache, but this can only be done by the ones controlling the DNS servers, usually ISPs and web hosting companies. The end user has little control over this and is left at the mercy of his/her ISP.
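The interaction between the 24-hour time-to-live and a manual flush can be sketched with a toy cache. This is a simplified model, not how any particular DNS server is implemented; the class `ResolverCache`, the helper `at`, and the name `example.se.` are hypothetical, and we assume the worst case where the resolver cached the broken answer right when the zone broke at 21:45.

```python
DAY = 24 * 60 * 60  # the 24-hour TTL of the .se zone, in seconds


def at(hour: int, minute: int, day: int = 0) -> int:
    """Seconds since midnight on the day of the incident (day 0)."""
    return day * DAY + (hour * 60 + minute) * 60


class ResolverCache:
    """Toy model of a caching DNS server run by an ISP."""

    def __init__(self):
        self._entries = {}  # name -> (answer, expiry time)

    def store(self, name, answer, ttl, now):
        self._entries[name] = (answer, now + ttl)

    def lookup(self, name, now):
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            self._entries.pop(name, None)  # expired: forget it
            return None                    # caller must ask upstream again
        return entry[0]

    def flush(self):
        self._entries.clear()              # what the ISPs did manually


cache = ResolverCache()
cache.store("example.se.", "BROKEN", ttl=DAY, now=at(21, 45))

# The zone was fixed at 22:43, but the cache still serves the bad answer.
print(cache.lookup("example.se.", now=at(23, 0)))          # BROKEN
# Without a flush, it keeps doing so until 21:45 the next day.
print(cache.lookup("example.se.", now=at(21, 44, day=1)))  # BROKEN

cache.flush()
print(cache.lookup("example.se.", now=at(23, 30)))         # None
```

After the flush the cache has nothing stored, so the next lookup goes back to the (now corrected) .se zone instead of waiting for the entry to expire.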

The implications

Pingdom monitors the uptime of tens of thousands of websites for our customers, and we often see downtime due to DNS problems. These problems are very common all over the world, but usually it’s a single domain name that has been incorrectly configured or the DNS servers of a single web host having problems. An entire top-level domain breaking is exceptionally rare.

Problems that affect an entire top-level zone have very wide-ranging effects, as can be seen from the .se incident. There are just over 900,000 .se domain names, and every single one of these was affected.

Imagine the same thing happening to the .com domain, which has over 80 million domain names. Although not all of these are actually in use by websites or for email, the effects would still be huge and cause an unprecedented amount of downtime across the entire Internet.

Update: According to a statement issued by the .SE registry the problem started at 21:45 local time, not 21:19 as we previously noted from our source. Changed this accordingly.

18 Comments

Since the entire .se zone was broken, imagine what happened to other domains that had name servers like ns1.company.se… Those domains broke down as well. :(

What the article doesn’t say is *where* this happened – was this in Sweden, or in the US?

One who remembers

October 15th, 2009 at 7:28 am


Reminds me of the days when a missing “.” in a COBOL program or the placement of a symbol in JCL would be one position off; these would drive mainframers bonkers. It’s amusingly ironic that technology has advanced so far yet is still subject to the same problems as its dinosaur ancestors – human error.

Who cares?

I think the problem is that these Linux/Unix chaps use Notepad too much, versus apps that check values and could have some intelligence built in.

Tushar Kapila: They have failsafes in place, afaik, but apparently in this case something went wrong in the routines.

Only just found out about this. It’s interesting to me because, as far as I can tell, this happens to the dot-zm (Zambia) TLD on a regular basis.