Royal Pingdom

Ramblings from the Pingdom team about the Internet and web tech

Is the Web heading toward redirect hell?

Google is doing it. Facebook is doing it. Yahoo is doing it. Microsoft is doing it. And soon Twitter will be doing it.

We’re talking about the apparent need of every web service out there to add intermediate steps to sample what we click on before they send us on to our real destination. This has been going on for a long time and is slowly starting to build into something of a redirect hell on the Web.

And it has a price.

The overhead that’s already here

There’s already plenty of redirect overhead in places where you don’t really think about it. For example:

  • Every time you click on a search result in Google or Bing there’s an intermediate step via Google’s servers (or Bing’s) before you’re redirected to the real target site.
  • Every time you click on a Feedburner RSS headline you’re also redirected before arriving at the real target.
  • Every time you click on an outgoing link in Facebook, there’s an inbetween step via a Facebook server before you’re redirected to where you want to go.

And so on, and so on, and so on.

This is, of course, because Google, Facebook and other online companies like to keep track of clicks and how their users behave. Knowledge is a true resource for these companies. It can help them improve their service, it can help them monetize the service more efficiently, and in many cases the actual data itself is worth money. Ultimately this click tracking can also be good for end users, especially if it allows a service to improve its quality.

But…

Things are getting out of hand

If it were just one extra intermediary step, that might have been all right. But if you look around, you’ll start to discover more and more layering of these redirects, with different services each taking a bite of the click data on the way to the real target. You know, the one the user actually wants to get to.

It can quickly get out of hand. We’ve seen scenarios where outgoing links in, for example, Facebook will first redirect you via a Facebook server, then a URL shortener (for example bit.ly), which in turn redirects to a longer URL that in turn will result in several additional redirects before you FINALLY reach the target. It’s not uncommon to see three or more layers of redirects via different sites that, from the perspective of the user, are pure overhead.

The problem is that this overhead isn’t free. Every redirect adds time to reaching your target, and every redirect adds another link (literally!) in the chain that can either break or slow down. It can even make sites appear down when they aren’t, because something on the way broke down.

And it looks like this practice is only getting more and more prevalent on the Web.

A recent case study of the “redirect trend”: Twitter

Do you remember that wave of URL shorteners that came when Twitter started to get popular? That’s where our story begins.

Twitter first used the already established TinyURL.com as its default URL shortener. It was an ideal match for Twitter and its 140-character message limit.

Then came Bit.ly and a host of other URL shorteners who also wanted to ride on the coattails of Twitter’s growing success. Bit.ly soon succeeded in replacing TinyURL as the default URL shortener for Twitter. As a result of that, Bit.ly got its hands on a wealth of data: a big share of all outgoing links on Twitter and how popular those links were, since they could track every single click.

It was only a matter of time before Twitter wanted that data for itself. And why wouldn’t it? In doing so, it gains full control over the infrastructure it runs on and more information about what Twitter’s users like to click on, and so on. So, not long ago, Twitter created its own URL shortener, t.co. In Twitter’s case this makes perfect sense.

That is all well and good, but now comes the really interesting part that is the most relevant for this article: Twitter will by the end of the year start to funnel ALL links through its URL shortener, even links already shortened by other services like Bit.ly or Google’s Goo.gl. By funneling all clicks through its own servers first, Twitter will gain intimate knowledge of how its service is used, and about its users. It gets full control over the quality of its service. This is a good thing for Twitter.

But what happens when everyone wants a piece of the pie? Redirect after redirect after redirect before we arrive at our destination? Yes, that’s exactly what happens, and you’ll have to live with the overhead.

Here’s an example of what link sharing could look like once Twitter starts to funnel all clicks through its own service:

  1. Someone shares a goo.gl link on Twitter, which automatically gets turned into a t.co link.
  2. When someone clicks on the t.co link, they will first be directed to Twitter’s servers to resolve the t.co link to the goo.gl link and redirect it there.
  3. The goo.gl link will send the user to Google’s servers to resolve the goo.gl link and then finally redirect the user to the real, intended target.
  4. This target may then in turn redirect the user even further.
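The chain above can be sketched as a small simulation. This is only an illustration, not how Twitter or Google actually implement their shorteners: the URLs, the `REDIRECTS` table and the 80 ms per-hop cost are all made-up assumptions, and real per-hop latency will vary with the network and the servers involved.

```python
# Hypothetical redirect table: each key redirects to its value.
# Every URL here is invented for illustration.
REDIRECTS = {
    "http://t.co/abc": "http://goo.gl/xyz",                          # step 2
    "http://goo.gl/xyz": "http://example.com/article",               # step 3
    "http://example.com/article": "http://www.example.com/article/",  # step 4
}

# Assumed cost of one extra round trip, in milliseconds.
ASSUMED_HOP_MS = 80


def follow_chain(url, redirects):
    """Return every URL visited, from the clicked link to the final target.

    Assumes the table contains no loops.
    """
    chain = [url]
    while chain[-1] in redirects:
        chain.append(redirects[chain[-1]])
    return chain


chain = follow_chain("http://t.co/abc", REDIRECTS)
hops = len(chain) - 1
print(" -> ".join(chain))
print(f"{hops} redirects, roughly {hops * ASSUMED_HOP_MS} ms of added latency")
```

With these assumed numbers, the user pays for three extra round trips before the page they wanted even starts to load.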

It makes your head spin, doesn’t it?

More redirect layers to come?

About a year ago we wrote an article about the potential drawbacks of URL shorteners, and it applies perfectly to this more general scenario with multiple redirects between sites. The performance, security and privacy implications of those redirects are the same.

We strongly suspect that the path we currently see Twitter going down is a sign of things to come from many other web services out there who may not already be doing this. (I.e., sampling and logging the clicks before sending them on, not necessarily using URL shorteners.)

And even when the main services don’t do this, more in-between, third-party services like various URL shorteners show up all the time. Just the other day, anti-virus maker McAfee announced the beta of McAf.ee, a “safe” URL shortener. It may be great, who knows, but in light of what we’ve told you in this article it’s difficult not to think: yet another layer of redirects.

Is this really where the Web is headed? Do we want it to?

53 Comments

In the case of some sites, like webmail clients and Facebook, the redirect is a security mechanism. The referrer would otherwise contain some information that could be harmful (ID of a message, username, whatever). The redirect sanitizes this so that only one URL shows up in a server’s logs.

Totally agree. The redirect hell is driving me mad. I hate it when they add bars of noise (I’m looking at you, StumbleUpon)

If Twitter (and the rest) were clever, they would look at the URI being shortened and follow it to the end of its logical path… and then cut out all the middlemen. I’m sure there would be some outcry, but I don’t think the middlemen have a leg to stand on.

I’m sure Google probably does this already, as I’m sure they want to know how many distinct URIs point to the same end page for search engine ranking… to this end the problem may be resolved by the URI shortening organisations working better.

Now if we can just solve the problem of blogs saying nothing and referring to other blog entries, which say nothing and refer to other blog entries, which say nothing and refer to other blog entries, which don’t actually then have the answer to what you were searching for, well, that would be a better use of resources ;-)

We see the same problem with 250kb JavaScript includes. Developers take for granted that everyone has a +1 Megabit internet connection (just like them), and no-one will notice the extra bytes here and there. Cellphone users really understand this.

This is one of the reason that AJAX has grown so popular. People are (*hint*) tired of waiting for the whole page to reload. Well, if the pages were smaller and didn’t require lots of redirects and includes then you wouldn’t need to rely on AJAX to fix the problem you created.

Then again, we see waste and bloat everywhere since it’s only natural for anything that makes money (i.e. Data mining) to push everything else aside.

So we need a better solution to track outgoing clicks. I see a market opportunity here.

That actually makes a LOT of sense when you think about it. Wow.

Lou
http://www.anon-web.tk

Don’t forget the suckiness for the site that is being linked to – no link juice is passed (which breaks how Google works, and sites get ranked) and no accurate referrer information is passed (where is all this traffic coming from?)

You bring up a valid point, but I doubt the average user will care since they are unaware of the issue. In some cases, there are valid reasons people create shorter links. I started using them early on so people with print versions didn’t have to type long URLs. I also use them for affiliate links.

I suspect most users care about 2 things other than the content. (1) Have I waited long enough for this page to load? (2) And this is a maybe…Is someone selling my data?

I can see where each layer adds to the overall time, but has Pingdom done analysis to see how much time on average is added per redirect?

Yet, if your connection speed and browser speed and server speed were super fast, would anyone notice or care?

Surprisingly, I think this could be illegal. Remember all the lawsuits about HTML linking? Well, that seemed silly because that’s the foundation of the web. I can link to your content if you let me.

But what if I’m using your content as link-bait and doing all the redirects for tracking purposes without your knowledge?

I’m essentially repurposing your content (for profit), and I’m posting a link which claims to link directly to your site but does not.

Tracking can be done with cookies and with javascript, redirects are a huge waste of bandwidth and it will only get worse.

There are proxies and have been for a long time which will let you search google and strip out the redirect. You do notice the speed improvement, but its a good question as to why this is really even necessary.

Seems like 90′s technology.

What exactly do you pay these service providers for their products? In almost all cases, absolutely fu*qing nothing. So though you may have a valid technical point, on the big scale you either leave the service and build your own or just deal with it and STFU.

The real “overhead” for many, many Web sites now is the linking to fifty ad servers on every page – and THOSE servers are either down or slow, so they don’t finish responding to the browser request in less than ten minutes.

Which is why your browser “busy” indicator stays that way even though the page appears to have been fully loaded – or worse, the page never loads.

This makes a difference when you try to save a page on your hard drive – that last little bit won’t save and the browser will tell you the save “failed” – in reality you got most of it except for one lousy little ad.

All of this is just the effect of the Internet industry running on too little server horsepower and too little bandwidth – and WAY too little brains.

As Woody Allen summed up the human situation, “Nothing works and nobody cares.”

The user experience problem is compounded when you consider that more people are consuming (or trying to consume) content on a mobile device. Looking at mobile Web performance speeds for the past few years it’s surprising that d/l speeds haven’t reflected the improvements made in both mobile network and device capabilities.

When analyzing mobile Web performance waterfall graphs you’ll commonly see excessive redirects as a timeout culprit. Slow on the Web can mean completely unavailable on a smartphone. Who cares what data you collect if that user is unlikely to return to your site.

Before I read this, I never noticed and didn’t care. Now I know and don’t care.

Just for laughs, I tested the round trip latency to t.co and bit.ly, both regularly came in ~ 80ms. If you really think that web users are going to get worked up over a couple hundred ms then you are mad as a hatter. This isn’t mission-critical-high-performance-realtime work, why does it matter if someone’s tweets take a little longer to read?

Tempest, meet teapot. Teapot, tempest.

So we have jQuery, and we have AJAX. Why don’t they just attach an onClick to their links that sends a quick POST to Google before sending the user on their way, directly to the site in question? It won’t work for people without Javascript on, but that’s such a small percentage that I doubt it matters to them much. The important thing is that they could get their statistics, while still avoiding a redirect.

(By the way, the link itself would still work for people without Javascript, since the href would just run without the onClick. They’d just lose the tracking for those people without Javascript, which, if I know those sorts of users, they’d be happy for anyway.)

I noticed a number of these when I clicked on THIS article. I see it all in process in the Opera address bar and it is hugely annoying. When I visit Slate it just goes on and on and on and never stops flickering, to the extent that I may have to stop reading Slate. The Philadelphia Inquirer has the same level of activity, and it has been increasing of late at the New York Times as well. I would say it has gotten to ridiculous levels rather recently, in the past month or so, and I look forward to Opera and Firefox inventing some way to block all this extraneous activity behind a single click.

Doesn’t HTML 5 have a feature that makes this no longer necessary?

A post-back of sorts, wherein (for example) Google uses a direct link to the search result. The link contains a property that says “Let Google know you just clicked this link”.

“Redirect hell” is a hacky workaround to do exactly what this HTML 5 feature is intended for.

It has the added benefit of letting you turn off this behavior in the browser. But for precisely this reason, I can imagine the big boys who rely on this data (Google, Facebook, etc) continuing to use the old redirect methods.

There is an amusing anecdote in this and all technology. It has been said that the recent adoption of Blu-ray over HD DVD was due to which standard the adult film industry chose to adopt. Going back further, the same thing happened in previous technologies.

Once again, we seem to be following in the footsteps of that industry. They have been tracking user pathing through their content via affiliate programs since long before the large conglomerates caught onto the trend, via multiple redirection architectures, and have entire software packages for managing payment to their advertising affiliates and for tracking conversions: which link paths led to an actual sale where a user paid to sign up, and not just a random click.

Some implementations of this scheme were complex Apache httpd rewrite rules, and it evolved from there to include third-party affiliate mappings. These days such software has a fully annotated database generating intermediate websites, site indexing for partner updates and so on.

Give it time; it’s all going this way, and porn did it first.

Also keep in mind that CDNs like Akamai work through DNS and HTTP redirect mechanisms on top of whatever the actual target servers are using. The redirect stack in a given stream these days is absurdly long. Humour yourself with a wireshark packet trace on a site to see just how unwieldy it is.

Browsers should simply treat these like symlinks and possibly resolve them before you even click on them.

Big whoop. There are serious issues at hand, and yet this takes precedence?

Seems to be a minor thing to me.

I think the more serious problem posed by the proliferation of redirect services is the loss of persistence in hyperlinks. The day TinyURL or Bit.ly go out of business, how many hundreds of thousands of links will be irreparably broken? Then again, if the majority of these links appear in inane tweets, maybe it’s no big deal.

The question is how will we users be “compensated” for suffering the disadvantages of the time delays due to the shorteners, when they will have the advantages of valuable mined data …

Is the “convenience factor” enough compensation?

Ani

What about a redirect loop? A->B->C->D->A….
Potential DoS?
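A loop like that is easy to detect on the client side, and browsers typically give up after a fixed number of redirects rather than following them forever. Here is a minimal sketch of that kind of guard, assuming a hypothetical in-memory redirect table instead of real HTTP requests; the function name and the limit of 20 hops are illustrative choices, not taken from any particular browser.

```python
def find_redirect_loop(start, redirects, max_hops=20):
    """Follow redirects from start using a lookup table.

    Returns a (status, chain) pair where status is:
      "ok"       - a final, non-redirecting target was reached
      "loop"     - a URL showed up twice in the chain
      "too_many" - max_hops redirects were followed without reaching a target
    """
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        if len(chain) - 1 >= max_hops:
            return "too_many", chain
        nxt = redirects[chain[-1]]
        chain.append(nxt)
        if nxt in seen:
            return "loop", chain
        seen.add(nxt)
    return "ok", chain


# The A->B->C->D->A example from the comment above.
looping = {"A": "B", "B": "C", "C": "D", "D": "A"}
status, chain = find_redirect_loop("A", looping)
print(status, " -> ".join(chain))
```

The cost of the loop is therefore bounded for a client that applies such a cap; whether it could be abused against the servers in the chain is a separate question this sketch does not answer.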

Seems like it’s easy to foil Google’s scheme to watch which link you clicked: Instead of clicking a linked address, paste the actual address into the address bar and hit enter. Hey presto, watch me disappear! Which link did I click? Google will never know.

You aren’t even beginning to scratch the surface in regards to link overhead. Remember what is also happening to the public DNS infrastructure. Each of those links also requires the bandwidth overhead for the DNS queries. Not to mention all their sponsors’ ads and their own click recording services.

The really scary part comes when you realize how many different third parties actually have personal data on Joe Schmoe’s daily internet habits and interests, and how safe they keep that data.

To me, history repeating itself: starts out as a technical innovation, then the commercial opportunities are realized, then the primary driver becomes entertainment, with as much tracking as technically possible…

After a recent rebuild and o/s upgrade, I was slow to block ads. I was truly amazed at how many ads I saw after using blockers for a long time except on regular sites that I choose to support.

Even those sites that I allow to show me ads have seldom, if ever, obtained revenue from me clicking on an ad. I do all kinds of things to limit my exposure and clicking on any ad just seems to me like a crazy thing to do, given the phishing that goes on and the whole redirect issue.

During the few days I saw ads I was also shocked at how quickly I was targeted. The information about my interests that has been amassed and is now used to target me is a little unnerving.

So, back to blocking ads and javascript and I was very pleasantly surprised at how fast pages load.

I really don’t see the problem. If Twitter will replace all URL-shortener-links by links through their own shortener service, they will surely resolve the link to the final target before doing so. There, only one extra loop through only one URL-shortener-service, and not redirect-after-redirect-after-redirect.

None of the services mentioned are free. They’re all funded by advertising, which we pay for both in the time we spend looking at it and the increased retail prices of products.

Not this nonsense. A few milliseconds for a redirect, which allows people to know how well-viewed and well-targeted their content is, is a small sacrifice. Without it we’d be exposed to purely scatter-shot content, with no one having any idea how it’s being used. It’s been this way for 10 years – tracking redirects are not anything new.

Hello,

If redirects are driving you mad, stop using websites that make use of them.

Thanks,

Robson.

Google only does this when you’re logged into Google, as far as I can see, and then you’re opting in. I see no (big) problem with that.

They’re also doing it for suspected malware-infected sites, but then they’re not “re-directing”, instead rather “dead-ending” the link.

What is stopping a service like t.co from resolving all known redirectors like bit.ly themselves and directly redirecting the user to the non-redirecting result page?

Instead of the user going down the t.co->bit.ly->Google->Target chain, Twitter could do that in its own time and update the initial t.co->bit.ly link to t.co->Target. They would still get all the information they want (i.e. “how many people link to what” and “how many people clicked this link”) and their users get snappy performance. As a bonus, the full redirect chain only happens once. Everyone we care about wins.
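The idea in this comment can be sketched in a few lines. This is a hypothetical illustration only: nothing in the article says Twitter does this, the URLs are invented, and a real service would have to fetch each hop over HTTP and decide how often to re-check targets that may change.

```python
def resolve(url, redirects, max_hops=20):
    """Return the final target for url, or None on a loop or an overly long chain."""
    seen = {url}
    for _ in range(max_hops):
        if url not in redirects:
            return url
        url = redirects[url]
        if url in seen:
            return None
        seen.add(url)
    return url if url not in redirects else None


def flatten(redirects, owned_prefix):
    """Point every short link under owned_prefix straight at its final target."""
    flat = {}
    for short in redirects:
        if short.startswith(owned_prefix):
            target = resolve(short, redirects)
            if target is not None:
                flat[short] = target
    return flat


# Made-up table: one t.co link that passes through bit.ly, and one broken self-loop.
table = {
    "http://t.co/1": "http://bit.ly/a",
    "http://bit.ly/a": "http://example.com/page",
    "http://t.co/2": "http://t.co/2",
}
print(flatten(table, "http://t.co/"))
```

After flattening, a click on the first link costs one redirect instead of two. The trade-off is the one the commenter points out next: the intermediate shortener no longer sees the click.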

I can see how bit.ly c.s. wouldn’t be amused, but there’s really no stopping it, unless of course they fight back and block any requests from known t.co ip ranges, but there’s ways around that too…

Two points:

1) Go to a page from one of the sites of my old company (digitalcamera-hq.com) and click on a link to go to a merchant site. Have Firefox’s Live HTTP Headers plugin installed and running. For some merchants, there are more than 15 intermediate steps, all trackers so that someone can get paid — any one server in the chain not responding? No one gets paid. There simply has to be a better way.

2) Run Google PageSpeed on a site with lots of Gravatar avatars on a single page. Each one will do a hit to gravatar which either returns an avatar (rarely) or a 302 redirect to the image you specify as part of the query string in the URL in the first place. On one of my blog pages, there are 190 comments (some day I’ll update the theme so it can paginate them) — the page took 15 seconds to load all of the gravatars. Page Speed complained about too many DNS lookups (multiple gravatar servers), too many requests, and too many redirects. All valid. I unchecked “Show Gravatar” and bada bing — problem solved.

And Google says, “Slow pages are bad, and we’re adding speed to our ranking algo now”. They’re right, and are probably better than most (they made an async version of Google Analytics), but it’s an issue.

The Internet is slowly turning into a Big brother

Buhu, buhu, it’s awful but what about IP routing? It’s even worse! Packets being routed back and forth with no foreseeable overhead. It’s a horrible waste of resources, lets go back to circuit switching technology and get rid of that bulky middle tier layer called Internet. If that is too much to ask at least kill the bloated stuff like HTTP, HTML and XML and replace it with Telnet or some other lean technique!

If a URL shortening service shuts down some day, it will cause link rot or dead links for all the URLs shortened by that service.
For example, tr.im has already stopped accepting new requests and is shutting down its service by the end of 2010.

Google search results have direct URLs. Google does not track which search result you are clicking on.

You can easily verify this by hovering the mouse over the results, or by copying the link location and pasting it somewhere. Note how it is devoid of any Google address.

However, the sites you are navigating to from a Google search know what you were searching for, thanks to the Referer URL which is handed to them by your browser.

You can get around that by copying the link location and pasting it into your address bar. (Maybe there are browser-specific ways to disable the publication of referer).

Oops, I may be wrong there in my previous comment.

Looking at the page source of a Google results page, there is something suspicious. The URL anchors do have an “onclick” property.
For instance:

I.e. when we click on the search result, we are not going through an HTTP redirect, but it looks like there is Javascript code being notified of the click, and probably informing Google via a back channel.

This stuff looks like it could be easily removed by an HTML filter.

Oops, in my previous comment I wanted to show HTML, not to have it incorporated into the reply! I can’t believe that anonymous comments can have embedded HTML. Anyway the intended example was:


<a href="Somewhere" class=l onmousedown="return clk(this.href,'','','','4','','0CCEQFjAD')">Somewhere ...</a>

Note the onmousedown added by Google to the A tag.

Passing commenter

October 7th, 2010 at 4:23 pm


To be fair to Facebook, their redirector is there to anonymize outbound traffic and protect users’ profile privacy.

Heh, sneaky. Just use Firefox and the “Redirect Remover” add-on to rewrite the rendered page URLs and skip all of this nonsense in the first place.. At least, until they start encoding their URLs.

https://addons.mozilla.org/en-US/firefox/addon/537/

–W5i2

And yeah, the Facebook redirector is also there to protect FB users from spammy/fraudulent outbound links!

–W5i2

The only thing that worries me about the redirect/click tracking is that services like Google, Twitter, etc. now have complete and unadulterated access to your click habits and what you are browsing. In other words, you sacrifice privacy in return for Google’s profits (e.g., this will allow google to, as the author states, better ‘monetize’ traffic).

How can we stop this?
I have been using a VPN to halt any tracking/data monitoring ( http://www.privateinternetaccess.com/ ). You could also try using Tor ( http://www.torproject.org ). I noticed, though, that Tor has been really slow for me lately. It might be that a lot of people are using Tor for filesharing which is not what it was made for.

I hope this helps!
