Pingdom Blog

Royal Pingdom

Ramblings from the Pingdom team about the Internet and web tech

Don’t let Google find your secrets

Google is one of the best hacking tools out there. It may sound incredible, but relatively simple searches in Google and other search engines can dig out sensitive or even dangerous information about your site, your servers and your company.

You want Google to index your site and make you visible and searchable. That part is good for you. But if you have been careless, Google can also index more sensitive information that was never meant to be public, and can therefore be a useful tool for hackers if they want to probe your site for vulnerabilities. This is often called Google hacking. (To be fair, other search engines can be used as well, so it could just as well be called search engine hacking.)

There is a lot of information that can be found on Google thanks to careless (or clueless) administration of websites:

  • Various usernames and passwords (both encrypted and in plain text)
  • Internal documents
  • Internal site statistics
  • Intranet access
  • Database access
  • Mail server access
  • And much, much more

Needless to say, a lot of this information can be used as a starting point for breaking into your systems.

Some examples

A huge number of search strings for finding sensitive information have been published around the web. The Google Hacking Database collects many of them (it currently has 1,423 entries), and it is where we found the following two examples.

The image below shows the result of two different searches, one for SQL insert statements that use encryption functions for passwords, and another one for INC files with PHP code in them that contain unencrypted user names, passwords and addresses to the corresponding databases.

Both these searches show a number of results where passwords and usernames have been indexed and cached by Google.

[Image: Results of Google hacking]

Not all of the results will be relevant, but this is just the tip of the iceberg.
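Search strings like these are typically built from search operators such as site: and filetype:. As a minimal sketch, assuming you only want to check a domain you control, the snippet below prefixes a few patterns with site: so the results stay on your own site. The patterns are illustrative and are not copied from the Google Hacking Database; the function names and example.com are placeholders.

```python
from urllib.parse import quote_plus

# Illustrative patterns only (assumption): not taken from the Google
# Hacking Database. Replace them with the searches you want to audit.
EXAMPLE_PATTERNS = [
    'filetype:sql "INSERT INTO" password',
    'filetype:inc "mysql_connect"',
    'intitle:"index of" backup',
]


def site_queries(domain, patterns=EXAMPLE_PATTERNS):
    """Prefix each pattern with site:<domain> to limit results to one site."""
    return [f"site:{domain} {pattern}" for pattern in patterns]


def search_url(query):
    """Build a search URL meant to be opened by hand in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # example.com is a placeholder for a domain you own.
    for query in site_queries("example.com"):
        print(query)
        print("   " + search_url(query))
```

The URLs are meant to be opened manually. Sending such queries from a script may be blocked by the search engine.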

How common is it?

We searched for “google hack” in Google Trends, and as you can see in the graph, that search term is becoming increasingly popular.

[Image: Google Trends for “google hack”]

This isn’t all that scientific, but it could give a hint that this kind of information gathering is increasing.

On a small side note, it is also interesting to see that a lot of the searches seem to be coming from Asia, and especially South-East Asia. The top countries for this search term are Indonesia, Vietnam, Malaysia and the Philippines.

Is Google doing anything to prevent this?

Google is not standing idly by. The company seems to proactively try to block some usage of these “hacker searches”, at least via direct search links (which can easily be used by automated applications). Some of the queries we tried gave us this result:

[Image: Google 403 Forbidden]

What can you do to prevent “Google hacks” on your site?

There is no way we can list every single security measure out there, but we hope you will find this to be a useful starting point.

  • First, if at all possible, keep all your sensitive information off the internet, i.e. on storage that isn’t even connected to the internet.
  • Be careful how you write your scripts and access your databases. There are numerous examples where a database access error text shows up on a website and contains way too much information. If you are unlucky and have Google crawl your site at that time, the information is public (and cached by Google).
  • Use robots.txt to let Google know which parts of your website it is OK to index. Note, however, that hackers can use this information too: if you specify which parts of the website are off limits, anyone targeting your site specifically will of course look there to see what is so sensitive. And don’t forget that someone who scans your website themselves, without the help of Google, won’t care about robots.txt or the other rules that well-behaved search robots like Googlebot follow.
  • Make sure that the directory rights on your web server are in order, i.e. only allow public access to the bare minimum of directories that your site needs to function. This precaution isn’t really specific to Google hacking, but is worth mentioning.
  • Monitor your site for common errors. You can set up monitoring (for example with Pingdom, or another website monitoring service) that checks for text that should not exist on the page, for example part of a PHP script error message. Then you will know right away if and when your site has “messed up”, and can take the necessary precautions (changing passwords or whatever else may be suitable).
  • “Google hack” your own website. Try out the various searches listed in the Google Hacking Database on your own site.
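The monitoring idea in the list above, checking a page for text that should never appear on it, can be sketched in a few lines. This is a rough illustration rather than how any particular monitoring service works. The marker list is an assumption and should be tuned to the software you actually run, and find_error_markers and fetch_text are hypothetical helper names.

```python
from urllib.request import urlopen

# Assumed marker list: fragments of common PHP and MySQL error messages.
# Adjust it to the error texts your own stack can produce.
ERROR_MARKERS = [
    "fatal error:",
    "parse error:",
    "you have an error in your sql syntax",
]


def find_error_markers(page_text, markers=ERROR_MARKERS):
    """Return the markers found in page_text, ignoring case."""
    lowered = page_text.lower()
    return [marker for marker in markers if marker.lower() in lowered]


def fetch_text(url, timeout=10):
    """Download a page and decode it leniently as UTF-8."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Offline sample so the sketch runs without network access.
    sample = "Fatal error: Uncaught Exception in /var/www/index.php on line 12"
    hits = find_error_markers(sample)
    if hits:
        print("Error text found on page:", hits)
```

To check a live page, pass the result of fetch_text for a URL on your own site to find_error_markers, and run the check on a schedule so that a leaked error message is noticed quickly.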

Find out more

(Google really does have more search tricks than you can throw a stick at.)

6 Comments

Interesting read. I don’t like Google Desktop blurring the line between public web and private computer desktop.

Funny you should post this article along with that link.

I happened to find this website a week ago and was checking out all those search techniques.

Great in-depth article.

Wow, third or so good post I read – just today – on this blog – *** very useful ***. Please keep up the good stuff.

I’ll share this with my friends …

Thanks a lot,
Vasudev
http://www.dancingbison.com

Nice article, good to see Google is taking an active approach to stopping this.
