Wanted: Hard drive boys for our new ginormous data center

In November, Google wrote in their official blog that they had done an experiment where they had sorted 1 PB (1,000 TB) of data with MapReduce. The information about the sorting itself was impressive, but one thing that stuck in our minds was the following (emphasis added by us):

An interesting question came up while running experiments at such a scale: Where do you put 1PB of sorted data? We were writing it to 48,000 hard drives (we did not use the full capacity of these disks, though), and every time we ran our sort, at least one of our disks managed to break (this is not surprising at all given the duration of the test, the number of disks involved, and the expected lifetime of hard disks).

Each of these sorting runs lasted about six hours. At that rate, at least four hard drives would break every day for every 48,000 hard drives that a data center is using.

Interesting, isn’t it? We have discussed this several times around the office here at Pingdom. Data centers are getting huge. How many hard drives are there in one of these new, extremely large data centers? 100,000? 200,000? More?

Add to this the “cloud computing” trend. Since we store more and more data online, data centers will have to keep adding storage capacity all the time to be able to accommodate their customers.

As an example of how enormous some of these new data centers are, Microsoft has stated that it will have 300,000 servers in a new data center it is building in Chicago. We don’t know how many hard drives that will mean for storage, but we imagine it will be a lot.

So, let’s assume we have one huge data center with 200,000 hard drives. At the same rate, at least 16 hard drives would break every day. With 400,000 hard drives, one hard drive would break roughly every 45 minutes. (OK, perhaps we’re getting carried away here, but you get the idea.)
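For anyone who wants to play with the numbers, the estimates above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes failures scale linearly with the number of drives and that every drive is worked as hard as in Google's test, which is our simplification and not a real failure model. The function and constant names are made up for this illustration.

```python
# Back-of-the-envelope extrapolation of the figures in Google's post.
# Assumption (ours): failures scale linearly with drive count and the
# drives are under the same load around the clock.

BASELINE_DRIVES = 48_000   # drives used in the sorting experiment
RUN_HOURS = 6              # approximate length of one sorting run
FAILURES_PER_RUN = 1       # "at least one" disk broke per run


def failures_per_day(drives: int) -> float:
    """Lower-bound estimate of drive failures per day for a fleet."""
    runs_per_day = 24 / RUN_HOURS
    return FAILURES_PER_RUN * runs_per_day * drives / BASELINE_DRIVES


def minutes_between_failures(drives: int) -> float:
    """Average number of minutes between two failures."""
    return 24 * 60 / failures_per_day(drives)


if __name__ == "__main__":
    for drives in (48_000, 200_000, 400_000):
        print(
            f"{drives:>7,} drives: "
            f"{failures_per_day(drives):.1f} failures/day, "
            f"one every {minutes_between_failures(drives):.0f} minutes"
        )
```

Run as is, this gives about 16.7 failures a day for 200,000 drives and one failure roughly every 43 minutes for 400,000 drives, in line with the rounded figures above.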

Does this mean that these huge data centers will basically have a dedicated “hard drive fixer” running around replacing broken hard drives?

Is the “cloud computing era” ushering in a new data center profession? Hard drive boys? 🙂

Maybe this is already happening?

Questions for those in the know…

So, if this is already the situation, or will be in the near future, at least in “mega data centers”, what would be the best way to handle it? Would you organize your data center with this in mind, keeping all storage in close proximity to avoid having to walk all over the place? And what about the containerized data centers that Microsoft, for example, is building? Would you have to visit each separate container to deal with problems as they arise?
