How Facebook, Twitter and other big sites give back to Open Source

Big sites and services like Yahoo, Facebook, Twitter and many others rely heavily on open source software to run their operations. Happily, this isn’t a one-way street. They are also giving back to the open source community, not just by contributing to existing projects, but sometimes by open sourcing their own internal projects, giving back something completely new.

And what these popular sites can contribute is often quite valuable. Since they tend to be very large, they run big operations and have been forced to create solutions for scalability and performance problems that most other sites simply don’t have to deal with.

This article lists a few of those projects, all made free and open source by companies like Facebook, Yahoo, LinkedIn, Twitter, and other big players.

Please note that this is not in any way a complete list of what these companies are contributing to open source.

Cassandra

Came out of: Facebook

What is it? Cassandra is a “NoSQL” distributed database management system designed to be able to handle data spread out over a very large number of servers. It’s now an Apache project with contributors (and users) like Facebook, Twitter, Rackspace and Digg. It looks like this might be one of the Next Big Things for scaling websites and is becoming a bit of a poster child for the NoSQL fans.

Project homepage: http://incubator.apache.org/cassandra/

HipHop for PHP

Came out of: Facebook

What is it? Released as open source as late as last month, HipHop transforms PHP code to C++ and compiles it so it will run faster. Facebook developed it because they use PHP a lot, and as a scripting language PHP is not ideal when it comes to performance. Improving PHP performance quickly adds up to significant savings for bigger sites because fewer servers are needed to handle the same workload. For a site like Facebook, which uses tens of thousands of servers, the savings are huge. For example, it lets Facebook’s API handle twice as many requests and still use 30% less CPU compared to before. The average CPU load on Facebook’s web servers has been cut in half.

Project homepage: http://wiki.github.com/facebook/hiphop-php/

Memcached

Came out of: LiveJournal

What is it? Memcached is a distributed memory caching system, often used to speed up database-driven websites. It’s used by a TON of sites, for example YouTube, LiveJournal, Wikipedia, Amazon, Facebook, Digg, Twitter, Reddit, and many more. We here at Pingdom use it for our uptime monitoring service.

Project homepage: http://www.memcached.org/
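
To make the caching idea concrete, here is a minimal sketch of the "check the cache first, fall back to the database" pattern that a memory cache like Memcached is often used for. It does not talk to a real Memcached server: `InMemoryCache`, `slow_database_lookup` and `get_user` are hypothetical names invented for this illustration, and the cache is just a Python dictionary standing in for a cache client.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for a cache client; not a real Memcached API."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        # Treat expired entries as missing and drop them.
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._store[key] = (value, expires_at)


db_calls = 0  # counts how often the pretend database is queried


def slow_database_lookup(user_id):
    """Hypothetical expensive query, assumed here for illustration."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: the database is not touched
    user = slow_database_lookup(user_id)  # cache miss: query, then store
    cache.set(key, user, ttl_seconds=60)
    return user


cache = InMemoryCache()
first = get_user(cache, 42)   # miss, goes to the "database"
second = get_user(cache, 42)  # hit, served from memory
```

In this sketch the second call never reaches the database, which is where the speedup for database-driven sites comes from. A real deployment would spread the cache over several servers, which this single-process example does not attempt.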

Qizmt

Came out of: MySpace

What is it? Qizmt is a C# implementation of MapReduce running on Windows. Like all MapReduce implementations, it’s been designed to support distributed computing on big data sets across a large number of computers (clusters). It’s used internally by MySpace and has been made open source.

Project homepage: http://code.google.com/p/qizmt/
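
For readers new to MapReduce, the general idea can be sketched in a few lines. The example below is a single-process word count in Python, not Qizmt code (Qizmt itself is C#) and not any framework's API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are assumptions made for this illustration. A real implementation would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = []
    for doc in documents:  # each document could be mapped on a different machine
        mapped.extend(map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["open source scales", "Open source gives back"])
```

Because each document is mapped independently and each key is reduced independently, the work can be split across a cluster, which is what makes the model attractive for big data sets.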

Kestrel

Came out of: Twitter

What is it? Kestrel is the distributed message queue used by Twitter. It’s based on Twitter’s previous message queue system, Starling, which it is very similar to. Kestrel was actually initially called “Scarling” (Starling ported to Scala).

Project homepage: http://github.com/robey/kestrel
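
As a rough illustration of what a message queue does, the sketch below uses Python's standard `queue` and `threading` modules to hand messages from a producer to a consumer in order. This is an in-process stand-in only: it does not use Kestrel or its protocol, and the `worker` function and the sample messages are assumptions made for the example.

```python
import queue
import threading


def worker(jobs, results):
    """Consume messages in arrival order until a None sentinel shows up."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(job.upper())  # pretend processing step


jobs = queue.Queue()
results = []

consumer = threading.Thread(target=worker, args=(jobs, results))
consumer.start()

# The producer only enqueues work; it does not wait for each item to finish.
for message in ["tweet one", "tweet two", "tweet three"]:
    jobs.put(message)

jobs.put(None)   # tell the consumer to stop
consumer.join()  # wait until everything queued has been processed
```

The point of the queue is decoupling: the producer can accept work quickly while consumers process it at their own pace. A distributed queue applies the same idea across separate servers.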

Ruby on Rails

Came out of: 37signals

What is it? Ruby on Rails is a web application framework for the Ruby programming language, designed for rapid development. 37signals used it when developing their own apps (Basecamp, etc) but later released it publicly as open source. It’s no exaggeration to say that it’s been a resounding success, although unlike the other projects listed here this doesn’t have much to do with scalability, but rather ease of development.

Ruby on Rails is used for all of 37signals’ own web apps, such as Basecamp, Backpack and Campfire. Hulu, Scribd, GitHub and many others also use it. Another famous example is Twitter, which was originally a Ruby on Rails app (some of it still is).

Project homepage: http://rubyonrails.org/

Voldemort

Came out of: LinkedIn

What is it? Voldemort is a distributed key-value storage system (kind of a very simple database) that LinkedIn has developed internally to handle demanding high scalability storage needs for some of its functionality. It’s a relatively new project.

Project homepage: http://project-voldemort.com/
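
To show what "distributed key-value storage" means in practice, here is a deliberately simplified sketch in which each key is assigned to one of several nodes by hashing it. This is not Voldemort's API or its actual partitioning and replication scheme; `PartitionedStore`, the node names and the hash-modulo placement are all assumptions chosen to keep the example short.

```python
import hashlib


class PartitionedStore:
    """Toy key-value store that spreads keys over several in-memory 'nodes'."""

    def __init__(self, node_names):
        self._names = sorted(node_names)
        self.nodes = {name: {} for name in self._names}

    def _node_for(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self._names[int(digest, 16) % len(self._names)]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self._node_for(key)].get(key, default)


store = PartitionedStore(["node-a", "node-b", "node-c"])
store.put("member:1001", "profile data")
value = store.get("member:1001")
```

The interface is just `put` and `get` on a key, which is why such systems are sometimes described as very simple databases. Production systems also have to handle replication and nodes joining or failing, which this sketch leaves out.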

MediaWiki

Came out of: Wikipedia (Wikimedia)

What is it? MediaWiki is wiki software developed specifically with Wikipedia in mind, i.e. a very large wiki with a ton of content and users. In line with Wikipedia in general, it has been made free and open source, and the Wikimedia Foundation also uses it for other projects.

Project homepage: http://www.mediawiki.org/

Hadoop

Came out of: Yahoo (kind of, see below)

What is it? Hadoop is a Java implementation of MapReduce and is widely used for scalable, distributed computing. The Hadoop project was actually started outside of Yahoo, as part of a search engine project called Nutch, and programmed by Doug Cutting. Yahoo hired Doug and became the driving force for the continued development of Hadoop, which has nevertheless remained an open source project at Apache. Hadoop was named after Doug’s son’s stuffed elephant.

Hadoop is used extensively inside Yahoo and by many other companies as well, for example Facebook, Twitter and Meebo.

Project homepage: http://hadoop.apache.org/

Nginx

Came out of: Rambler (one of Russia’s biggest web portals)

What is it? Nginx is a lightweight, high-performance web server that can also be used as a load balancer and caching server. It was developed by Igor Sysoev for use with Rambler’s services and was designed to be able to handle a huge number of simultaneous connections effectively. Nginx has been gaining popularity rapidly and is used by millions of websites in one capacity or another, including WordPress.com and Hulu. We actually wrote about nginx last week if you want to learn more.

Project homepage: http://nginx.org/

Final words

This article looked at open source projects that stemmed from internal projects at big websites and services. It should be noted (once again) that many of these companies contribute to more projects than are mentioned here, often in quite significant ways. For example, here is a page that lists Twitter’s open source contributions (don’t miss the wonderfully named Murder, Twitter’s distributed BitTorrent code deployment software), and here’s another one for Facebook.

Photo credits: Partial of kestrel by mugley. Lord Voldemort picture courtesy of Warner Brothers from the movie Harry Potter and the Order of the Phoenix (from Wikipedia).
