How the Apache HTTP web server stays secure (interview)
The Apache HTTP Server is the most common web server software in the world, by far. According to one recent survey, over 420 million web sites run Apache HTTP. With numbers like that, we were naturally curious to find out more about Apache.
We’re big fans of the Apache HTTP web server here at Pingdom, as we should be, because we run our main website Pingdom.com with it. So it should come as no surprise that we eagerly grabbed the opportunity to talk to William A. Rowe Jr., until just recently the Vice President, HTTP Server Project at the all-volunteer Apache Software Foundation. He has worked with the Apache Software Foundation for the last 12 years in various positions, including as the Director of the Foundation from 2007 to 2009.
A false assumption that open source is less secure
To start with, we put to Bill the common argument that open source projects like Apache HTTP Server are less secure because their source code is openly available. In effect, the argument goes, anyone who wants to exploit a system running open source software can look at the code and find out how to break into it. The flip side of that would be that closed source software is somehow inherently more secure because the code is not available for anyone to look at.
“The assumption there is false on the face of it,” Bill answered.
“The least concern that closed source manufacturing companies like Microsoft have is the public disclosure of some of their source code,” he continued. “Of far greater concern is the espionage of source code, or discovery of bugs by pen testing, where they are unaware that it’s being audited.”
Security auditing of software, he explained, is to a large extent automated today: “Anyone can run these audits, and since they are automated, they can be reproduced,” Bill said.
The Apache HTTP Server user community is invited to do exactly this: automatically scan the code and raise alerts to particular projects when something is found. This can be a clear vulnerability, or something that is not yet a clear vulnerability but should be looked at because it could become one.
Bill’s point is that it’s not a great problem for hackers to do this exact thing against binaries, whether the binaries are produced from closed or open source code. Simply put, he said, the bad guys continue to explore closed source products in the same way they explore open source products.
He added: “Complaints against reverse engineering and de-compiling and such don’t really mean anything in the security space, and in fact are counterproductive to security researchers. Security researchers are trying to mitigate problems. Without source code, there is not the level of transparency there to allow them to work out what mitigations can be applied to avoid the problem in the first place, or what is the actual impact of the flaw they’ve already observed.”
You could perhaps argue that having worked for so long in one of the more high-profile open source software projects means Bill is biased, but that would be dismissing his point too easily.
His view is that closed source software in fact hampers security researchers in understanding the scope of a vulnerability. How to find a vulnerability is not that different between open and closed source, Bill said. Often it’s simply a process of inventing arbitrary patterns and seeing if you cause unintended consequences.
How Apache works internally with security
We then switched direction a bit and focused on how the Apache Foundation, in general, and the HTTP Project, in particular, work internally with security issues.
At the heart of Apache’s work with security is the ASF Security Team, of which Bill is a member. In the beginning, he said, “the httpd was the only thing we expected security reports on.” That was something “that changed quickly,” Bill added.
The security team has over time grown to about 5 active members, with 10 on the committee ad-hoc, at any given time. “What we are,” Bill explained, “are essentially dispatchers.”
At least for an outsider, the process is seemingly simple: “The team identifies that we actually have something resembling a security report, triage everything that is not a security report, what is instead an inquiry, and pass it to the right destination,” said Bill.
“If we have a zero day coming in from the general public, or an actual reproduction case which is quietly passed to us through another agency, we then take that to that particular [Apache] project, and we become a resource for that project to help them understand how you interact with whomever is reporting this.”
Then, depending on whether the project can reproduce the reported vulnerability, the security team can help put together a response to the security researcher (usually the person or organization reporting the vulnerability). Bill explained: “We say to them, hey, in our next release, we will have a fix, and would you embargo the release until that particular point, and here’s the time frame.”
“We’re not trying to deliberately conceal what our code does, but that leaves us only the ability to commit a fix and simply say we’re fixing a bug, and not draw any particular attention to the fact that there was a security issue in the old code and this is what it looked like.”
According to Bill, there are people who follow the commits of major open source projects, like the Linux kernel, httpd, and others, just looking for exploits that might be closed in the near future. “If they can find exploits that we’re working on closing, they want to find that window of opportunity during which they can exploit that particular vulnerability,” explained Bill.
“So all we are is a resource to mitigate confusion and each individual project – all 100 of them or whatnot that we have at any given time – each of them individually are handling their own security issues, whether it is triaging or dispatching them.”
We should also add that the security team tries to help identify issues that look like they belong to a specific problem domain, like the recent hash vulnerability. The team then looks at what other Apache projects might be implicated, and to what degree it can share the security report in question with them, even if the immediate vulnerability is being worked on by the project that was most directly affected.
Individuals can choose which itch to scratch
Apache HTTP is by any standard a mature software project, just recently reaching version 2.4 after almost 17 years of existence. We asked Bill to look back over the time he’s been involved with the Apache Foundation and the HTTP project and say whether security is taking up more time now, or not.
His quick and firm reply was: “Certainly in the established projects, and for me personally, yes. The more mature the project is, the more you’re talking about more cosmetic changes, less frequent new feature releases, and transitioning more to a state of maintenance, and some optimization here, and security issues there.”
“And individuals throughout Apache each choose the itches they want to scratch, and by that I mean that all the [code] committers, all the contributors are encouraged to focus on those aspects of the project that are of personal interest to them. In some cases that is what they're being paid to work on, so in some cases that's also what is of interest to their employer or downstream customer."
That means, said Bill, that the people working in the security space on the HTTP project, or any other project, tend to be the people that gravitate toward having a strong interest in security, exploits and maintenance. "They want to simply get those fixes in, and communicate them to the general public," he explained.
“By looking at surveys we get a sense of scale”
As our conversation was wrapping up, the discussion with Bill shifted direction again to Apache’s dominance of the web server software market. Apache HTTP serves just over 65% of all web sites, according to the latest NetCraft web server survey.
We asked if these sorts of surveys and statistics are something that Bill looks at.
"As a foundation, no," he said. "But we do have specific PR folks, that are interested in publicity and marketing. Not from the point of view of a commercial organization, but of course we are interested in making sure Apache and the Apache foundation has a good name, maintains a good name, and we do that by developing good code. But that's only interesting to people who understand we're developing good code," Bill chuckled.
Out of that large installed base, version 2 of Apache HTTP accounts for 92.2%. More specifically, a survey from the end of February showed that the most common version of Apache HTTP was 2.2 with 89.2%.
These are numbers that Bill is more interested in than the percentage market share. He said he looks mostly at upgrade cycles and lags in upgrades: "I'm looking at the survey for February and I can see that 2.2.3 is still widely adopted and this code is five years old by now," he said.
"What we're looking at there is Red Hat or other core operating system distributions, which puts out a major release, and folks install it and don't really want to change it. And from a security point of view, those 2.2.3s aren't particularly vulnerable, because they've had a series of incremental patches applied to them," Bill explained.
What Bill will look at in the coming month or two is the upgrade and downgrade pattern. He will study how version 2.4 is adopted and then what percentage of those who upgrade will revert to an earlier version after a period of time.
"I've done this for many years now and it helps me understand people's expectations. So, what did they expect out of the new release that is not working as they thought it would,” he explained.
So, even though Bill doesn’t necessarily look at surveys to see whether Apache is still number one, he looks at them to get a sense of where the users are in terms of versions adopted. He said: "We get a lot of that feedback directly from users, but those are only isolated cases. By looking at surveys we get a sense of scale.”
“We've watched others come and go”
Finally, the discussion with Bill turned to directly addressing Apache HTTP’s competition.
Even though Apache has tightened its grip on the number one web server software spot, that doesn’t mean that the competitors are sitting still. For example, NGINX is very close to overtaking Microsoft IIS as the second most used web server software.
"I'm not really concerned with the percentages involved there, and whether it's up a bit from one month to the next,” Bill explained.
He added: “We've watched others come and go, and that's a testament to exactly what open source provides. We've seen the coming and going of the Sun Solaris web server, we've seen the coming and going of Netscape, and various others.”
Bill’s view is that there will always be new players in the web server space. He even hopes that is the case, as it spurs on competition, with new projects trying out new things without the constraints of existing projects.
With regard to NGINX, Bill added that he’s excited about it becoming popular: “So, NGINX can come in, present something, and say ‘we're not going to be the flexible end-all of web servers that Apache is, we're going to focus on specific problems’, and I expect them to do quite well,” he said.
"I'm very pleased that the project I've been involved with for ten years still has a commanding position in the web server space, but my primary concern is, does the project I'm involved with satisfy the needs of a significant group of people,” Bill finished.
Apache HTTP web server goes from strength to strength
It's clear that the Apache HTTP server is the dominating force in the web server space, even though its 20th birthday is only a few years away. And that's obvious not just from the market share Apache enjoys in surveys like that of NetCraft, but also from its mind share among developers and anyone else working with the Internet and networking.
We would like to thank Bill Rowe for so graciously taking the time to talk to us, and wish him and the rest of the Apache team the best of luck for the future. With millions of websites around the world depending on the software that these guys develop and maintain, we know that many of you will join us in saying "keep up the great work" to everyone at Apache.
Part of top picture via Shutterstock.