This document provides an overview of administration and architecture of web-based services. Most “real” systems administration work involves a fair amount of wrangling HTTP servers, so the core information in this document will likely be useful to most. As a secondary benefit, HTTP provides a good basis to address a number of key service-administration topics and issues, which are relevant to all systems administrators.
There’s more to the Internet and to network services than HTTP and “the web,” but the moment for protesting the conflation of HTTP and “the Internet” has passed. For whatever it’s worth, HTTP has become the UNIX pipe of the Internet. Web browsers, of course, implement HTTP, but for testing purposes you should familiarize yourself with curl, which is a versatile command-line client for HTTP (among other protocols).
There are a few terms with specific meanings in the context of HTTP. Consider these basic concepts:
HTTP transmits metadata regarding content in key/value pairs called headers. Headers are largely arbitrary, though different kinds of clients have different requirements around headers. Use “curl -I” to return a list of headers. See the following headers from tychoish.com:
Server: nginx/0.7.67
Date: Sun, 06 Nov 2011 14:26:07 GMT
Content-Type: text/html
Content-Length: 8720
Last-Modified: Fri, 04 Nov 2011 13:39:02 GMT
Connection: keep-alive
Accept-Ranges: bytes
HTTP also provides for a set of headers in the request. They take the same key/value form as response headers, although the set of keys is different.
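For example, you can set or override request headers with curl’s “-H” flag, as in “curl -I -H "Accept-Language: en" http://tychoish.com/”; servers may use request headers like this to vary the response they return.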
When you run “curl -I”, the first line of the response provides the HTTP status. The example above omitted the following status line:
HTTP/1.1 200 OK
This conveys the version of the HTTP protocol in use, [1] a status code (e.g. 200,) and a human-intelligible translation of that code. You probably already know code 404, which servers return when they can’t find the requested resource. 200, as above, is the status code for “resource returned successfully.”
There are a few dozen HTTP codes, and while some administrators have memorized the HTTP status codes, there is no need: in general, 300 series codes reflect a redirection (e.g. “the resource you’re looking for is somewhere else,”) 400 series codes reflect some sort of error or problem that the server has with the request, and 500 series codes reflect some sort of “internal error,” usually related to the server’s configuration or state.
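If you only care about the numeric code, curl can print it on its own: a command along the lines of “curl -s -o /dev/null -w "%{http_code}" http://tychoish.com/” discards the response body and writes just the status code.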
The following codes are useful to know on sight. Use a reference for everything else:
Code | Meaning |
200 | Everything’s ok. |
301 | Resource Moved Permanently. Update your bookmarks. |
302 | Moved Temporarily. Don’t update your bookmarks. |
400 | Error in request syntax. Client’s fault. |
401 | Authorization required. Authenticate (e.g. with HTTP Auth) to retrieve the resource. |
403 | Access Denied. Bad HTTP Auth credentials or permissions. |
404 | Resource not found. Typo in the request, or data not found. |
410 | Resource removed from server. This is infrequently used. |
418 | I’m a teapot. From an April Fools’ RFC, and socially useful. |
500 | Internal server error. Check configuration and error logs. |
501 | Requires unimplemented feature. Check config and error logs. |
502 | Bad gateway. Check proxy configuration. Upstream server error. |
504 | Gateway timeout. Proxy server not responding. |
Often server logs will return more useful information regarding the state of the system.
[1] | All HTTP services comply, or attempt to comply, with version 1.1. |
HTTP provides a very full-featured interface for interacting with remote resources, although in common use the GET, POST, and PUT requests account for the overwhelming majority of traffic. Request types are generally referred to as “methods.” Strict semantic adherence to HTTP methods is one of the defining aspects of “REST.”
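curl sends GET requests by default; to exercise other methods, use its “-X” flag, as in “curl -X POST -d "title=example" http://tychoish.com/posts/” (a hypothetical endpoint) to submit data, or “-X PUT” to replace a resource.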
Conceptually, it’s best to think about HTTP in terms of static content conveyed directly from the server’s file system, through the web server, to the end-user’s client. In truth, operations are more complex: most HTTP deployments make use of clustered servers and server-side processing.
There are a couple of very simple abstractions that most general-purpose web servers provide that make it easier to deploy web services using HTTP: proxy handling, where a server “passes” a request to another server, and URL rewriting, where the server maps incoming requests to different internal paths and resources.
Most general-purpose web servers have the ability to proxy or forward incoming requests to a different HTTP (or CGI, FastCGI, or similar) server. Proxying requires minimal overhead on the part of the front-end server and makes it possible to host an entire domain or sub-domain using a cluster of machines, or to expose the resources of a group of machines behind a single public IP address. As a result, proxies are essential for scaling web services horizontally, and most deployments of any size use these abstractions to distribute resources. Think of proxying as an instance of partitioning or horizontal scaling.
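As a concrete illustration, here is a minimal nginx front-end sketch that passes all requests for one hostname to an application server on another machine; the hostname, address, and port are hypothetical.

server {
    listen 80;
    server_name app.example.com;

    location / {
        # hand the request to a hypothetical back-end server
        proxy_pass http://10.0.0.12:8080;
        # preserve the original host and client address for the back-end
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}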
Load balancing, then, refers to proxy configurations where a cluster of identical servers provides a single resource. The proxy server in these situations must distribute requests among the nodes, (optionally) track connections to ensure that nodes remain responsive, and, in some configurations, ensure that connections from a single client are consistently routed to the same back-end server when possible. Load balancers can also distribute requests unevenly among the nodes when systems have different capacities, and they offer different possible responses to node failures. Load-balanced architectures are simple examples of scaling through replication.
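As a sketch, a load-balanced pool in nginx might look like the following; the addresses and weights are hypothetical. Substituting “proxy_pass http://app_pool;” into a configuration like the one above would then spread requests across the pool.

upstream app_pool {
    # send twice as many requests to the first node
    server 10.0.0.21:8080 weight=2;
    server 10.0.0.22:8080;
    # keep the last node idle unless the others fail
    server 10.0.0.23:8080 backup;
}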
URL rewriting allows httpd programs to map request strings to actual resources, which may have different names or locations. URL rewriting engines often support regular-expression matching, and can perform either transparent rewriting, where the URL is rewritten internally and the client is unaware, or “redirections,” where the server tells the client to request a different URL, which requires an additional round trip.
This ability to present logical URLs to users without affecting the organization of the “back-end” can be incredibly liberating for administrators and developers. Most web development frameworks provide some level of URL abstraction, but having this ability in the web server is also helpful. While there are some quirks to each server’s rewriting engine, all systems are roughly similar.
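For instance, in nginx the two behaviors look roughly like this; the paths and patterns are made up for illustration.

# transparent rewrite: the client still sees /reports/1234
rewrite ^/reports/(\d+)$ /cgi-bin/report.cgi?id=$1 last;

# redirection: the client receives a 301 and makes a second request
rewrite ^/old-blog/(.*)$ /posts/$1 permanent;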
HTTP was really designed to serve static content, and most general-purpose web servers (and browsers) are optimized for this task. Web browsers make multiple requests in parallel to download embedded content (i.e. images, style sheets, JavaScript) “all at once” rather than sequentially, and general-purpose HTTPDs are pretty good at serving content efficiently under this pattern.
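A bare-bones static-content configuration for nginx might look something like this; the hostname and paths are examples, not a recommendation.

server {
    listen 80;
    server_name static.example.com;
    # serve files directly from the filesystem
    root /srv/www/static;

    location / {
        # let clients and intermediate caches hold on to responses briefly
        expires 1h;
        try_files $uri $uri/ =404;
    }
}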
Serving static content with HTTP is straightforward; when you need to assemble content dynamically per request, you need a more complex system. From here, the kind of dynamic content you require and the existing applications and tools you want to use dictate your architecture, at least to some extent.
CGI, FastCGI, SCGI, PSGI, WSGI, and Rack are all protocols used by web servers to communicate with application servers that generate content dynamically. In summary, users make HTTP requests against a web server (httpd), which passes the request to a separate process; that process generates a response and hands it back to the HTTP server, which returns the result to the user. While this seems a bit complex, in practice CGI and related protocols have simple designs, robust toolsets, are commonly understood by many developers, and (generally) provide excellent process/privilege segregation.
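By way of example, an nginx-to-FastCGI handoff might be configured along these lines; the socket path and script location are hypothetical.

location /app/ {
    include fastcgi_params;
    # pass matching requests to an application process on a local socket
    fastcgi_pass unix:/var/run/app.fcgi.sock;
    fastcgi_param SCRIPT_FILENAME /srv/www/app/handler.fcgi;
}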
There are fundamental differences between these protocols, even though their overall method of operation is similar.
While CGI and FastCGI have defined dynamic web applications since the earliest days of HTTP and the web, the other interfaces mentioned above largely emerged in the context of more recent web application development frameworks such as Ruby on Rails, Django, and Catalyst.
There are a number of recent application servers that implement HTTP itself rather than an intermediate protocol. While these are efficient for serving dynamic content, they’re less efficient at serving static resources and cannot support heavy loads on their own. In production environments, administrators typically cluster these servers behind a general-purpose httpd that proxies requests back to the application servers. In this respect, such servers are operationally similar to FastCGI application servers, but may be easier for administrators and developers to work with.
In contrast to application servers that operate as web servers or use a gateway interface, some general-purpose httpd implementations make it possible to embed a programming language interpreter within the web server process. Implementations vary by language and by web server, but they are often very powerful, at the expense of some idiosyncrasies. For quite a while, these methods were the prevailing practice for deploying dynamic content.
This practice is most common in the context of the Apache HTTPD Server with Perl (mod_perl) and PHP (mod_php). While there are also Ruby (mod_ruby) and Python (mod_python) implementations of these methods, the community has abandoned these modules in favor of other application deployment methods.
With the exception of mod_php, the embedded interpreters all require you to restart Apache when deploying new code. Additionally, all code run by way of an interpreter embedded in the web server process runs with the permissions of the web server. These operational limitations make this approach less ideal for shared environments.
Even though these tools are less popular today than other options, there are still circumstances where it makes sense to use them for your application.
[2] | In the last couple of years, PHP-FPM has made PHP much easier to run as FastCGI. |
Often, web servers are pretty straightforward to run and have low resource requirements. However, in a number of situations web servers face scaling challenges: when they must handle extraordinary load, when they must generate dynamic content for each request (which uses significantly greater resources), and when services are critical and downtime is not an option.
In nearly every case, the database layer presents a larger scale-related challenge than the web server tier. See “Database Capacity and Scaling” for more information related to database scaling and architecture.
If HTTP becomes a bottleneck, as with databases, there is a progression of measures you can take to help ensure that your deployment can deal with the traffic you expect to face. In general, consider the following process:
Begin by moving the database engine to a separate system and ensuring that your method of serving dynamic content is well tuned. In many cases, the default configuration for dynamic content (CGI, Apache, etc.) is poorly tuned; for example, the application server or the httpd may have a low maximum connection threshold, so the process starts refusing or queuing requests well before the hardware reaches capacity. Connection timeouts, application timeouts, and approaches to concurrency (threading, forking, event-driven, etc.) can all affect performance, and you must understand and address these issues before taking other approaches.
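As an illustration of the kinds of knobs involved, the nginx directives below bound a few of these values for a proxied application; the numbers (and the app_pool upstream) are placeholders, not recommendations.

location / {
    proxy_pass http://app_pool;
    # give up quickly if the back-end won't accept the connection
    proxy_connect_timeout 5s;
    # allow slower dynamic responses before timing out
    proxy_read_timeout 30s;
}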
Next, decouple HTTP services along logical boundaries, so that it’s easy to increase capacity for the application and for static content separately. If your application or “site” depends on multiple applications and runtimes, make sure that the services run distinctly and that all components can operate independently with minimal dependencies.
The key to making sure the site works as a whole is to use a load-balancing proxy server in front of the servers that provide your core application and content. This layer makes it possible for users of your service to have the experience of using one system when in fact a cluster of systems supplies the service.
Note
Because most application servers are single-threaded, it often makes sense to run two to four application server processes per system (each on a distinct TCP port).
Typically, run one application server or web server worker process per core.
The number of “worker” processes for the nginx server defaults to 1 for most distributions. As a result, unless you modify the configuration, nginx will only use one processor core.
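For example, the relevant nginx settings look like this; pick values that match your core count and expected connection volume.

# roughly one worker per processor core
worker_processes 4;

events {
    # maximum simultaneous connections per worker
    worker_connections 1024;
}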
You may architect systems with this scaling eventuality in mind from the very beginning by using virtual hosting and private networks to separate the application layer from the “front-end” HTTP servers. Once you’ve segregated application and static HTTP content, it’s a relatively simple matter to cluster specific components and add capacity “horizontally.” As the application layer becomes saturated, deploy more instances of the application service and use the load balancer to distribute load among those nodes; you can safely repeat this for each component service.
Note
If your system supports or requires a higher standard of availability, it’s also good to keep at least two front-end proxy servers in rotation at any given time, by adding multiple DNS records for a single hostname that point to multiple hosts running identical front-end services.
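For example, publishing two A records for www.example.com (a placeholder hostname) that point to 192.0.2.10 and 192.0.2.11 lets resolvers rotate between the two front-end proxies, so traffic is roughly split between them.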
See also
While availability and scaling are not necessarily linked tasks, consider the material covered in “High(er) Availability Is a Hoax” when thinking about architecture.
Until now, this document has approached HTTP and web servers in generic terms, as if all implementations were equivalent. In truth, there are some significant differences.
To be fair, HTTP is pretty simple, and strictly speaking there’s no need for a big multi-purpose web server. It’s entirely possible to use inetd and a little bit of code (you might as well do this in C) to create your own httpd. Every time a request comes in, inetd spawns a copy of your daemon, which handles the request and then terminates. The HTTP protocol is straightforward, so the server is easy to implement, the binary is small, and you have total control over the behavior of the server by changing some values in a header file (and recompiling, of course.) I’m aware of at least one pretty high-traffic site that does things in this manner, and it works. Surprisingly well. So that’s an option, but perhaps a bit beyond most mortals.
The remainder of this document, then, provides an overview of a process that you might use to choose an HTTP server for your deployment, as well as an overview of contemporary open-source HTTP servers.
You should evaluate web servers on several dimensions, including RAM usage, configuration method, compatibility and interoperability with your application servers, and resource utilization under load.
Web servers take three basic approaches to handling concurrent requests: forking (a new process per connection), threading (a thread per connection), and queuing (an event loop that services many connections asynchronously).
It’s difficult to talk about HTTP, or even open source and Linux, without considering the impact of the Apache httpd. Apache descends from the original httpd developed at the NCSA. The server is highly modular and incredibly flexible, having grown out of a series of “patches” to the original HTTPD (hence the name: “a patchy web server”). The Apache project consolidated in the 1990s, is generally regarded as one of the early technological successes of open source, and likely fueled much of the early adoption of GNU/Linux systems.
Today, the server itself remains popular and very useful, with most administrators having some level of familiarity with Apache and its configuration. It is well documented, supported on many platforms, and works with many tools as a result of its wide adoption. Apache is incredibly stable and robust as a result of its extensive use. At the same time, Apache is not very efficient, particularly compared with more recent competitors. While this criticism comes up frequently, it’s probably overblown: the demands on the vast majority of web servers will never surpass the ability of a well-tuned Apache instance on even modest hardware.
nginx (pronounced “engine x”) is my personal favorite web server. It’s simple, functionally complete, uses very little memory, and performs reliably under every circumstance I’ve been able to throw at it. The configuration syntax is straightforward and makes many of the more complicated Apache configurations simple. Many appreciate its aptitude for serving as an HTTP proxy and software load balancer, and it can also serve as a proxy for high-volume mail servers. The best part of nginx is that it just works.
There are edge cases where it makes sense to use another server (typically Apache,) and some edges remain rough: plain CGI support isn’t particularly smooth, [3] authentication options are limited, [4] and there is no pluggable caching layer. Still, there are few reasons to use something other than nginx.
[3] | Admittedly, the problem is largely with CGI itself. Given the option, I tend to prefer nginx’s externalization of this and its configuration of FastCGI processes. |
[4] | nginx only supports basic HTTP authentication. The limitation is largely a flaw in HTTP itself, but support for digest authentication would, at the very least, be nice. |
Lighttpd (pronounced “lighty”) was one of the first servers to use the queuing methodology. Because Lighttpd was stable very early and deployed in a number of high-profile situations, it became a favorite. In addition to generally efficient operation, it offers a minimalist Lua-based configuration which permits dynamic virtual-host configuration and some additional flexibility. Lighttpd, like nginx, supports FastCGI natively, and its developers produced the widely used “spawn-fcgi” tool for starting FastCGI servers.
Unfortunately, development on Lighttpd has stalled, and there is a persistent memory-leak issue which forces administrators to restart the server every couple of days. Since late 2009, there has been little reason to use Lighttpd unless you need the easy virtual-host configuration, can’t find similar functionality in other tools, and don’t mind restarting the server periodically to work around a known problem.
The inclusion of AntiWeb is something of an outlier, but it’s a cool project and it may be a good introduction to HTTP servers for someone interested in the technology at a lower level. AntiWeb is a Common Lisp web server that uses the event-driven/asynchronous approach like Lighttpd or nginx, but it inherits some pretty innovative ideas regarding web development and HTTP from the Lisp world. While you might not use AntiWeb as your next httpd, it’s worth investigating by anyone who’s interested in web servers and web applications.
Cherokee views itself as the successor to Apache (hence the name) and combines Apache’s ease of use with the performance of event-driven servers like nginx. Its main selling point is easy configuration, which it accomplishes by way of a web-based interface.