Comet has popularized asynchronous non-blocking HTTP programming, making it practically indistinguishable from reverse Ajax, also known as server push. In this article, Gregor Roth takes a wider view of asynchronous HTTP, explaining its role in developing high-performance HTTP proxies and non-blocking HTTP clients, as well as the long-lived HTTP connections associated with Comet. He also discusses some of the challenges inherent in the current Java Servlet API 2.5 and describes the respective workarounds deployed by two popular servlet containers, Jetty and Tomcat.
While Ajax is a popular solution for dynamically pulling data requests from the server, it does nothing to help us push data to the client. In the case of a Web mail application, for instance, Ajax would enable the client to pull mails from the server, but it would not allow the server to dynamically update the mail client. Comet, also known as server push or reverse Ajax, enhances the Ajax communication pattern by defining an architecture for pushing data from the server to the client. Comet enables us to push an event from the mail server to the WebMail client, which then signals the incoming mail.
Comet itself is based on creating and maintaining long-lived HTTP connections. Handling these connections efficiently requires a new approach to HTTP programming. In this article I introduce asynchronous, non-blocking HTTP programming and explain how it works. While I do present a Comet application at the end of the article, this style of programming is not restricted to Comet applications. Accordingly, this article describes asynchronous, non-blocking HTTP programming in general.
I start with an overview of client-based asynchronous message handling and message streaming, and then begin demonstrating the many uses of asynchronous HTTP on the server side. I explain the role and current limitations of the Java Servlet API 2.5, and demonstrate the use of the
xSocket-http library to work around some of these limitations. The article concludes with a look at a dynamic Web application that leverages the two techniques associated with Comet architectures: long polling and streaming. I also show how this application could be implemented on Jetty and Tomcat, respectively.
Asynchronous message handling
At the message level, asynchronous message handling means that an HTTP client performs a request without waiting for the server response. In contrast, when performing a synchronous call, the caller thread is suspended until the server response returns or a timeout is exceeded. At the application level, code execution is stopped, waiting for the response before further actions can be taken. Client-side synchronous message handling is very easy to understand, as illustrated by the example in Listing 1.
Listing 1. Client example — synchronous call
When performing an asynchronous call it is necessary to define a handler, which will be notified if the response returns. Typically, such a handler will be passed over by performing the call. The call method returns immediately. The application-level code instructions after the send statement will be processed without waiting for a server response. The server response will be handled by performing the handler’s callback method. If the response returns, the network library will execute the callback method within a network-library-controlled thread. If necessary, the request message has to be synchronized with the response message at the application-code level. An asynchronous call is shown in Listing 2.
Listing 2. Client example — asynchronous call
The advantage of this approach is that the caller thread will not be suspended until the response returns. Based on a good network library implementation, no outstanding threads are required. In contrast to the synchronous call approach, the number of outstanding requests is not restricted to the number of possible threads. The synchronous approach requires a dedicated thread for each concurrent request, which consumes a certain amount of memory. This can become a problem if you have many concurrent calls to be performed on the client side.
Asynchronous message handling also enables HTTP pipelining, which you can use to send multiple HTTP requests without waiting for the server response to former requests. The response messages will be returned by the server in the same order as they were sent. Pipelining requires that the underlying HTTP connection is in persistent mode, which is the standard mode with HTTP/1.1. In contrast to non-persistent connections, the persistent HTTP connection stays open after the server has returned a response.
Pipelining can significantly improve application performance when fetching many objects from the same server. The implicit persistent mode eliminates the overhead of establishing a new connection for each new request, by allowing for the reuse of connections. Pipelining also eliminates the need for additional connection instances to perform concurrent requests.
Message content streaming
Asynchronous message handling can improve application performance by avoiding waiting threads, but another performance bottleneck arises when reading the message content.
It is not unusual for an HTTP message to contain kilobytes of content data. On the transport level, such larger messages will be broken down into several TCP segments. The TCP segment size is limited and depends on the underlying network and link layer. For Ethernet-based networks the maximum TCP segment size is up to 1460 bytes.
Bodyless HTTP messages such as
GET requests don’t contain body data. Often the size of such bodyless messages is smaller than 1 kilobyte. Listing 3 shows a simple HTTP request.
Listing 3. HTTP request
GET /html/rfc2616.html HTTP/1.1 Host: tools.ietf.org:80 User-Agent: xSocket-http/2.0-alpha-3
The correlating response of the request shown above contains a message body of 0.5 megabytes. On a personal Internet connection, the response message shown in Listing 4 would be broken into several TCP segments when sent.
Listing 4. HTTP response
Data transfer fragmentation can be hidden at the API level by accessing the body data as a steady and continuous stream. This approach, known as streaming, avoids the need to buffer large chunks of data before processing it. Streaming can also reduce the latency of HTTP calls, especially if both peers support streaming.
Using streaming allows the receiver to start processing the message content before the entire message has been transmitted. Often an HTTP message contains unstructured or semi-structured data, such as HTML pages, video, or music files, which will be processed immediately by the receiving peer. For instance, most browsers start rendering and displaying HTML pages without waiting for the complete page. For this reason most HTTP libraries support a stream-based API to access the message content.
In contrast to the body data, the message header contains well-structured data entries. To access the message header data most HTTP libraries provide dedicated and typed setter and getter methods. In most use cases the header can only be processed after the complete header has been received. The HTTP/1.1 specification doesn’t define the order of the message headers, though it does state that it’s a good practice to send general-header fields first, followed by request-header or response-header fields, and ending with the entity-header fields.
Streaming input data
To process received body data in a streaming manner, the receiving peer has to be notified immediately after the message header has been received. Based on the message header information, the receiver is able to determine the type of the received HTTP message, if body data exists, and which type of content the body contains.
The example code in Listing 5 (below) streams a returned HTML page into a file. The response message data will be processed as soon as it appears. Based on the retrieved body channel the
transferFrom() implementation calls the body channel’s
read() method to transfer the data into the filesystem. This occurs in a blocking manner. If the socket read buffer is empty, the body channel’s
read() method will block until more data is received or the end-of-stream is reached. Blocking the read operation suspends the current caller thread, which can lead to inefficiency in system resource usage.
Listing 5. HTTP message example — blocking input streaming
To process the message body in a non-blocking mode, a handler similar to the one seen in the asynchronous message calling example from Listing 2 can be used. In this case, a non-blocking body channel will be retrieved instead of a blocking channel. In contrast to the blocking channel the non-blocking channel’s read() methods return immediately, whether data has been acquired or not. Notification support is required to avoid repeated, unsuccessful reads within a loop.
The BodyToFileStreamer of the example code in Listing 6 implements such a notification callback method. After retrieving the non-blocking body channel, the body handler will be assigned to the channel. The setDataHandler() call returns immediately. Setting the handler ensures that the body channel checks whether data is already available. If data is available, the handler’s onData() method is run.
The callback method is also called each time body data is available. The network library takes a (pooled) worker thread to perform the callback method. This thread is only assigned to the body channel as long as the callback method is executed. For this reason no outstanding threads are required.
Listing 6. HTTP message example — non-blocking input streaming
Streaming output data
The streaming approach can also be used when sending message data. This avoids buffering large chunks of data. To do this, the message content will be transferred during the method call by using an
InputStream or a
ReadableByteChannel. After writing the message header, the body data will be transferred based on the body stream or channel. Listing 7 is an example of how implicit output streaming works. In this case the output streaming will be managed by the network library. Performing the HTTP call means that the user has to pass over a channel object, which represents the handle of a streamable resource.
Listing 7. Client example — implicit output streaming
In some use cases the output (or body) streaming should be managed by application-level user code. An explicit, user-managed steaming approach requires that the user retrieves an output channel to write the body data. In Listing 8 a message header object is sent instead of a complete message object after the
send() method has been called. This method call responds immediately by returning an output body channel object, which will be used by the application code to write the body data. The message-send procedure finalizes by calling the body channel’s
Listing 8. Client example — user-managed output streaming
Both approaches, streaming input data and streaming output data, will read and write data as soon as it appears. However,streaming doesn’t mean that the data will be read or written directly to the network. All read and write operations work on internal socket buffers. When a write method is called, the operating system kernel transfers the data to the socket’s
sendbuffer. Returning from the write operation just says that the data has been copied to this low-level
send buffer. It doesn’t say that the peer has received the data.
Restrictions of the Java Servlet API 2.5
All of the examples in the previous sections show different ways of handling messages and content on the client side. As you will see later in Listing 10, it is possible to use the same programming style, as well as the same input and output message object representations, on the server side in a very seamless way. When you develop server-side HTTP-based applications, however, you must give consideration to the Java Servlet API.
The Servlet API defines a standard programming approach for handling HTTP requests on the server side. Unfortunately, the current Servlet API 2.5 supports neither non-blocking data streaming nor asynchronous message handling. When you implement a servlet’s service method such as
doGet(), the application-specific servlet code will read the request data, perform the implemented business logic, and return the response. To simplify writing servlets, the Servlet API uses a single-threaded programming approach. The servlet developer doesn’t have to deal with threading issues such as starting or joining threads. Thread management is part of the servlet engine’s responsibilities. Upon receiving an HTTP request the servlet engine uses a (pooled) worker thread to call the servlet’s service method.
The downside of the Servlet API 2.5 is that it only allows for handling messages in a synchronous way. The HTTP request and response object have to be accessed within the scope of the request-handling thread. This message-handling approach is sufficient for most classic use cases. When you begin working with event-driven architectures such as Comet or middleware components such as HTTP proxies, however, asynchronous message handling becomes a very important feature.
When implementing an HTTP proxy, for instance, a request message has to be forwarded, and the response message has to be returned without wasting a request-handling thread for each open call. When you implement an HTTP proxy based on the current Servlet API, each open call requires one worker thread. The number of concurrent proxied connections is restricted to the number of possible worker threads.
Writing a synchronous HTTP proxy
Listing 9 shows an HTTP proxy based on the Servlet API. The servlet’s
doGet() method will be called each time a new
GET request is received. After some proxy-related handling the request will be copied and forwarded using
HttpClient. Upon receiving the correlating response some proxy-related handling will be performed and the
HttpClient response message will be copied to the servlet response message. After leaving the
doGet() method the servlet engine finalizes the response message.
Listing 9. Synchronous proxy example (GET request proxy)
Using the asynchronous
send() method instead of the
call() method won’t help. The current Servlet API doesn’t support writing requests out of the scope of the servlet request-handling thread. In essence, the Servlet 2.5 API is insufficient to write an asynchronous message-handling proxy.
Writing an asynchronous HTTP proxy
Writing an asynchronous message-handling proxy requires using an API other than the Servlet 2.5 specification. The HTTP proxy in Listing 10 is based on the same network library (xSocket-http) used in the previous client-side examples.
xSocket-http is an extension module of the xSocket network library that supports HTTP programming on the server side, as well as the client side. The network library is independent of the Servlet API and does not implement a servlet container.
Whereas the Servlet API uses an
HttpServletResponse object to send a response message, the
xSocket-http network library uses an
HttpResponseContext object. The
xSocket-http network library doesn’t pre-create a response message in an implicit way. Furthermore, in contrast to the Servlet API, neither the request object nor the response-context object is bound to the request-handling thread. Both artifacts can be accessed outside the network’s library-managed threads.
Like the servlet’s
doGet() method, the
onResponse() method will be called each time a request is received. After performing some proxy-related code the received request message will be forwarded using the asynchronous
send() method. This method requires a response handler to handle the received response message. As you saw in Listing 2, using the
send() method avoids the need for outstanding threads.
The most important aspect of this implementation is that the available threads don’t restrict the number of concurrent proxied connections. The scalability of such an asynchronous proxy is only driven by the message-parsing cost and the capability to maintain the required system resources for an open TCP connection in an effective way. Each open TCP connection requires a certain number of socket buffers, control blocks, and file descriptors at the operating-system level.
Listing 10. Asynchronous proxy example
onRequest() methods in Listing 10 will be performed immediately after the message header is received. (The
xSocket-http module also supports an
InvokeOn annotation to specify if the callback method should be performed after receiving the message header or after receiving the complete message).
To reduce the required buffer sizes and to minimize call latency, the message body should be streamed. In Listing 10 this will be done implicitly by the
xSocket-http module. The environment detects that an incomplete received message should be forwarded and registers a non-blocking forward handler on the incoming message-body channel.
Lower buffer sizes are required within the proxy because only parts of the message have to be buffered internally when it is transferred. Furthermore, the latency incurred by forwarding the message could be reduced significantly. If the message is forwarded in a non-streaming manner, first the whole message will be received and buffered before being forwarded. This adds the elapsed time between receiving the first and last message byte to the complete call latency.
Upload data streaming in Java Servlet API 2.5
Streaming is also supported by the current Servlet API, but only in a blocking way. When running the
UploadServlet in Listing 11 in Tomcat (version 6.0.14, default configuration on Windows), the
doPost() method will be called immediately after the request header is received. This allows you to stream the incoming message body. The
UploadServlet reads some message-header entries and streams the message body into a file. If not enough data is available, the
HttpServletRequest‘s input stream
read() method will block. This means the request-handling thread will be suspended until more data is received.
Listing 11. Upload servlet example
Non-blocking upload data streaming
Outstanding threads can be avoided by using non-blocking streams. Listing 12 shows an
UploadRequestHandler, which reads and transfers the incoming message body in a non-blocking way. Similar to the client-side non-blocking streaming example in Listing 6, a non-blocking body channel will be retrieved and a body-data handler will be set. After this operation the
onRequest() method returns immediately, without sending a response message. If body data is received, the body-data handler will be called to transfer the available body data into a file. If the complete body is received, the response message will be sent.