Serve web pages using an embedded HTTP server in Java
Tested with OpenJDK 6 on:
- Debian (Lenny, Squeeze)
- Ubuntu (Hardy, Intrepid, Jaunty, Karmic, Lucid, Maverick, Natty, Oneiric, Precise, Quantal)
Tested with OpenJDK 7 on:
- Ubuntu (Oneiric, Precise, Quantal)
Objective
Serve web pages using an embedded HTTP server in Java
Scenario
Suppose that you are writing a web application in Java. You have decided that for the task in hand a self-contained solution would be preferable to one based on servlets, but you would prefer not to implement a complete HTTP server from scratch.
Method
Overview
An embedded HTTP server can be added to a Java program using classes from the package com.sun.net.httpserver (added in Java 1.6). The method described here has three steps:
- Construct an HTTP server object.
- Attach one or more HTTP handler objects to the HTTP server object.
- Start the HTTP server.
The following imports are assumed:
    import java.io.*;
    import java.net.*;
    import com.sun.net.httpserver.*;
Construct an HTTP server object
It is not possible to construct an HttpServer instance directly because it is an abstract class. Instead there is a factory method called create:
HttpServer server = HttpServer.create(new InetSocketAddress(80),0);
The first argument is the socket address (IP address and port number) on which the server should listen. In this instance only the port number has been specified. This causes the server to be bound to the wildcard IP address, allowing it to accept connections via any local network interface. For a more secure configuration (which only accepts connections from the local machine) you can instead bind to the loopback address:
HttpServer server = HttpServer.create(new InetSocketAddress(InetAddress.getLoopbackAddress(), 80),0);
(Note that getLoopbackAddress was added in Java 1.7. For Java 1.6 it is necessary to use a less satisfactory method such as resolving the hostname localhost.)
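For Java 1.6, two possible workarounds are sketched below. The first builds the loopback address from its literal bytes with InetAddress.getByAddress, which avoids any name lookup; the second resolves the hostname localhost, so its result depends on how the local resolver is configured. The class name LoopbackExample and its method names are illustrative and not part of the original method.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import com.sun.net.httpserver.HttpServer;

class LoopbackExample {
    // Build 127.0.0.1 from raw bytes: no DNS or hosts-file lookup is involved.
    static InetAddress loopbackByBytes() throws UnknownHostException {
        return InetAddress.getByAddress(new byte[] {127, 0, 0, 1});
    }

    // Alternative: resolve the conventional hostname (depends on resolver configuration).
    static InetAddress loopbackByName() throws UnknownHostException {
        return InetAddress.getByName("localhost");
    }

    // Create (but do not start) a server that only accepts local connections.
    static HttpServer createLocalServer(int port) throws IOException {
        return HttpServer.create(new InetSocketAddress(loopbackByBytes(), port), 0);
    }
}
```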
The second argument is the backlog of outstanding connections that the operating system should queue while they are waiting to be accepted by the server process. If set to zero then a default value is used, which should be suitable for most purposes.
Attach one or more HTTP handler objects to the HTTP server object
Like any web server, an HttpServer object needs to be told what content to serve in response to a given URL. This is done by attaching one or more handler objects to the server. Handlers must implement the interface com.sun.net.httpserver.HttpHandler. This has a method called handle which the server calls in response to each HTTP request.
The minimum that the handler must do is to:
- send a set of HTTP response headers back to the client by calling sendResponseHeaders,
- send the body of the HTTP response (assuming that one is needed) to an OutputStream obtained by calling getResponseBody, then
- close the above output stream to indicate that the response is complete.
    class MyHttpHandler implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            // Encode first, so that the declared length is a byte count
            // rather than a character count.
            byte[] response = "Hello, World!\n".getBytes();
            t.sendResponseHeaders(200, response.length);
            OutputStream os = t.getResponseBody();
            os.write(response);
            os.close();
        }
    }
In practice you will almost certainly want to specify a content type and make the response depend upon the URL.
These and other refinements are discussed later. Note that the OpenJDK implementation automatically adds headers that are required by the HTTP protocol specification (Connection and Date).
The first argument to sendResponseHeaders is the HTTP status code. A value of 200 indicates that the request was successful, whereas 404 means that the URL was not found. A list of permitted status codes can be found in the HTTP protocol specification (RFC 2616, since superseded by RFC 9110).
The second argument to sendResponseHeaders is the length of the response body in bytes (after any encoding has taken place). If this is not known then a value of zero should be given, in which case the HTTP server cannot send a Content-Length header and will typically use chunked transfer encoding instead. Specifying a non-zero value commits you to writing exactly that number of bytes to the output stream.
If you do not want to send a response body at all then the length should be set to -1. Note that there is a difference between an empty response body and an absent one. The OpenJDK implementation will force the response body to be absent if required by the status code.
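The three cases (known length, unknown length, and no body) can be illustrated with the following sketch. The class and handler names are illustrative, and status code 204 (No Content) is used here only because it is a natural companion to an absent body.

```java
import java.io.IOException;
import java.io.OutputStream;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

class LengthModes {
    // Known length: declare the exact byte count, then write exactly that many bytes.
    static class FixedLength implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            byte[] body = "fixed\n".getBytes("UTF-8");
            t.sendResponseHeaders(200, body.length);
            OutputStream os = t.getResponseBody();
            os.write(body);
            os.close();
        }
    }

    // Unknown length: pass 0, then write as much as needed before closing.
    static class UnknownLength implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            t.sendResponseHeaders(200, 0);
            OutputStream os = t.getResponseBody();
            os.write("first part\n".getBytes("UTF-8"));
            os.write("second part\n".getBytes("UTF-8"));
            os.close();
        }
    }

    // No body at all: pass -1 and do not write anything.
    static class NoBody implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            t.sendResponseHeaders(204, -1);
            t.close();
        }
    }
}
```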
To make use of the handler class, it must be instantiated then attached to the HTTP server object:
server.createContext("/", new MyHttpHandler());
The first argument is the path with respect to the HTTP server document root to which the handler object should be attached. In this example it has been attached to the root. The path must be an absolute one so always begins with a forward slash character.
Start the HTTP server
The HTTP server will not accept connections until its start method is called:
server.start();
Control returns to the caller immediately: the server runs in a background thread.
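Should you wish to stop the server this can be done by calling its stop method, which takes as its argument the maximum time in seconds to wait for exchanges in progress to finish.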
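Putting the three steps together, a complete program might look like the sketch below. The class names HelloServer and HelloHandler, the helper startServer and port 8080 are choices made for this illustration; an unprivileged port is used because binding to port 80 normally requires elevated privileges. The code uses getLoopbackAddress and therefore assumes Java 1.7 or later.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

class HelloServer {
    static class HelloHandler implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            byte[] body = "Hello, World!\n".getBytes("UTF-8");
            t.sendResponseHeaders(200, body.length);
            OutputStream os = t.getResponseBody();
            os.write(body);
            os.close();
        }
    }

    // Construct the server, attach a handler, then start it.
    // A port of 0 asks the operating system to pick any free port.
    static HttpServer startServer(int port) throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 0);
        server.createContext("/", new HelloHandler());
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer(8080);
        System.out.println("Listening on " + server.getAddress());
        // start() has already returned; call server.stop(...) to shut down.
    }
}
```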
Variations
Make the response depend on the requested URL
There are three ways to make the response depend on the URL:
- by attaching instances of multiple handler classes,
- by attaching multiple instances of the same handler class, or
- by inspecting the requested URL within the handler method.
If more than one handler matches a given request then the one attached to the longest path will be used. This makes it possible to, for example, provide dedicated handlers for some of the pages of a web site, then cover the remainder with a fallback.
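As a sketch of the dedicated-handler-plus-fallback arrangement, the example below attaches two instances of one handler class to different paths. The names RoutingExample and TextHandler, and the path /about, are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

class RoutingExample {
    // One handler class, instantiated with different text for each context.
    static class TextHandler implements HttpHandler {
        private final String text;

        TextHandler(String text) {
            this.text = text;
        }

        public void handle(HttpExchange t) throws IOException {
            byte[] body = text.getBytes("UTF-8");
            t.sendResponseHeaders(200, body.length);
            OutputStream os = t.getResponseBody();
            os.write(body);
            os.close();
        }
    }

    static HttpServer startServer(int port) throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 0);
        // Fallback: matches every path, but loses to any longer matching path.
        server.createContext("/", new TextHandler("fallback\n"));
        // Dedicated handler: preferred for /about because its path is longer.
        server.createContext("/about", new TextHandler("about\n"));
        server.start();
        return server;
    }
}
```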
Within a handler the requested URL can be obtained from the HttpExchange object using its getRequestURI method. For normal HTTP requests this contains only the path and perhaps also a query string.
If it is wanted, the authority component (hostname and port) can be obtained using one of the following methods (in descending order of preference):
- from the requested URL;
- from the Host header of the HTTP request; or
- from a default that is implicit to the web server.
HTTP/1.1 clients are not normally supposed to include the authority when requesting a URL, but HTTP/1.1 servers must be able to accept this form. The Host header is required by HTTP/1.1, but not by HTTP/1.0.
Servers are allowed to disregard the authority component if they only serve one site. Similarly, the query component may be and usually would be disregarded for pages that do not support queries. The fragment identifier (if there was one) should have been removed by the client as it is not relevant to the server.
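The order of preference for finding the authority could be expressed as a small helper like the one below. This is only a sketch: the class AuthorityExample, the method authority and its fallback parameter are hypothetical. Within a handler it might be called as authority(t.getRequestURI(), t.getRequestHeaders().getFirst("Host"), "localhost:8080"), where the final argument stands for whatever default suits the server.

```java
import java.net.URI;

class AuthorityExample {
    // Choose the authority using the order of preference: request URL,
    // then Host header, then a server-wide default.
    static String authority(URI requestUri, String hostHeader, String fallback) {
        if (requestUri.getAuthority() != null) {
            return requestUri.getAuthority();   // the request URL included an authority
        }
        if (hostHeader != null && hostHeader.length() > 0) {
            return hostHeader;                  // the usual case for HTTP/1.1
        }
        return fallback;                        // e.g. an HTTP/1.0 client without a Host header
    }
}
```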
The path can be extracted from the URI object using its getPath method.
Here is an example which echoes the requested path back to the client:
    public void handle(HttpExchange t) throws IOException {
        URI uri = t.getRequestURI();
        // Encode before sending the headers so that the length is in bytes.
        byte[] response = ("Path: " + uri.getPath() + "\n").getBytes();
        t.sendResponseHeaders(200, response.length);
        OutputStream os = t.getResponseBody();
        os.write(response);
        os.close();
    }
Note that you should inspect the URL even when writing a handler for a single page, because handlers are matched by path prefix: otherwise (for example) a handler attached to /foo.html would also be invoked for /foo.html/bar, when the latter should probably return a 404 error.
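One way to perform that check is sketched below: the handler compares the requested path with the single path it is meant to serve, and answers anything else with a 404. The class name ExactPathHandler and its constructor argument are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

class ExactPathHandler implements HttpHandler {
    private final String expectedPath;  // the one path this handler should serve

    ExactPathHandler(String expectedPath) {
        this.expectedPath = expectedPath;
    }

    public void handle(HttpExchange t) throws IOException {
        // The context matches by prefix, so compare the whole path here.
        boolean match = expectedPath.equals(t.getRequestURI().getPath());
        byte[] body = (match ? "page content\n" : "404 (Not Found)\n").getBytes("UTF-8");
        t.sendResponseHeaders(match ? 200 : 404, body.length);
        OutputStream os = t.getResponseBody();
        os.write(body);
        os.close();
    }
}
```

It would be attached with the same path in both places, for example server.createContext("/foo.html", new ExactPathHandler("/foo.html")).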
Specify a media type
When serving content via HTTP you should specify the type of the data if it is known.
This is done using the Content-Type header, for example:
    Headers h = t.getResponseHeaders();
    h.set("Content-Type", "text/html");
The value of the header is a Media Type (often called a MIME Type), as defined by RFC 4288. In this instance it indicates that the content is an HTML document. Examples of commonly-used media types include:
application/octet-stream | Unstructured data |
application/xhtml+xml | XHTML document |
image/gif | GIF image |
image/jpeg | JPEG image |
image/png | PNG image |
text/html | HTML document |
text/plain | Unstructured text |
For text-based formats it is permissible (but not required) for the character set to be specified as part of the media type, for example:
    Headers h = t.getResponseHeaders();
    h.set("Content-Type", "text/html; charset=iso-8859-1");
If part or all of the content type is missing then the user agent is allowed to guess, normally by inspecting the content or by looking for a file type extension at the end of the URL. If in doubt as to the correct choice of media type it may therefore be better to leave it unspecified (or to refrain from specifying the character set) than to risk sending incorrect information.
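When a character set is declared, the bytes written must actually use it, and the header must be set before sendResponseHeaders is called. The sketch below keeps the two consistent by encoding the text as UTF-8 and declaring the same charset; the class name Utf8HtmlHandler and the choice of UTF-8 are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

class Utf8HtmlHandler implements HttpHandler {
    public void handle(HttpExchange t) throws IOException {
        // \u00e9 is e-acute, which occupies two bytes in UTF-8.
        String html = "<!DOCTYPE html><html><body><p>caf\u00e9</p></body></html>\n";
        byte[] body = html.getBytes("UTF-8");

        // Declare the same character set that was used for the encoding.
        Headers h = t.getResponseHeaders();
        h.set("Content-Type", "text/html; charset=utf-8");

        // The length is taken from the encoded bytes, not from the string.
        t.sendResponseHeaders(200, body.length);
        OutputStream os = t.getResponseBody();
        os.write(body);
        os.close();
    }
}
```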
Consume the request body
Just as an HTTP response can contain a message body, so too can an HTTP request. This is presented to the handler object in the form of an InputStream, which can be obtained from the HttpExchange instance by calling its getRequestBody method.
It is not strictly necessary to do anything with this stream; however, the HttpServer documentation recommends that you should always fully consume the content of the message body. Here is an example for the case where you are not expecting a request body, and therefore wish to ignore any content that is presented:
    class MyHttpHandler implements HttpHandler {
        public void handle(HttpExchange t) throws IOException {
            // Consume request body: read one byte to detect end of stream,
            // then skip ahead in large steps until nothing is left.
            InputStream is = t.getRequestBody();
            while (is.read() != -1) {
                is.skip(0x10000);
            }
            is.close();

            // Send response.
            byte[] response = "Hello, World!\n".getBytes();
            t.sendResponseHeaders(200, response.length);
            OutputStream os = t.getResponseBody();
            os.write(response);
            os.close();
        }
    }
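If the request body is wanted rather than ignored, it can be read from the same stream in chunks. The helper below is a sketch of one way to do that; the class RequestBodyReader, the method readAll and the maxBytes limit are hypothetical, with the limit included so that an oversized request cannot exhaust memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class RequestBodyReader {
    // Read the whole stream into memory, refusing bodies larger than maxBytes.
    static byte[] readAll(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int count;
        while ((count = in.read(buffer)) != -1) {
            if (out.size() + count > maxBytes) {
                throw new IOException("Request body exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }
}
```

Within a handler this might be called as RequestBodyReader.readAll(t.getRequestBody(), 0x10000), which also satisfies the recommendation to consume the body in full when the limit is not exceeded.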
Stream the response body from a file
When serving static content it is often convenient to place the required data in a directory tree that mirrors the structure of the web site. Two points to bear in mind when implementing a handler to do this:
- It is very important that you sanitise the supplied pathname before serving the file, because otherwise the server will be vulnerable to path traversal (an attack method for gaining access to files outside the intended document root). One way to protect against path traversal is to canonicalise the pathname then check whether it contains the document root as a prefix.
- Reading the whole of each file into memory would not be scalable, so it is better to perform the copying in chunks.
Here is an example:
    public void handle(HttpExchange t) throws IOException {
        // The root must be a canonical path ending with a separator,
        // otherwise the prefix test below is unreliable.
        String root = "/var/www/";
        URI uri = t.getRequestURI();
        File file = new File(root + uri.getPath()).getCanonicalFile();
        if (!file.getPath().startsWith(root)) {
            // Suspected path traversal attack: reject with 403 error.
            byte[] response = "403 (Forbidden)\n".getBytes();
            t.sendResponseHeaders(403, response.length);
            OutputStream os = t.getResponseBody();
            os.write(response);
            os.close();
        } else if (!file.isFile()) {
            // Object does not exist or is not a file: reject with 404 error.
            byte[] response = "404 (Not Found)\n".getBytes();
            t.sendResponseHeaders(404, response.length);
            OutputStream os = t.getResponseBody();
            os.write(response);
            os.close();
        } else {
            // Object exists and is a file: accept with response code 200.
            // A length of 0 means the body length is not declared in advance.
            t.sendResponseHeaders(200, 0);
            OutputStream os = t.getResponseBody();
            FileInputStream fs = new FileInputStream(file);
            try {
                // Copy in chunks rather than loading the whole file.
                final byte[] buffer = new byte[0x10000];
                int count;
                while ((count = fs.read(buffer)) >= 0) {
                    os.write(buffer, 0, count);
                }
            } finally {
                // Release both streams even if the copy fails part way.
                fs.close();
                os.close();
            }
        }
    }
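The containment test can also be isolated into a helper that canonicalises the document root as well as the requested file, which removes the need for the root to be written in canonical form. The following is a sketch under that assumption; the class PathCheck and the method isInside are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

class PathCheck {
    // True if candidate, once canonicalised, is the root itself or lies beneath it.
    static boolean isInside(File root, File candidate) throws IOException {
        String rootPath = root.getCanonicalPath();
        String path = candidate.getCanonicalPath();
        // Compare against the root plus a separator, so that a sibling such as
        // /srv/www-evil is not mistaken for something inside /srv/www.
        String prefix = rootPath.endsWith(File.separator)
                ? rootPath
                : rootPath + File.separator;
        return path.equals(rootPath) || path.startsWith(prefix);
    }
}
```

Comparing against the root followed by a separator plays the same role as the trailing slash on the root in the handler above.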