Process an XML document using an XSLT stylesheet in Java

Tested with OpenJDK 6

Debian (Lenny, Squeeze)
Ubuntu (Hardy, Intrepid, Jaunty, Karmic, Lucid, Maverick, Natty, Oneiric, Precise, Quantal)

Tested with OpenJDK 7

Ubuntu (Oneiric, Precise, Quantal)

Objective

To process an XML document using an XSLT stylesheet from within a Java program

Scenario

Suppose that you have an XML document called input.xml that you wish to process using an XSLT stylesheet called style.xsl to produce a new XML document called output.xml.

(The method described here is not limited to documents held in files, but using files simplifies the task of providing a working example. Other input and output methods are considered later as variations.)

Method

Overview

The method described here has six steps:

  1. Obtain a TransformerFactory object.
  2. Present the stylesheet as a Source.
  3. Use the factory object to obtain a Transformer object for the given stylesheet.
  4. Present the input document as a Source.
  5. Provide a Result object to accept the output document.
  6. Use the Transformer object to process the input document.

Once created, the Transformer object can be used to process any number of documents by repeating steps 4 to 6. This can greatly improve processing efficiency if the stylesheet is large and used many times, because the stylesheet only needs to be read and prepared once.
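The reuse of a Transformer across several documents can be sketched as follows. This is a minimal illustration, not part of the worked example: the class name BatchTransform and the convention of writing each result next to its input with an added .out suffix are assumptions made for the demonstration.

```java
import java.io.File;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, used for illustration only.
class BatchTransform {
  // Applies one stylesheet to several input files, creating the Transformer only once.
  static void transformAll(File stylesheet, File[] inputs) throws TransformerException {
    // Steps 1 to 3 are performed once.
    TransformerFactory factory = TransformerFactory.newInstance();
    Source stylesheetSource = new StreamSource(stylesheet.getAbsoluteFile());
    Transformer transformer = factory.newTransformer(stylesheetSource);

    // Steps 4 to 6 are repeated for each input document.
    for (File input : inputs) {
      Source inputSource = new StreamSource(input.getAbsoluteFile());
      // Assumed naming convention: input.xml is written to input.xml.out
      File output = new File(input.getAbsolutePath() + ".out");
      Result outputResult = new StreamResult(output);
      transformer.transform(inputSource, outputResult);
    }
  }

  public static void main(String[] args) throws TransformerException {
    if (args.length < 2) {
      System.err.println("usage: java BatchTransform style.xsl input1.xml [input2.xml ...]");
      return;
    }
    File[] inputs = new File[args.length - 1];
    for (int i = 1; i < args.length; i++) {
      inputs[i - 1] = new File(args[i]);
    }
    transformAll(new File(args[0]), inputs);
  }
}
```

The loop processes the documents one after another. A single Transformer should not be used by several threads at the same time.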

The following imports are assumed:

import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;

Obtain a TransformerFactory object

The Transformer object that will be used to process the input document cannot be created directly, and must instead be created by an instance of the factory class TransformerFactory. This in turn must be created using the factory method TransformerFactory.newInstance:

TransformerFactory factory = TransformerFactory.newInstance();

Present the stylesheet as a Source

In order to avoid making assumptions about how or where the stylesheet is stored, the javax.xml.transform package uses an abstraction called a Source. This has implementations which are capable of accepting the stylesheet in a number of different forms, including an unparsed byte or character stream (StreamSource), a DOM tree (DOMSource) and a stream of SAX events (SAXSource).

In this instance the stylesheet exists as a file and is therefore in unparsed form. This makes it most readily presentable as a StreamSource:

String stylesheetPathname = "style.xsl";
Source stylesheetSource = new StreamSource(new File(stylesheetPathname).getAbsoluteFile());

(In this particular case it may be possible to pass the pathname directly into the StreamSource constructor, where it would be interpreted as a relative URL. However, this is not recommended for two reasons: firstly, the pathname could be misinterpreted if it contained any characters that have a special meaning within a URL; and secondly, the javax.xml.transform documentation does not appear to clearly specify how such URLs should be resolved.)
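The first of these concerns can be illustrated by inspecting the system identifier that a StreamSource records when it is constructed from a File. The sketch below is illustrative only: the class name SystemIdDemo and the deliberately awkward pathname are invented for the demonstration.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, used for illustration only.
class SystemIdDemo {
  // Returns the system identifier that StreamSource derives from a File.
  static String systemIdFor(String pathname) {
    return new StreamSource(new File(pathname).getAbsoluteFile()).getSystemId();
  }

  public static void main(String[] args) {
    // A pathname containing characters with a special meaning in a URL.
    String pathname = "my style#1.xsl";
    System.out.println("pathname:  " + pathname);
    System.out.println("system ID: " + systemIdFor(pathname));
  }
}
```

With the JDK's StreamSource the system identifier is expected to be a file: URL in which the space and the # have been percent-encoded. Had the raw pathname been passed as a string, the # could instead be read as the start of a fragment identifier.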

Use the factory object to obtain a Transformer object for the given stylesheet

Given a stylesheet Source and a TransformerFactory it is now possible to create the Transformer object:

Transformer transformer = factory.newTransformer(stylesheetSource);

Present the input document as a Source

Like the stylesheet, the input document must be presented as a Source:

String inputPathname = "input.xml";
Source inputSource = new StreamSource(new File(inputPathname).getAbsoluteFile());

Provide a Result object to accept the output document

The output document is represented by an abstraction called a Result. This is the opposite of a Source in that it specifies how and where the output document should be placed. Like a Source it has implementations which are capable of representing the document in a number of different forms, including an unparsed byte or character stream (StreamResult), a DOM tree (DOMResult) and a stream of SAX events (SAXResult).

In this instance the requirement is to write the output document to a file, which is most readily achieved using a StreamResult:

String outputPathname = "output.xml";
Result outputResult = new StreamResult(new File(outputPathname).getAbsoluteFile());

Use the Transformer object to process the input document

Given a Transformer (the stylesheet and XSLT processor), a Source (the input document) and a Result (the output document), it is now possible to perform the transformation:

transformer.transform(inputSource, outputResult);

If no errors are reported then the file output.xml should now contain the result of transforming input.xml using the stylesheet style.xsl.
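Errors are reported by throwing a TransformerException, either when the Transformer is created (if the stylesheet cannot be processed) or when transform is called. One way of handling this is sketched below. The class name CheckedTransform and the choice to print the message and return a boolean are assumptions made for the illustration; a real program might prefer to propagate the exception.

```java
import java.io.File;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, used for illustration only.
class CheckedTransform {
  // Returns true if the transformation completed, or false if an error was reported.
  static boolean run(File stylesheet, File input, File output) {
    try {
      TransformerFactory factory = TransformerFactory.newInstance();
      Source stylesheetSource = new StreamSource(stylesheet.getAbsoluteFile());
      // Throws TransformerConfigurationException (a subclass of
      // TransformerException) if the stylesheet cannot be processed.
      Transformer transformer = factory.newTransformer(stylesheetSource);
      Source inputSource = new StreamSource(input.getAbsoluteFile());
      Result outputResult = new StreamResult(output.getAbsoluteFile());
      // Throws TransformerException if the transformation fails.
      transformer.transform(inputSource, outputResult);
      return true;
    } catch (TransformerException e) {
      // Report the message together with the location, where one is known.
      System.err.println("Transformation failed: " + e.getMessageAndLocation());
      return false;
    }
  }

  public static void main(String[] args) {
    if (args.length != 3) {
      System.err.println("usage: java CheckedTransform style.xsl input.xml output.xml");
      return;
    }
    boolean ok = run(new File(args[0]), new File(args[1]), new File(args[2]));
    System.out.println(ok ? "done" : "failed");
  }
}
```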

Example program

Here is a complete example program which takes the names of the stylesheet, input document and output document as command line arguments:

import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;

class Transform {
  public static void main(String[] args) throws TransformerException {
    String stylesheetPathname = args[0];
    String inputPathname = args[1];
    String outputPathname = args[2];

    TransformerFactory factory = TransformerFactory.newInstance();
    Source stylesheetSource = new StreamSource(new File(stylesheetPathname).getAbsoluteFile());
    Transformer transformer = factory.newTransformer(stylesheetSource);
    Source inputSource = new StreamSource(new File(inputPathname).getAbsoluteFile());
    Result outputResult = new StreamResult(new File(outputPathname).getAbsoluteFile());
    transformer.transform(inputSource, outputResult);
  }
}

If written to a file named Transform.java the program can be compiled and invoked as follows:

javac Transform.java
java Transform style.xsl input.xml output.xml

Variations

Read the input from a string

If the input document (or stylesheet) exists in the form of a string containing an unparsed XML document then this can be converted to a StreamSource via the intermediate form of a StringReader:

String inputString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo/>";
Source inputSource = new StreamSource(new StringReader(inputString));

(The conversion could be performed via an InputStream, but only by encoding the document as a byte stream then decoding it back to a character stream. In addition to being somewhat inefficient this can corrupt the document if not done correctly.)

Write the output to a string

If the output document is wanted in the form of a string containing an unparsed XML document then that can be arranged by sending it via a StreamResult to a StringWriter:

Writer outputWriter = new StringWriter();
Result outputResult = new StreamResult(outputWriter);
transformer.transform(inputSource, outputResult);
String outputString = outputWriter.toString();

(The conversion could be performed via an OutputStream, but doing so has the same disadvantages as using an InputStream for reading.)
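The two string variations can be combined into a single helper that takes both the stylesheet and the input document as strings and returns the output as a string. The following is a minimal sketch: the class name StringTransform, the method name and the sample stylesheet are invented for the illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, used for illustration only.
class StringTransform {
  // Transforms an XML document held in a string, returning the result as a string.
  static String transform(String stylesheet, String input) throws TransformerException {
    TransformerFactory factory = TransformerFactory.newInstance();
    Source stylesheetSource = new StreamSource(new StringReader(stylesheet));
    Transformer transformer = factory.newTransformer(stylesheetSource);

    Source inputSource = new StreamSource(new StringReader(input));
    StringWriter outputWriter = new StringWriter();
    Result outputResult = new StreamResult(outputWriter);
    transformer.transform(inputSource, outputResult);
    return outputWriter.toString();
  }

  public static void main(String[] args) throws TransformerException {
    // Sample stylesheet (an assumption for this demo): wraps the text of <foo> in <bar>.
    String stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/foo'><bar><xsl:value-of select='.'/></bar></xsl:template>"
        + "</xsl:stylesheet>";
    System.out.println(transform(stylesheet, "<foo>hello</foo>"));
  }
}
```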

Read the input from a DOM tree

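If the input document (or stylesheet) exists in the form of a DOM tree then this can be presented to the Transformer object using a DOMSource. (This class belongs to the javax.xml.transform.dom package, which must be imported in addition to the packages listed previously.)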

Source inputSource = new DOMSource(inputNode);

You can optionally specify a second argument which is the base URI with respect to which any relative URIs should be resolved.
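A fuller sketch of this variation, including the optional base URI, is given below. The class name DomInput, the helper methods and the sample base URI are assumptions made for the illustration. The DOM tree is obtained here by parsing a string, but any org.w3c.dom.Node could be used. The parser is made namespace-aware because XSLT depends on namespace information being present in the tree.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class, used for illustration only.
class DomInput {
  // Parses a string into a DOM tree using a namespace-aware parser.
  static Document parse(String xml) throws Exception {
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(true);
    return builderFactory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
  }

  // Transforms a DOM tree, returning the output as a string.
  static String transform(String stylesheet, Node inputNode, String baseUri) throws Exception {
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
    // The second argument is the base URI for resolving relative URIs.
    Source inputSource = new DOMSource(inputNode, baseUri);
    StringWriter outputWriter = new StringWriter();
    transformer.transform(inputSource, new StreamResult(outputWriter));
    return outputWriter.toString();
  }

  public static void main(String[] args) throws Exception {
    // Sample stylesheet and base URI, both assumptions for this demo.
    String stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/foo'><bar><xsl:value-of select='.'/></bar></xsl:template>"
        + "</xsl:stylesheet>";
    Document inputDocument = parse("<foo>hello</foo>");
    System.out.println(transform(stylesheet, inputDocument, "file:///tmp/input.xml"));
  }
}
```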

Write the output to a DOM tree

If the output document is wanted in the form of a DOM tree then this can be arranged by sending it to a DOMResult:

DOMResult outputResult = new DOMResult();
transformer.transform(inputSource, outputResult);

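By default the DOMResult object creates a Document node to hold the output, which can be retrieved using getNode (Document and Element belong to the org.w3c.dom package, and DOMResult to javax.xml.transform.dom):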

Document outputDocument = (Document)outputResult.getNode();
Element outputRootElement = outputDocument.getDocumentElement();
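The fragments above can be assembled into a self-contained sketch as follows. The class name DomOutput, the method name and the sample stylesheet are invented for the illustration, and the input is read from a string purely to keep the example short.

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class, used for illustration only.
class DomOutput {
  // Transforms an XML document held in a string, returning the output as a DOM Document.
  static Document transformToDocument(String stylesheet, String input)
      throws TransformerException {
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
    Source inputSource = new StreamSource(new StringReader(input));
    // No node is supplied, so the DOMResult creates a Document to hold the output.
    DOMResult outputResult = new DOMResult();
    transformer.transform(inputSource, outputResult);
    return (Document) outputResult.getNode();
  }

  public static void main(String[] args) throws TransformerException {
    // Sample stylesheet (an assumption for this demo): wraps the text of <foo> in <bar>.
    String stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/foo'><bar><xsl:value-of select='.'/></bar></xsl:template>"
        + "</xsl:stylesheet>";
    Document outputDocument = transformToDocument(stylesheet, "<foo>hello</foo>");
    Element outputRootElement = outputDocument.getDocumentElement();
    System.out.println(outputRootElement.getTagName() + ": " + outputRootElement.getTextContent());
  }
}
```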

Tags: java | xslt