Generate multiple output documents using XSLT

Tested on

Debian (Lenny, Squeeze)
Ubuntu (Lucid)

Objective

To generate multiple output documents from one source document using XSLT

Scenario

Suppose that you have an XML document that you wish to convert to HTML in two different formats: as a single page containing the whole document, and as a set of pages with one page per section.

The following is an example of the XML document format to be processed:

<?xml version="1.0" encoding="UTF-8"?>
<document>
<title>This is an example document</title>
<section>
<title>This is section one</title>
<p>This is the content of section one.</p>
</section>
<section>
<title>This is section two</title>
<p>This is the content of section two.</p>
</section>
</document>

Method

In XSLT2, a new output document can be created using the xsl:result-document instruction:

<xsl:result-document href="foo.html">
 <!-- add instructions to generate document content here -->
</xsl:result-document>

The href attribute is the location to which the document should be written. In principle this could be any URI that the XSLT processor is capable of writing to, but you will most likely want to specify just a filename or a relative pathname (either of which qualifies as a relative URI). Where the output document is actually placed (if anywhere) is at the discretion of the XSLT processor. However, it is reasonable to assume that this can be configured in some way, and using a relative URI avoids hard-coding assumptions about the environment into the XSLT stylesheet.

In this instance the number of output documents to be generated depends on the number of sections in the source document. This can be arranged by wrapping the xsl:result-document instruction within an xsl:for-each loop:

<xsl:for-each select="section">
 <xsl:result-document href="section{position()}.html">
  <!-- add instructions to generate document content here -->
 </xsl:result-document>
</xsl:for-each>

Each output document must be written to a different URI, so a method for generating those URIs is needed. Here they have been numbered, so the first is named section1.html, the second section2.html and so on. Alternatives would be to derive the name from the section title, or to allow the author to specify it using an attribute of the section element.
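
As a minimal sketch of the attribute-based alternative, the following stylesheet assumes that the author may give a section an id attribute (this attribute is hypothetical and does not appear in the example document above). If the attribute is present it is used as the filename, otherwise the numbered name is used as a fallback. It is the author's responsibility to ensure that the resulting names are unique and safe to use as filenames:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:template match="/document">
 <xsl:for-each select="section">
  <!-- Assumption: sections may carry an author-supplied id attribute.
       Use it when present, otherwise fall back to section1, section2, ... -->
  <xsl:result-document href="{if (@id) then @id else concat('section', position())}.html">
   <html>
    <head>
     <title><xsl:value-of select="title"/></title>
    </head>
    <body>
     <h1><xsl:value-of select="title"/></h1>
     <!-- Paragraphs are copied through unchanged -->
     <xsl:copy-of select="p"/>
    </body>
   </html>
  </xsl:result-document>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With this sketch, a section written as <section id="intro"> would be written to intro.html, whereas a section without an id would keep its numbered name.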

To complete the example it must be made into a stylesheet and instructions must be added to generate the content of each output document:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<!-- Identity template: copy anything not matched by a more specific template -->
<xsl:template match="@*|node()">
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

<!-- Render each title as a heading -->
<xsl:template match="title">
 <h1><xsl:apply-templates/></h1>
</xsl:template>

<!-- Write one HTML document per section, named section1.html, section2.html, ... -->
<xsl:template match="/document">
 <xsl:for-each select="/document/section">
  <xsl:result-document href="section{position()}.html">
   <html>
    <head>
     <title><xsl:value-of select="title"/></title>
    </head>
    <!-- Process the children of the current section -->
    <body><xsl:apply-templates/></body>
   </html>
  </xsl:result-document>
 </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

The single-page variant of the document can either be produced using a separate stylesheet or by an additional xsl:apply-templates instruction within the above stylesheet.
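
The second option might look something like the sketch below, which writes the single-page variant to the principal output and the per-section pages using xsl:result-document. The template for section elements and the use of a div wrapper are assumptions made for illustration, as is the choice to leave all titles as h1 headings. Where the principal output ends up depends on how the processor is invoked, and it must not be written to the same URI as any of the per-section pages:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<!-- Identity template: copy anything not matched by a more specific template -->
<xsl:template match="@*|node()">
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

<xsl:template match="title">
 <h1><xsl:apply-templates/></h1>
</xsl:template>

<!-- Assumption: in the single page each section is wrapped in a div.
     This template is not used for the per-section pages, because there
     templates are applied to the children of the section only. -->
<xsl:template match="section">
 <div><xsl:apply-templates/></div>
</xsl:template>

<xsl:template match="/document">
 <!-- Single-page variant, written to the principal output -->
 <html>
  <head>
   <title><xsl:value-of select="title"/></title>
  </head>
  <body><xsl:apply-templates/></body>
 </html>

 <!-- Multi-page variant, one secondary output document per section -->
 <xsl:for-each select="section">
  <xsl:result-document href="section{position()}.html">
   <html>
    <head>
     <title><xsl:value-of select="title"/></title>
    </head>
    <body><xsl:apply-templates/></body>
   </html>
  </xsl:result-document>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```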

Note that the chosen XSLT processor must support XSLT2 (which Saxon-B does, but xsltproc, Xalan and Saxon-6 do not). In the case of Saxon-B it is necessary to use the -ext:on option to enable use of the xsl:result-document element (which is disabled by default for security reasons).

Testing

To apply the stylesheet, see Process an XML document using an XSLT stylesheet. Using the example input document listed above, the first of the two resulting output documents should have content equivalent to:

<html>
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>This is section one</title>
 </head>
 <body>
  <h1>This is section one</h1>
  <p>This is the content of section one.</p>
 </body>
</html>

and similarly for the second document.

Errors

xsl:result-document is disabled when extension functions are disabled

When using Saxon, the error message:

xsl:result-document is disabled when extension functions are disabled

indicates that the processor was invoked without the -ext:on option, which (as noted above) is needed to allow use of the xsl:result-document instruction.

Alternatives

Select a section of the document using a parameter

XSLT1 does not provide a method for generating multiple output documents from a single invocation of the XSLT processor, but it is possible to process the source document more than once and select a different part of it on each occasion. This is likely to be less convenient and less efficient than using xsl:result-document, but may be a useful workaround if there is a need to avoid using XSLT2.

One way to implement this would be to give each section of the document an id attribute, then pass a global parameter into the stylesheet to select the required id. Most command-line XSLT processors provide a means to do this. For xsltproc there is the --stringparam option:

xsltproc --stringparam sectionid section2 style.xsl input.xml

and for Xalan there is the -param option:

xalan -param sectionid section1 -xsl style.xsl -in input.xml

The parameter can then be used within an XPath expression to select the required part of the document:

<xsl:apply-templates select="section[@id=$sectionid]"/>
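
For the parameter to be usable it must be declared as a global xsl:param. A minimal XSLT1 sketch of a complete stylesheet is given below. It assumes that each section of the input document has been given an id attribute (for example <section id="section2">), which the example document listed earlier does not have:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="html"/>

<!-- Global parameter, set from the command line -->
<xsl:param name="sectionid"/>

<!-- Identity template: copy anything not matched by a more specific template -->
<xsl:template match="@*|node()">
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

<xsl:template match="title">
 <h1><xsl:apply-templates/></h1>
</xsl:template>

<xsl:template match="/document">
 <!-- Assumption: section ids are unique, so at most one section is selected -->
 <xsl:for-each select="section[@id=$sectionid]">
  <html>
   <head>
    <title><xsl:value-of select="title"/></title>
   </head>
   <body><xsl:apply-templates/></body>
  </html>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The processor would then be run once for each section, with a different value of sectionid and a different output file on each occasion. If no section has the requested id then nothing is selected and the output will be empty.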

Combine documents using XInclude

An alternative approach would be to place each section in a separate input document, then use XInclude to combine the sections if a single output document is needed:

<?xml version="1.0" encoding="UTF-8"?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
 <xi:include href="section1.xml"/>
 <xi:include href="section2.xml"/>
</document>
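
Each included file would then hold a single section as its document element. As an illustration, and assuming the filenames used above, section1.xml might contain:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<section>
 <title>This is section one</title>
 <p>This is the content of section one.</p>
</section>
```

XInclude processing is typically not performed unless requested. For xsltproc it can be enabled using the --xinclude option.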
