Generate multiple output documents using XSLT
Tested on: Debian (Lenny, Squeeze), Ubuntu (Lucid)
Objective
To generate multiple output documents from one source document using XSLT
Scenario
Suppose that you have an XML document that you wish to convert to HTML in two different formats:
- as a single page of HTML for the whole document, or
- as multiple HTML pages, one for each section of the document.
The following is an example of the XML document format to be processed:
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <title>This is an example document</title>
  <section>
    <title>This is section one</title>
    <p>This is the content of section one.</p>
  </section>
  <section>
    <title>This is section two</title>
    <p>This is the content of section two.</p>
  </section>
</document>
Method
In XSLT2, a new output document can be created using the xsl:result-document instruction:
<xsl:result-document href="foo.html">
  <!-- add instructions to generate document content here -->
</xsl:result-document>
The href attribute is the location to which the document should be written. In principle this could be any URI that the XSLT processor is capable of writing to, but you will most likely want to specify just a filename or a relative pathname (either of which qualifies as a relative URI). Where this actually causes the output document to be placed (if anywhere) is at the discretion of the XSLT processor. However, it is reasonable to assume that this can be configured in some way, and the use of a relative URI avoids hard-coding assumptions about the environment into the XSLT stylesheet.
In this instance the number of output documents to be generated depends on the number of sections in the source document. This can be arranged by wrapping the xsl:result-document instruction within an xsl:for-each loop:
<xsl:for-each select="section">
  <xsl:result-document href="section{position()}.html">
    <!-- add instructions to generate document content here -->
  </xsl:result-document>
</xsl:for-each>
Each output document must be written to a different URI, so a method for generating those URIs is needed. Here they have been numbered, so the first is named section1.html, the second section2.html and so on. Alternatives would be to derive the name from the section title, or to allow it to be specified by the author using an attribute of the section element.
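The two naming alternatives could be sketched as follows. This is only an illustration under stated assumptions: the filename attribute is hypothetical (the example document format does not define one), and the rule for turning a title into a name is just one possible choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:template match="/document">
    <xsl:for-each select="section">
      <!-- Use the hypothetical "filename" attribute if the author supplied
           one; otherwise derive a name from the title by lower-casing it
           and replacing each run of other characters with a hyphen. -->
      <xsl:variable name="basename"
          select="if (@filename) then string(@filename)
                  else replace(lower-case(normalize-space(title)),
                               '[^a-z0-9]+', '-')"/>
      <xsl:result-document href="{$basename}.html">
        <html>
          <head>
            <title><xsl:value-of select="title"/></title>
          </head>
          <body>
            <h1><xsl:value-of select="title"/></h1>
            <xsl:copy-of select="p"/>
          </body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the example document above, the first section would be written to this-is-section-one.html. Unlike numbering, neither alternative guarantees uniqueness: two sections with the same title (or the same attribute value) would map to the same URI, which is an error.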
To complete the example it must be made into a stylesheet and instructions must be added to generate the content of each output document:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <xsl:template match="/document">
    <xsl:for-each select="/document/section">
      <xsl:result-document href="section{position()}.html">
        <html>
          <head>
            <title><xsl:value-of select="title"/></title>
          </head>
          <body><xsl:apply-templates/></body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
The single-page variant of the document can either be produced using a separate stylesheet or by an additional xsl:apply-templates instruction within the above stylesheet.
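One way the second option might look is sketched below: the single page is written to the principal output, and the per-section pages are written as secondary output documents in the same run. The template that wraps each section in a div is an assumption made for this sketch, not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template: copy anything not handled below (e.g. <p>). -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Assumption: in the single page each section is wrapped in a <div>. -->
  <xsl:template match="section">
    <div><xsl:apply-templates/></div>
  </xsl:template>

  <xsl:template match="/document">
    <!-- Single-page variant, written to the principal output. -->
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body><xsl:apply-templates/></body>
    </html>

    <!-- Multi-page variant: one secondary output document per section. -->
    <xsl:for-each select="section">
      <xsl:result-document href="section{position()}.html">
        <html>
          <head>
            <title><xsl:value-of select="title"/></title>
          </head>
          <body><xsl:apply-templates/></body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Inside the xsl:for-each the context node is the section itself, so xsl:apply-templates there processes only its children and the div wrapper is not applied to the per-section pages.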
Note that the chosen XSLT processor must support XSLT2 (which Saxon-B does, but xsltproc, Xalan and Saxon-6 do not). In the case of Saxon-B it is necessary to use the -ext:on option to enable use of the xsl:result-document element (which is disabled by default for security reasons).
Testing
To apply the stylesheet see Process an XML document using an XSLT stylesheet. Using the example input document listed above, the first of the two resulting output documents should have content equivalent to:
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>This is section one</title>
  </head>
  <body>
    <h1>This is section one</h1>
    <p>This is the content of section one.</p>
  </body>
</html>
and similarly for the second document.
Errors
xsl:result-document is disabled when extension functions are disabled
When using Saxon, the error message:
xsl:result-document is disabled when extension functions are disabled
indicates that the processor was invoked without the -ext:on option, which (as noted above) is needed to allow use of the xsl:result-document instruction.
Alternatives
Select a section of the document using a parameter
XSLT1 does not provide a method for generating multiple output documents from a single invocation of the XSLT processor, but it is possible to process the source document more than once and select a different part of it on each occasion. This is likely to be less convenient and less efficient than using xsl:result-document, but may be a useful workaround if there is a need to avoid using XSLT2.
One way to implement this would be to give each section of the document an id attribute, then pass a global parameter into the stylesheet to select the required id. Most command-line XSLT processors provide a means to do this. For xsltproc there is the --stringparam option:
xsltproc --stringparam sectionid section2 style.xsl input.xml
and for Xalan there is the -param option:
xalan -param sectionid section1 -xsl style.xsl -in input.xml
The parameter can then be used within an XPath expression to select the required part of the document:
<xsl:apply-templates select="section[@id=$sectionid]"/>
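For context, a complete XSLT1 stylesheet built around that expression might look like the following minimal sketch. It assumes that each section element carries an id attribute such as section1, and the default parameter value and the layout of the generated page are choices made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Global parameter, normally set from the command line.
       The default value here is an assumption. -->
  <xsl:param name="sectionid" select="'section1'"/>

  <xsl:template match="/document">
    <!-- Process only the section whose id matches the parameter. -->
    <xsl:apply-templates select="section[@id=$sectionid]"/>
  </xsl:template>

  <xsl:template match="section">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:copy-of select="p"/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The processor would then be run once per section, with a different value of sectionid and a different output file each time.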
Combine documents using XInclude
An alternative approach would be to place each section in a separate input document then use XInclude to combine the sections together if a single output document is needed:
<?xml version="1.0" encoding="UTF-8"?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="section1.xml"/>
  <xi:include href="section2.xml"/>
</document>
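Each included file would then presumably hold a single section as its root element. As a sketch, assuming the same section format as the example document, section1.xml could contain:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<section>
  <title>This is section one</title>
  <p>This is the content of section one.</p>
</section>
```

The inclusions need to be resolved before the stylesheet sees the combined document. With xsltproc this can be requested using the --xinclude option; other processors may need to be configured differently.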
Further reading
- XSL Transformations (XSLT), Version 2.0, W3C, January 2007