Chapter 3 Customizing the stylesheets

In many circumstances, the stylesheets can be used “out of the box” without any customization. But sometimes you may need to change the formatting of certain elements. One common reason is to change the formatting of title pages or navigational features. In other cases, it may be to support local extensions to DocBook or simply to change the markup to support a particular use case.

Three approaches are possible, with increasing degrees of effort: changing stylesheet parameters, creating your own customization layer, or making broader changes to the stylesheet’s implementation.

Broader implementation changes are the subject of Chapter 5, Implementation details. In this chapter, we’ll look at the easier options.

3.1 Changing stylesheet parameters

The DocBook xslTNG Stylesheets define a lot of parameters. They are all described in I. Parameter reference. If the change you want to make has already been parameterized, you may be able to achieve your goal simply by setting a parameter at runtime.

For example, to change the formatting of dates and times in date elements, change the date and time formatting parameters. Similarly, to change the numeration style of ordered lists, change the ordered list item numeration parameter.

These changes can be accomplished by simply passing the new values to the processor, on the command line or in a configuration file, for example. You do not have to write any XSLT to make these changes.

Parameter values apply to the entire document processed by the stylesheets. In some cases, you may wish to change the presentation of just one or a small number of elements. This can often be accomplished with a db processing instruction in the source document itself. These customizations can also be accomplished without writing any XSLT.

If you want to make a change that isn’t supported by a parameter, or an ad hoc exception that doesn’t have a supporting processing instruction, you will have to write a customization layer. (You are invited to submit an issue with your use case if you think it would be of general interest.)

You may also find it convenient to write a customization layer if you want to change several parameters and you find it inconvenient to pass them all to the processor on every invocation.

3.2 Creating a customization layer

A customization layer is simply an XSLT stylesheet that you write which extends the DocBook stylesheets. The simplest customization layer is:

   |<?xml version="1.0" encoding="utf-8"?>
   |<xsl:stylesheet
   |    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   |    xmlns:db="http://docbook.org/ns/docbook"
   |    xmlns:xs="http://www.w3.org/2001/XMLSchema"
   |    xmlns="http://www.w3.org/1999/xhtml"
   |    exclude-result-prefixes="db xs"
   |    version="3.0">
   |
   |<!-- This href has to point to your local copy
   |     of the stylesheets. -->
   |<xsl:import href="docbook/xslt/docbook.xsl"/>
   |
   |</xsl:stylesheet>

This customization doesn’t do anything. But you can, for example, redefine parameters if you wish:

   |<?xml version="1.0" encoding="utf-8"?>
   |<xsl:stylesheet
   |    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   |    xmlns:db="http://docbook.org/ns/docbook"
   |    xmlns:xs="http://www.w3.org/2001/XMLSchema"
   |    xmlns="http://www.w3.org/1999/xhtml"
   |    exclude-result-prefixes="db xs"
   |    version="3.0">
   |
   |<xsl:import href="docbook/xslt/docbook.xsl"/>
   |
   |<xsl:param name="orderedlist-item-numeration"
   |           select="'1'"/>
   |
   |<xsl:param name="date-dateTime-format"
   |           select="'[D01] [MNn,*-3] [Y0001]
   |                   at [H01]:[m01]'"/>
   |
   |</xsl:stylesheet>

This will have the effect of running the DocBook stylesheets with those two parameters set as specified.

If you want to change the HTML output for an element, you can write a template for that element in your customization layer. Consider this DocBook document:

   |<?xml version="1.0" encoding="utf-8"?>
   |<article xmlns="http://docbook.org/ns/docbook"
   |         version="5.1">
   |<info>
   |<title>Sample Document</title>
   |<date>2020-07-05</date>
   |</info>
   |
   |<para>This is a sample <productname>DocBook</productname>
   |document.</para>
   |
   |</article>

Suppose that you decided you wanted to have the productname element link automatically to the vendor webpage.

Important

The DocBook xslTNG Stylesheets process all DocBook elements in the m:docbook mode. This is different from previous XSLT stylesheets for DocBook which simply used the default mode.

You must either specify a default mode in your customization layer or remember to specify the mode on match templates and template applications. If you forget the mode, you’ll get unexpected results!

One way to do that would be to redefine the template that processes the productname element:

   |<?xml version="1.0" encoding="utf-8"?>
   |<xsl:stylesheet
   |    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   |    xmlns:db="http://docbook.org/ns/docbook"
   |    xmlns:m="http://docbook.org/ns/docbook/modes"
   |    xmlns:xs="http://www.w3.org/2001/XMLSchema"
   |    xmlns="http://www.w3.org/1999/xhtml"
   |    exclude-result-prefixes="db m xs"
   |    version="3.0">
   |
   |<xsl:import href="docbook/xslt/docbook.xsl"/>
   |
   |<xsl:param name="orderedlist-item-numeration"
   |           select="'1'"/>
   |
   |<xsl:param name="date-dateTime-format"
   |           select="'[D01] [MNn,*-3] [Y0001]
   |                   at [H01]:[m01]'"/>
   |
   |<xsl:template match="db:productname"
   |              mode="m:docbook">
   |  <xsl:variable name="name"
   |                select="normalize-space(.)"/>
   |
   |  <xsl:variable name="url" as="xs:string?">
   |    <xsl:choose>
   |      <xsl:when test="$name='DocBook'">
   |        <xsl:sequence select="'https://docbook.org/'"/>
   |      </xsl:when>
   |      <xsl:when test="$name='DocBook xslTNG Stylesheets'">
   |        <xsl:sequence select="'https://xsltng.docbook.org/'"/>
   |      </xsl:when>
   |      <xsl:when test="$name='Wikipedia'">
   |        <xsl:sequence select="'https://wikipedia.org/'"/>
   |      </xsl:when>
   |      <xsl:otherwise>
   |        <!-- Unrecognized -->
   |      </xsl:otherwise>
   |    </xsl:choose>
   |  </xsl:variable>
   |
   |  <xsl:choose>
   |    <xsl:when test="empty($url)">
   |      <xsl:next-match/>
   |    </xsl:when>
   |    <xsl:otherwise>
   |      <a href="{$url}" title="Home page">
   |        <xsl:next-match/>
   |      </a>
   |    </xsl:otherwise>
   |  </xsl:choose>
   |</xsl:template>
   |
   |</xsl:stylesheet>

All of the DocBook elements are processed in the “m:docbook” mode.

Remember to exclude all the namespaces you declare so that they don’t wind up scattered about in your HTML.

I repeat, all of the DocBook elements are processed in the “m:docbook” mode. I expect that failure to declare this mode is going to be a common error.

Yes, this whole listing is rather cramped. I’m trying to make it all narrow enough to fit in the display without making horizontal scrolling necessary.

Calling xsl:next-match invokes the underlying processing. The effect of this template is to wrap an HTML “a” around the default processing for productname.

It’s worth pointing out that if the productname element has an xlink:href attribute, that will generate an HTML “a” as well. A more robust stylesheet would check for that, but I’m trying to keep the example simple.
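The Important note above offers a default mode as an alternative to repeating the mode on every template. The following is a minimal sketch of that alternative, assuming an XSLT 3.0 processor (default-mode is a standard XSLT 3.0 attribute) and the same import path as the earlier listings. It also skips productname elements that already carry an xlink:href attribute, which is one way to handle the robustness concern mentioned above. Only the “DocBook” product is handled, to keep the sketch short.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:m="http://docbook.org/ns/docbook/modes"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="db m xlink"
    default-mode="m:docbook"
    version="3.0">

<!-- This href has to point to your local copy
     of the stylesheets. -->
<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- No mode attribute here: default-mode places this
     template in the m:docbook mode. Elements that already
     have an xlink:href do not match and are left to the
     default processing. -->
<xsl:template match="db:productname[not(@xlink:href)]
                     [normalize-space(.) = 'DocBook']">
  <a href="https://docbook.org/" title="Home page">
    <xsl:next-match/>
  </a>
</xsl:template>

</xsl:stylesheet>
```

The trade-off is that every template and xsl:apply-templates in this file without an explicit mode now uses m:docbook, so templates intended for another mode must name it explicitly.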

3.3 Managing CSS stylesheets

The HTML that the DocBook xslTNG stylesheets produce is intended to be clean, robust markup for styling with CSS. Exactly how you control which stylesheet links are produced has changed several times. The current scheme is this:

  1. If syntax highlighting is enabled, a link to the $verbatim-syntax-highlight-css stylesheet is included.

  2. If $persistent-toc is true, a link to the $persistent-toc-css stylesheet is included.

  3. If $use-docbook-css is true, links to the standard DocBook stylesheets are included. Those stylesheets are docbook.css (for all media), docbook-screen.css (for screen media), and docbook-page-setup.css and docbook-paged.css (for print media).

  4. The DocBook element that is the context element when the HTML head is being generated is processed in the m:html-head-links mode. By default, that template does nothing, but you can change that in a customization layer.

  5. If any CSS stylesheets are defined in $user-css-links, they are included.

  6. The DocBook element that is the context element when the HTML head is being generated is processed in the m:html-head-last mode. By default, that template does nothing, but you can change that in a customization layer.
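As an illustration of step 4, a customization layer might add its own stylesheet link by supplying a template in the m:html-head-links mode. This is only a sketch: the match pattern “*” (any context element) and the file name css/house-style.css are assumptions made for the example, not names taken from the stylesheets.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://docbook.org/ns/docbook/modes"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="m"
    version="3.0">

<!-- This href has to point to your local copy
     of the stylesheets. -->
<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Emit one extra link after the standard DocBook CSS
     links. The href is a hypothetical local file. -->
<xsl:template match="*" mode="m:html-head-links">
  <link rel="stylesheet" href="css/house-style.css"/>
</xsl:template>

</xsl:stylesheet>
```

Because this link comes after the links from steps 1 to 3, rules in the hypothetical house-style.css would override the standard DocBook CSS where selectors have equal specificity.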

3.4 Customizing title pages

All of the titled elements (books, chapters, sections, etc.) have “title pages”. That is, they have a header element that contains the elements from the info that should be presented in the title header. In practice, info is a wrapper for general metadata about the element and often contains many elements that shouldn’t be presented.

There’s so much variation both in what goes in the info elements and in what users need to have in the title header that there’s no practical way to control it simply with stylesheet parameters.

Instead, the stylesheets offer two customization mechanisms: title page templates, and XSLT templates in the m:generate-titlepage mode. Start with the first: each header is formed from a title page template, and you can change that template in your customization layer.

For example, the default titlepage template for chapter headers is:

  |<header>
  |  <tmp:apply-templates select="db:title">
  |    <h2><tmp:content/></h2>
  |  </tmp:apply-templates>
  |  <tmp:apply-templates select="db:subtitle">
  |    <h3><tmp:content/></h3>
  |  </tmp:apply-templates>
  |</header>

The tmp:apply-templates elements aren’t as sophisticated as XSLT templates, but they let you select parts of the document. If nothing is selected, the content is ignored. When the title page template is evaluated, the context item is the info element.

Inside the tmp:apply-templates, you can decide what HTML markup should appear. The tmp:content element will be replaced by the result of processing the element or elements you selected.

As a consequence of that template, a chapter title page contains the chapter title in an h2 and the subtitle in an h3. No other elements in the info are presented.

Suppose you are writing a book where each chapter has a different author. You can add the authors to the chapter title page by updating the template in your customization layer. The stylesheets contain a $v:templates variable for this purpose. Any templates that you put inside it will be used before the default templates.

   |<xsl:variable name="v:templates" as="document-node()"
   |              xmlns:v="http://docbook.org/ns/docbook/variables">
   |  <xsl:document xmlns:tmp="http://docbook.org/ns/docbook/templates"
   |                xmlns:db="http://docbook.org/ns/docbook"
   |                xmlns="http://www.w3.org/1999/xhtml">
   |    <db:chapter context="parent::db:book">
   |      <header>
   |        <tmp:apply-templates select="db:title">
   |          <h2><tmp:content/></h2>
   |        </tmp:apply-templates>
   |        <tmp:apply-templates select="db:subtitle">
   |          <h3><tmp:content/></h3>
   |        </tmp:apply-templates>
   |        <tmp:apply-templates select="db:author">
   |          <h3 class="author">
   |            <tmp:content/>
   |          </h3>
   |        </tmp:apply-templates>
   |      </header>
   |    </db:chapter>
   |  </xsl:document>
   |</xsl:variable>

With that customization, a chapter title page contains the chapter title in an h2 and the subtitle and authors in h3 elements, in that order. If you change the order of the tmp:apply-templates elements, the order of the elements in the header will change.

Another common requirement is to put a graphic on the title page of a book. Here’s how you might do that. (This example appears in the test suite as test book.014.xml.) Add the cover to your info element:

   |<book xmlns="http://docbook.org/ns/docbook" version="5.2">
   |<info>
   |  <title>Unit Test: book.014</title>
   |  <cover>
   |    <mediaobject>
   |      <imageobject>
   |        <imagedata fileref="../media/yoyodyne.png"/>
   |      </imageobject>
   |    </mediaobject>
   |  </cover>
   |  <editor>
   |    <personname>
   |      <firstname>Norman</firstname>
   |      <surname>Walsh</surname>
   |    </personname>
   |    <email>ndw@nwalsh.com</email>
   |  </editor>
   |</info>
   |
   |</book>

Then decide how you want that graphic in the header with a new book template:

   |<xsl:variable name="v:templates" as="document-node()"
   |              xmlns:v="http://docbook.org/ns/docbook/variables">
   |  <xsl:document xmlns:tmp="http://docbook.org/ns/docbook/templates"
   |                xmlns:db="http://docbook.org/ns/docbook"
   |                xmlns="http://www.w3.org/1999/xhtml">
   |    <db:book>
   |      <header>
   |        <tmp:apply-templates select="db:cover/db:mediaobject">
   |          <div class="cover">
   |            <tmp:content/>
   |          </div>
   |        </tmp:apply-templates>
   |        <tmp:apply-templates select="db:title">
   |          <h1><tmp:content/></h1>
   |        </tmp:apply-templates>
   |        <tmp:apply-templates select="db:editor">
   |          <div class="editor">
   |            <h3><tmp:content/></h3>
   |          </div>
   |        </tmp:apply-templates>
   |      </header>
   |    </db:book>
   |  </xsl:document>
   |</xsl:variable>

This template will output the cover image, then the title, then the editor. If you want to update multiple templates, put them all in the same $v:templates variable as siblings.

Once you’ve output the elements in the header, you can use CSS to customize their appearance further.

You can get a long way just by updating the title page templates, but not always far enough. If you can’t achieve what you need by changing a template, you can take full control.

Each element generates its title page with the m:generate-titlepage mode. If you add a template in that mode to your customization layer, it has complete freedom to generate a custom title page.

Here, for example, is a book title page template:

  |<xsl:template match="db:book" mode="m:generate-titlepage">
  |  <header>
  |    <h1>
  |      <xsl:apply-templates select="db:info/db:title" mode="m:titlepage"/>
  |    </h1>
  |    <p>Hello, world</p>
  |  </header>
  |</xsl:template>

With this XSLT template in your customization layer, the book title page will consist of the title in an h1 and the phrase “Hello, world” in a paragraph. In this case, the context item for the XSLT template is the main element, not its info child. But it will always have an info child, even if your original document didn’t have a wrapper around the titlepage metadata.

3.5 Managing media

References to external media through imagedata, videodata, audiodata, and even textdata can be tricky to manage. On the one hand, it’s most convenient if the URIs in the source documents point to the actual media files. This allows extensions, like the image properties extension function, to access the files. At the same time, the references generated in the HTML have to point to the locations where they will be published. It is often, but not always, the case that the authoring structures and the publishing structures are the same.

The stylesheets are regularly tested against five possible arrangements: three where the media are stored in locations relative to the XML files and two where the media are stored in a separate hierarchy. These are unimaginatively named “mo-1”, “mo-2”, “mo-3”, “mo-4”, and “mo-5”. You can find them in the src/test/resources/xml hierarchy in the repository.

mo-1

All of the XML files are in a single directory and the media are in the same hierarchy. Media references in the source use relative URIs to refer to the underlying media: preface.xml refers to the “this is a test” audio clip as media/this-is-a-test.mp3.

mo-2

The XML files are in different directories (this changes the base URI of the media elements). The media are in the same hierarchy. Media references in the source use relative URIs to refer to the underlying media: front/preface.xml refers to the “this is a test” audio clip as ../media/spinning-top.mp4.

mo-3

The XML files are in different directories, but the structure is deeper. This scenario represents the case where there might be multiple books, each with their own media, but also a shared media folder “above” the book hierarchies. The media are in the same hierarchy, but some are “above” the book. Media references in the source use relative URIs to refer to the underlying media: book/front/preface.xml refers to the “this is a test” audio clip as ../../media/spinning-top.mp4.

mo-4

The XML files are still in different directories, but the significant change here is that the media are in their own hierarchy. Media references in the source use URIs relative to the root of that hierarchy: book/front/preface.xml refers to the “this is a test” audio clip as spinning-top.mp4.

mo-5

The XML files are in different directories and the media are in their own hierarchy. What’s different here is that the media hierarchy is further subdivided by media type. Media references in the source use URIs relative to the root of media hierarchy without the media type: book/front/preface.xml still refers to the “this is a test” audio clip as spinning-top.mp4, but this time it is found in media/mp4/spinning-top.mp4 rather than directly in media.

For each arrangement, we look at five possible output structures:

  1. A single HTML document with the media in the same relative locations as the sources.

  2. A single HTML document with the media in a single media subdirectory.

  3. “Chunked” HTML output with the media in the same relative locations as the sources.

  4. “Chunked” HTML output with the media in custom locations. (This is especially tricky for the “mo-5” case because there are two kinds of customization involved.)

  5. “Chunked” HTML output with the media in a single media subdirectory.

The list below gives a brief summary of the parameters used to achieve the desired results for each combination of input and output arrangements.

Note

Remember that in each case, the questions are: can the stylesheets find the media files to query them and are the correct HTML references produced? Actually copying the media files from where they are in the source system to where they need to be in the HTML is “not our problem.”

mo-1, mo-2, and mo-3 / scenario 1

No parameters are needed; this combination works correctly with the defaults.

mo-1, mo-2, and mo-3 / scenario 2
  |mediaobject-output-base-uri = "media/"
  |mediaobject-output-paths = "false"

The output base URI is relative to the “root” of the HTML result. Setting the output paths to “false” removes intermediate hierarchy from the image references.

mo-1, mo-2, and mo-3 / scenario 3
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"

These parameters aren’t related to media objects; they just tell the stylesheets how and where to “chunk” the output.

mo-1, mo-2, and mo-3 / scenario 4
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"

This combination is really the same as the previous except that it uses a custom stylesheet with a template in the m:mediaobject-output-adjust mode to add an extra level of hierarchy to the output URIs. This is just an example of arbitrary, custom processing.

mo-1, mo-2, and mo-3 / scenario 5
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"
  |mediaobject-output-base-uri = "media/"
  |mediaobject-output-paths = "false"

The output base URI is relative to the “root” of the HTML result. Setting the output paths to “false” removes intermediate hierarchy from the image references.

mo-4 / scenario 1
  |mediaobject-input-base-uri = "../media/"

The input base URI will be made absolute relative to the base URI of the input document, so it’s often convenient to specify it as a relative URI. It’s equally possible to specify it as an absolute URI.

mo-4 / scenario 2
  |mediaobject-input-base-uri = "../media/"
  |mediaobject-output-base-uri = "media/"
  |mediaobject-output-paths = "true"

This example has two images with the same name in different directories, so it’s necessary to preserve the output paths.

mo-4 / scenario 3
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"
  |mediaobject-input-base-uri = "../media/"

This is the combination of chunking and a single media directory.

mo-4 / scenario 4
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"
  |mediaobject-input-base-uri = "../media/"

This combination is really the same as the previous except that it uses a custom stylesheet with a template in the m:mediaobject-output-adjust mode to add an extra level of hierarchy to the output URIs. This is just an example of arbitrary, custom processing.

mo-4 / scenario 5
  |chunk = "index.html"
  |chunk-output-base-uri = "/path/to/output/location/"
  |mediaobject-input-base-uri = "../media/"
  |mediaobject-output-base-uri = "media/"
  |mediaobject-output-paths = "true"

This is effectively scenario 2 with chunking.

mo-5 / scenarios 1-5

The “mo-5” scenarios are all the same as the “mo-4” scenarios with the addition of one more parameter:

  |mediaobject-grouped-by-type = "true"

In each case, this adds the extra “media object type” level to the URI path.

If you download the source repository, you can see these combinations in action with the build targets named “mo_number_test_scenario”. For example, run:

  |./gradlew mo_3_test_2

to see the results of processing “mo-3” in scenario 2. The output will be in the build/actual directory. The build target all_mo_tests will run them all.
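Since a customization layer can hold parameter settings, the “mo-4 / scenario 5” combination listed above could be captured as follows. This is a sketch only: the parameter names and values are the ones given in that entry, /path/to/output/location/ remains a placeholder, and the import path assumes the same local layout as the earlier listings.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

<!-- This href has to point to your local copy
     of the stylesheets. -->
<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Chunking: how and where the output is split. -->
<xsl:param name="chunk" select="'index.html'"/>
<xsl:param name="chunk-output-base-uri"
           select="'/path/to/output/location/'"/>

<!-- Where the stylesheets find the media; resolved against
     the base URI of the input document. -->
<xsl:param name="mediaobject-input-base-uri"
           select="'../media/'"/>

<!-- Where the generated HTML points; relative to the root
     of the HTML result. Intermediate paths are kept. -->
<xsl:param name="mediaobject-output-base-uri"
           select="'media/'"/>
<xsl:param name="mediaobject-output-paths"
           select="'true'"/>

</xsl:stylesheet>
```

Copying the media files into the output location is still a separate step, as the note above points out.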

3.6 Controlling numeration

Numeration refers to the process(es) by which sets, books, divisions, components, sections, and formal objects are numbered. There are three separate aspects to numeration: what is numbered, where numbering begins, and whether a number inherits from its ancestors.

Consider this book:

   |<book>
   |  <title>Book title</title>
   |  <part>
   |    <title>Part title</title>
   |    <chapter>
   |      <title>Chapter title</title>
   |      <para></para>
   |    </chapter>
   |  </part>
   |  <part>
   |    <title>Another part title</title>
   |    <chapter>
   |      <title>Another chapter title</title>
   |      <para></para>
   |    </chapter>
   |    <chapter>
   |      <title>Yet another chapter title</title>
   |      <para></para>
   |    </chapter>
   |  </part>
   |</book>

Let’s suppose that parts are numbered “I” and “II”. (The number format is controlled by the localization; see Chapter 4, Localization.) If chapter numbering begins at the book level, those chapters will be numbered “1”, “2”, and “3”. If chapter numbering begins at the division level (the part), those chapters will be numbered “1”, “1”, and “2”. If division numbers are inherited, those numbers will be “I.1”, “II.1”, “II.2”.

In the 1.x versions of these stylesheets, all of the aspects of numeration were controlled by three now obsolete parameters: $component-numbers-inherit, $division-numbers-inherit, and $section-numbers-inherit. In the 2.x stylesheets, the various aspects can be controlled independently and the result is much more consistent, if a bit more complicated.

The default numeration parameters are designed to cover the most common use cases and are specified with strings so that they’re easy to control with parameters. Any numeration scheme can be implemented with a customization layer, but hopefully that will rarely be necessary.

To simplify the problem, we divide the DocBook elements into six categories:

sets

The set is the only member of this category.

books

The book is the only member of this category.

divisions

The division elements are part and reference.

components

The component elements are acknowledgements, appendix, article, bibliography, chapter, colophon, dedication, glossary, index, partintro, preface, refentry, and setindex.

sections

The section elements are section, sect1, sect2, sect3, sect4, sect5, simplesect. The refentry section elements are not included because they are not typically numbered.

formal objects

The formal objects are figure, table, example, equation, formalgroup, procedure.

There’s a bit of complexity here. A formalgroup that contains figures counts as a figure, a formalgroup that contains tables counts as a table, etc. An equation or procedure only counts as a formal object if it has a title.

Six parameters control where numbering starts (or restarts): $sets-number-from, $books-number-from, $divisions-number-from, $components-number-from, $sections-number-from, and $formal-objects-number-from. In each case, the value of the parameter must be the name of one of the categories. Sets and books can only number from sets, divisions can number from sets or books, components can number from sets, books, or divisions, etc. It is also possible to specify the value root to indicate that elements in the relevant category are numbered sequentially through the whole document.

To ensure consistency, “numbering from” resets when the specified category or one of its ancestors is encountered. In other words, if you’re formatting a set of books and numbering components from divisions, the numbering resets when a new division, book, or set begins.

Six parameters control how numbers are inherited: $sets-inherit-from, $books-inherit-from, $divisions-inherit-from, $components-inherit-from, $sections-inherit-from, and $formal-objects-inherit-from. Like the “number from” parameters, each parameter takes the value of the categories above it. In this case, however, you can specify more than one category.

For example, the default value for formal objects is to inherit from “component section”. That means that the first figure in chapter 2 will be labeled “2.1” and the first figure in the first section in chapter 2 will be labeled “2.1.1”, etc. This most closely reproduces the numbering from the 1.x stylesheets.
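As a sketch of how these parameters combine, the following customization layer would restart chapter numbering in each part and prefix each chapter number with the part number. It assumes that category names are written in the singular ('division'), as in the “component section” default quoted above, and that the import path points at your local copy of the stylesheets.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

<!-- This href has to point to your local copy
     of the stylesheets. -->
<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Components (chapters, appendixes, ...) restart
     numbering whenever a new division (part) begins.
     The value 'division' is an assumed spelling. -->
<xsl:param name="components-number-from"
           select="'division'"/>

<!-- Component labels are prefixed with the number of the
     enclosing division. -->
<xsl:param name="components-inherit-from"
           select="'division'"/>

</xsl:stylesheet>
```

Applied to the sample book at the start of this section, these settings correspond to the “I.1”, “II.1”, “II.2” outcome described there.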

3.6.1 Numeration overrides

Although the numeration parameters give you complete control over numeration, they aren’t simple to use. A few common cases can be handled with simpler settings: $division-numbers, $component-numbers, and $section-numbers. Each of these parameters is “true” by default and numeration of divisions, components, and sections is handled as described above. If these parameters are set to “false”, divisions, components, and sections, respectively, will not be numbered.

Numbering can also be controlled on a per-element basis with the db processing instruction. If the numbered pseudo-attribute is “true”, the division, component, or section in which that processing instruction occurs, and all of its descendants, will be numbered. If it’s “false”, the element and its descendants will not be numbered.

In this way, it would be possible to have a single chapter or article with numbered sections even in a book where sections are not normally numbered.
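For instance, with $section-numbers set to “false” for the book as a whole, a single chapter could opt back in with a processing instruction. The sketch below assumes that the db processing instruction is placed as a direct child of the element it applies to; the chapter and section titles are invented for the example.

```xml
<chapter xmlns="http://docbook.org/ns/docbook">
  <!-- Turn numbering on for this chapter and all of its
       descendants, overriding the document-wide setting. -->
  <?db numbered="true"?>
  <title>Reference tables</title>
  <section>
    <title>Units</title>
    <para>This section is numbered.</para>
  </section>
  <section>
    <title>Conversions</title>
    <para>So is this one.</para>
  </section>
</chapter>
```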

3.7 Using glossaries

There are essentially two ways to manage glossaries: you can author them by hand, or you can compose them automatically from a collection of glossary entries.

In a glossary authored by hand, no special processing takes place. The entries appear as they are listed and every entry appears whether there is a corresponding glossary reference in the document or not. The author is free to use glossdiv elements to divide the glossary into sections and the document may have multiple glossary elements.

If you compose them from glossary collections, only the terms used in your document (in glossterm or firstterm elements) will appear in the glossary. The glossary collections can be managed internally or externally. If multiple definitions appear in the glossary collections, only the first definition is included.

The best way to explain automatic glossaries is to use an example. Let's assume that you have marked the two terms Apple and Pear as glossterms in your document. Your automatic glossary should ultimately contain exactly two entries, one for each of those terms.

Create a glossary in your document and add auto to the role attribute on the glossary element. (If you’re using automatic glossaries, there should only be one glossary element in your document.) This is the internal glossary.

  • Even if your internal glossary has three entries, one each for Apple, Jackfruit and Pear, you will end up with a glossary in the generated document that has only two entries. There will be no entry for Jackfruit, since there is no corresponding glossterm or firstterm in the main part of your text.

  • You can also use external glossaries for this task. They can be identified by the $glossary-collection parameter or by a db processing instruction with a glossary-collection pseudo-attribute in the root element.

    If you use external glossaries, you can leave the internal, automatic glossary completely empty. As long as there are entries for Apple and Pear in one of your external glossaries, you will end up with those two entries in the generated glossary, even if the external glossaries contain many more terms.

  • You can use the internal, automatic glossary in conjunction with external glossaries. In this case, entries from the internal glossary take precedence over entries for the same term from external glossaries. Let’s say you have entries for Apple and Pear, among others, in your external glossary, and also a glossentry for Apple in your internal glossary. In this case you will end up with a glossary that contains two entries: one for Apple, with the definition taken from the internal glossary, and one for Pear, with the definition from the external glossary.

Entries will appear in the glossary in the same order as they appear in the internal and external glossaries unless they are sorted. Sorting is controlled by the $glossary-sort-entries parameter.

An automatic glossary may have glossary divisions. Those are controlled by the $glossary-automatic-divisions parameter.
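The Apple, Pear, and Jackfruit example above can be written out as a small document. This is a sketch with invented titles and definitions; the only parts taken from the description are the glossterm markup in the text and the auto value in the role attribute of the glossary.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.1">
<title>Fruit notes</title>

<para>An <glossterm>Apple</glossterm> usually keeps longer
than a <glossterm>Pear</glossterm>.</para>

<!-- The internal, automatic glossary. Only entries whose
     terms are used in the text should appear in the
     output, so Jackfruit is expected to be dropped. -->
<glossary role="auto">
  <glossentry>
    <glossterm>Apple</glossterm>
    <glossdef><para>A pome fruit.</para></glossdef>
  </glossentry>
  <glossentry>
    <glossterm>Jackfruit</glossterm>
    <glossdef><para>A large tropical fruit.</para></glossdef>
  </glossentry>
  <glossentry>
    <glossterm>Pear</glossterm>
    <glossdef><para>Another pome fruit.</para></glossdef>
  </glossentry>
</glossary>

</article>
```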

3.7.1 Using Schematron to manage the glossary

Schematron rules can help manage the glossary. The f:glossentries() function (defined in standalone-functions.xsl in the xslTNG install directory) has been designed so that it can be integrated into Schematron independently of the xslTNG stylesheets. You can use it to check whether a corresponding glossentry exists for a glossterm or firstterm while you are still writing. Corresponding Schematron schemas are not yet part of the xslTNG framework. Example 3.1, however, shows how you could use Schematron to check whether there is exactly one glossentry for the glossterm and firstterm elements in your document.

   |<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
   |
   |  <xsl:include xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   |    href="standalone-functions.xsl"/>
   |
   |  <ns uri="http://docbook.org/ns/docbook" prefix="db"/>
   |  <ns uri="http://docbook.org/ns/docbook/functions" prefix="f"/>
   |
   |  <pattern>
   |    <rule context="
   |        db:firstterm
   |        | db:glossterm[not(ancestor::db:glossary)]">
   |      <let name="term" value="((@baseform, .)[1])"/>
   |      <let name="n" value="count(f:glossentries(.))"/>
   |
   |      <report role="error" test="$n eq 0">No entry for 
   |        <value-of select="$term"/> in glossary.</report>
   |
   |      <report role="warning" test="$n gt 1"><value-of select="$n"/> 
   |        entries for <value-of select="$term"/> in glossary.</report>
   |
   |    </rule>
   |  </pattern>
   |
   |</schema>

The xsl:include element includes the standalone-functions.xsl file. Adjust the path in its href attribute to match your installation.

The second ns element declares the namespace of the xslTNG functions.

The let element named n uses the f:glossentries() function to count the matching glossentry elements for the given glossterm or firstterm.

Example 3.1 A Schematron schema to check glossary terms

If you want to use Schematron rules with external glossaries, it’s most convenient to use the db processing instruction to identify the external glossaries. The f:glossentries() function will load them automatically (see Example 3.2).

<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <?db glossary-collection='resources/glosscollection.xml' ?>
  <title>My document</title>

</article>

Example 3.2 Pass the glossary-collection parameter to Schematron

3.8 Using bibliographies

Bibliographies are more complicated than glossaries. Bibliography entries can be “cooked” (bibliomixed) or “raw” (biblioentry) and there’s no obvious way to sort bibliography entries in the general case.

There are also two different cross-referencing mechanisms for bibliographic entries: by ID, using biblioref or xref, or with a citation whose content matches an abbrev element in the bibliography entry.

Consider this example bibliography:

<bibliography>
<bibliomixed xml:id="bib.xml"><abbrev>XML</abbrev>Tim Bray,
Jean Paoli, </bibliomixed>

<biblioentry><abbrev>MalerElAndaloussi96</abbrev>
  <title>Developing SGML DTDs</title> </biblioentry>
</bibliography>

The first entry can be cited in two ways: <biblioref linkend="bib.xml"/> or <citation>XML</citation>. The second can only be cited with <citation>MalerElAndaloussi96</citation> because it has no xml:id to link to.
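
In running text, the two mechanisms might be used as in the following sketch. The surrounding sentences are invented for illustration; only the biblioref and citation elements come from the example bibliography above.

```xml
<para xmlns="http://docbook.org/ns/docbook">XML is defined in
<biblioref linkend="bib.xml"/>, which can equally be cited as
<citation>XML</citation>. For advice on DTD design, see
<citation>MalerElAndaloussi96</citation>.</para>
```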

Taking all of these variations into account, there are three ways to construct bibliographies:

  1. Entirely by hand.

  2. Managed by hand, with empty elements as placeholders.

  3. Managed automatically.

There are tradeoffs to each approach.

When external bibliographies are used, they can be identified either with the $bibliography-collection parameter or with db processing instructions with a bibliography-collection pseudo-attribute in the root element.
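
As a sketch of the processing instruction form, modeled on the glossary-collection example, an external bibliography might be identified like this. The path resources/bibcollection.xml is hypothetical; point it at your own external bibliography.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <!-- Hypothetical path to an external bibliography. -->
  <?db bibliography-collection='resources/bibcollection.xml' ?>
  <title>My document</title>
  <para>See <citation>XML</citation>.</para>
</article>
```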

3.8.1 Entirely by hand

In a bibliography constructed entirely by hand, no special processing takes place. The entries appear as they are listed and every entry appears whether it is cited or not. The author is free to use bibliodiv elements to divide the bibliography into sections and the document may have multiple bibliography elements.
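
For instance, the example bibliography could be divided into sections by hand as sketched below. The division titles are invented for illustration.

```xml
<bibliography xmlns="http://docbook.org/ns/docbook">
  <bibliodiv>
    <!-- Illustrative division title. -->
    <title>Standards</title>
    <bibliomixed xml:id="bib.xml"><abbrev>XML</abbrev>Tim Bray,
    Jean Paoli, </bibliomixed>
  </bibliodiv>
  <bibliodiv>
    <!-- Illustrative division title. -->
    <title>Books</title>
    <biblioentry><abbrev>MalerElAndaloussi96</abbrev>
      <title>Developing SGML DTDs</title></biblioentry>
  </bibliodiv>
</bibliography>
```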

3.8.2 Managed by hand

Bibliographic entries can be complex and may be shared across multiple documents. One approach to managing this complexity is to store the full entries in an external bibliography and use only placeholders in your actual document.

A placeholder is an empty entry with an xml:id, or an entry that contains only an abbrev. If an entry has an xml:id and contains only an abbrev, the ID is used to search for a matching entry in the external bibliography. The placeholder is replaced by the full entry from the external bibliography when the document is formatted.

For example, if the full entries are available externally, the preceding bibliography example could be shortened to:

<bibliography>
<bibliomixed xml:id="bib.xml"/>
<biblioentry><abbrev>MalerElAndaloussi96</abbrev>
</biblioentry>
</bibliography>

The entries appear in the order listed and every entry appears whether it is cited or not. The author is free to use bibliodiv elements to divide the bibliography into sections and the document may have multiple bibliography elements.

3.8.3 Automatic

An automatic bibliography is selected by using the token auto in the role attribute on a bibliography. Placeholders can still be used, but they are unnecessary when citations are made with citation elements.

Any citation elements that appear in the text will be matched to entries in the external bibliographies, and those entries will be included automatically. Automatically added entries appear at the end of the bibliography (after any internal ones, if the internal bibliography has entries), in the order that they appear in the external bibliographies. If multiple external entries match, only the first is added.

Any bibliography entries in the internal bibliography that aren’t cited will be removed.

When using the automatic style, there should be only one bibliography in the document and it cannot contain divisions.
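
Putting these rules together, a document with an automatic bibliography might look like the following sketch. It assumes that a hypothetical external bibliography at resources/bibcollection.xml contains an entry whose abbrev is MalerElAndaloussi96. The internal XML entry is kept because it is cited; it would be removed if the citation were deleted.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <!-- Hypothetical path to an external bibliography. -->
  <?db bibliography-collection='resources/bibcollection.xml' ?>
  <title>My document</title>

  <para>See <citation>XML</citation> and
  <citation>MalerElAndaloussi96</citation>.</para>

  <!-- The token "auto" in role selects the automatic style. -->
  <bibliography role="auto">
    <!-- Internal entry; cited above, so it is retained. -->
    <bibliomixed xml:id="bib.xml"><abbrev>XML</abbrev>Tim Bray,
    Jean Paoli, </bibliomixed>
    <!-- The MalerElAndaloussi96 entry is expected to be added here,
         after the internal entry, from the external bibliography. -->
  </bibliography>
</article>
```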

3.9 Creating something completely different

Your input documents go through several pre-processing steps before they are rendered into HTML. If you want to produce completely different outputs, the place to start is with the root template in the m:docbook mode.

Consider, for example, the task of creating a JSON version of the Table of Contents. In principle, you could write your own stylesheet to do this, but leveraging the DocBook xslTNG Stylesheets means you can make use of functions like f:generate-id() to create links.

To produce completely different results, override the root template in the m:docbook mode:

<xsl:template match="/" mode="m:docbook">
  <xsl:document>
    <!-- your processing here -->
  </xsl:document>
</xsl:template>

This template must return a document node.

Note that you can mix-and-match your processing with default processing by processing DocBook elements in the m:docbook mode.
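
As an untested sketch of mixing the two, the following override wraps the default rendering of every abstract in a custom div. The class name is invented for illustration, the import path must point to your local copy of the stylesheets, and it is an assumption that the default abstract templates work correctly when called from an overridden root template.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns:m="http://docbook.org/ns/docbook/modes"
                xmlns="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="db m"
                version="3.0">

  <!-- Assumption: adjust this href to point to your local copy
       of the stylesheets. -->
  <xsl:import href="docbook/xslt/docbook.xsl"/>

  <!-- Replace the default root processing; the template still
       returns a document node. -->
  <xsl:template match="/" mode="m:docbook">
    <xsl:document>
      <!-- Custom wrapper; the class name is illustrative. -->
      <div class="abstracts">
        <!-- Default processing for each abstract in the document. -->
        <xsl:apply-templates select="//db:abstract" mode="m:docbook"/>
      </div>
    </xsl:document>
  </xsl:template>

</xsl:stylesheet>
```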

Here is a simple example of a stylesheet that produces a JSON version of the Table of Contents for a DocBook document:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns:f="http://docbook.org/ns/docbook/functions"
                xmlns:m="http://docbook.org/ns/docbook/modes"
                xmlns:t="http://docbook.org/ns/docbook/templates"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="db f m t xs"
                version="3.0">

  <!-- This href has to point to your local copy
       of the stylesheets. -->
  <xsl:import href="docbook/xslt/docbook.xsl"/>

  <xsl:output method="text"/>

  <!-- Suppress xslTNG's default HTML output; note that this template
       must return a document node.  -->
  <xsl:template match="/" mode="m:docbook">
    <xsl:document>
      <xsl:apply-templates select="." mode="TOC"/>
    </xsl:document>
  </xsl:template>

  <!-- The templates below generate a simple JSON ToC. -->

  <xsl:template match="/" mode="TOC">
    {"toc": [
    <xsl:apply-templates mode="TOC"/>
    ]}
  </xsl:template>

  <xsl:template match="db:part|db:article|db:section|db:chapter" mode="TOC"
                expand-text="yes">
    <xsl:if test="preceding-sibling::db:part
                  | preceding-sibling::db:article
                  | preceding-sibling::db:section
                  | preceding-sibling::db:chapter">,&#10;</xsl:if>
    {{
    "ref": "{f:generate-id(.)}",
    "title": "{normalize-space(db:info/db:title)}",
    "subtitle": "{normalize-space(db:info/db:subtitle)}",
    "items": [
    <xsl:apply-templates select="db:part|db:article|db:section|db:chapter" mode="TOC"/>
    ]
    }}
  </xsl:template>

  <xsl:template match="*" mode="TOC">
    <xsl:apply-templates select="*" mode="TOC"/>
  </xsl:template>
</xsl:stylesheet>

Note

This example is meant as a starting point; it’s not robust as it only handles a few of the possible elements that might appear in a Table of Contents.

When processing documents this way, be aware that you are transforming the pre-processed, normalized versions of your input documents. For example, whether or not you put info wrappers around the titles of your sections, in the pre-processed input, titles always appear inside info wrappers. This normalization greatly simplifies processing in many places.
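
The effect on titles can be pictured with the following sketch. The section content is invented for illustration, and other normalizations that pre-processing may apply are not shown.

```xml
<!-- As authored: the title is a direct child of the section. -->
<section xmlns="http://docbook.org/ns/docbook">
  <title>Installation</title>
  <para>...</para>
</section>

<!-- As seen by your templates after pre-processing:
     the title sits inside an info wrapper. -->
<section xmlns="http://docbook.org/ns/docbook">
  <info>
    <title>Installation</title>
  </info>
  <para>...</para>
</section>
```

This is why the JSON example can select db:info/db:title unconditionally: the expression matches whichever form the author used.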


OK, technically, this stylesheet declares a couple of namespaces that aren’t strictly necessary, so it could be a teeny bit simpler, but you’ll need those declarations (and more!) if you want to do anything useful.