Chapter 3. Customizing the stylesheets
In many circumstances, the stylesheets can be used “out of the box” without any customization. But sometimes you may need to change the formatting of certain elements. One common reason is to change the formatting of title pages or navigational features. In other cases, it may be to support local extensions to DocBook or simply to change the markup to support a particular use case.
Three approaches are possible, with increasing degrees of effort: changing stylesheet parameters, creating your own customization layer, or making broader changes to the stylesheet’s implementation.
Broader implementation changes are covered in Chapter 5, Implementation details. In this chapter, we’ll look at the easier options.
3.1. Changing stylesheet parameters
The DocBook xslTNG Stylesheets define a lot of parameters. They are all described in I. Parameter reference. If the change you want to make has already been parameterized, you may be able to achieve your goal simply by setting a parameter at runtime.
For example, if you want to change the formatting of dates and times in date elements, you can simply change the date and time formatting parameters. Similarly, if you want to change the numeration style of ordered lists, you can simply change the ordered list item numeration parameter.
These changes can be accomplished by simply passing the new values to the processor, on the command line or in a configuration file, for example. You do not have to write any XSLT to make these changes.
Parameter values apply to the entire document processed by the stylesheets. In some cases, you may wish to change the presentation of just one or a small number of elements. This can often be accomplished with a db processing instruction in the source document itself. These customizations can also be accomplished without writing any XSLT.
If you want to make a change that isn’t supported by a parameter, or an ad hoc exception that doesn’t have a supporting processing instruction, you will have to write a customization layer. (You are invited to submit an issue with your use case if you think it would be of general interest.)
You may also find it convenient to write a customization layer if you want to change several parameters and you find it inconvenient to pass them all to the processor on every invocation.
3.2. Creating a customization layer
A customization layer is simply an XSLT stylesheet that you write which extends the DocBook stylesheets. The simplest customization layer is:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="db xs"
    version="3.0">

<!-- This href has to point to your local copy
     of the stylesheets. -->
<xsl:import href="docbook/xslt/docbook.xsl"/>

</xsl:stylesheet>
This customization doesn’t do anything. But you can, for example, redefine parameters if you wish:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="db xs"
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<xsl:param name="orderedlist-item-numeration"
           select="'1'"/>

<xsl:param name="date-dateTime-format"
           select="'[D01] [MNn,*-3] [Y0001]
                    at [H01]:[m01]'"/>

</xsl:stylesheet>
This will have the effect of running the DocBook stylesheets with those two parameters set as specified.
If you want to change the HTML output for an element, you can write a template for that element in your customization layer. Consider this DocBook document:
<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook"
         version="5.1">
<info>
<title>Sample Document</title>
<date>2020-07-05</date>
</info>

<para>This is a sample <productname>DocBook</productname>
document.</para>

</article>
Suppose that you decided you wanted to have the productname element link automatically to the vendor webpage.
The DocBook xslTNG Stylesheets process all DocBook elements in the m:docbook mode. This is different from previous XSLT stylesheets for DocBook which simply used the default mode.
You must either specify a default mode in your customization layer or remember to specify the mode on match templates and template applications. If you forget the mode, you’ll get unexpected results!
One way to add the link would be to redefine the template that processes the productname element:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:m="http://docbook.org/ns/docbook/modes" ①
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="db m xs" ②
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<xsl:param name="orderedlist-item-numeration"
           select="'1'"/>

<xsl:param name="date-dateTime-format"
           select="'[D01] [MNn,*-3] [Y0001]
                    at [H01]:[m01]'"/>

<xsl:template match="db:productname"
              mode="m:docbook"> ③
  <xsl:variable name="name"
                select="normalize-space(.)"/>
④
  <xsl:variable name="url" as="xs:string?">
    <xsl:choose>
      <xsl:when test="$name='DocBook'">
        <xsl:sequence select="'https://docbook.org/'"/>
      </xsl:when>
      <xsl:when test="$name='DocBook xslTNG Stylesheets'">
        <xsl:sequence select="'https://xsltng.docbook.org/'"/>
      </xsl:when>
      <xsl:when test="$name='Wikipedia'">
        <xsl:sequence select="'https://wikipedia.org/'"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Unrecognized -->
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <xsl:choose>
    <xsl:when test="empty($url)">
      <xsl:next-match/> ⑤
    </xsl:when>
    <xsl:otherwise>
      <a href="{$url}" title="Home page">
        <xsl:next-match/> ⑤
      </a>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>
- ①
All of the DocBook elements are processed in the “m:docbook” mode.
- ②
Remember to exclude all the namespaces you declare so that they don’t wind up scattered about in your HTML.
- ③
I repeat, all of the DocBook elements are processed in the “m:docbook” mode. I expect that failure to declare this mode is going to be a common error.
- ④
Yes, this whole listing is rather cramped. I’m trying to make it all narrow enough to fit in the display without making horizontal scrolling necessary.
- ⑤
Calling xsl:next-match invokes the underlying processing. The effect of this template is to wrap an HTML “a” around the default processing for productname.
It’s worth pointing out that if the element has an xlink:href attribute, that will generate an HTML a as well. A more robust stylesheet would check for that, but I’m trying to keep the example simple.
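The other option mentioned above, specifying a default mode for the whole customization layer, can be sketched as follows. This is a minimal sketch that assumes your XSLT 3.0 processor honors the standard default-mode attribute; the match pattern and link target are illustrative only.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:m="http://docbook.org/ns/docbook/modes"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="db m"
    default-mode="m:docbook"
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- No mode attribute here: default-mode on the stylesheet element
     places this template in the m:docbook mode. -->
<xsl:template match="db:productname[normalize-space(.) = 'DocBook']">
  <a href="https://docbook.org/" title="Home page">
    <xsl:next-match/>
  </a>
</xsl:template>

</xsl:stylesheet>
```

With this approach, any xsl:apply-templates in the same stylesheet that omits its mode also uses m:docbook, so templates intended for other modes must name their mode explicitly.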
3.3. Managing CSS stylesheets
The HTML that the DocBook xslTNG stylesheets produce is intended to be clean, robust markup for styling with CSS. Exactly how you control which stylesheet links are produced has changed several times. The current scheme is this:
- If syntax highlighting is enabled, a link to the $verbatim-syntax-highlight-css stylesheet is included.
- If $persistent-toc is true, a link to the $persistent-toc-css stylesheet is included.
- If $use-docbook-css is true, links to the standard DocBook stylesheets are included. Those stylesheets are docbook.css (for all media), docbook-screen.css (for screen media), and docbook-page-setup.css and docbook-paged.css (for print media).
- The DocBook element that is the context element when the HTML head is being generated is processed in the m:html-head-links mode. By default, that template does nothing, but you can change that in a customization layer.
- If any CSS stylesheets are defined in $user-css-links, they are included.
- The DocBook element that is the context element when the HTML head is being generated is processed in the m:html-head-last mode. By default, that template does nothing, but you can change that in a customization layer.
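As an illustration of the m:html-head-links hook, here is a minimal sketch of a customization layer that adds one extra stylesheet link. The path css/custom.css is hypothetical, and matching on "*" is an assumption made so that the template applies whatever the context element happens to be.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://docbook.org/ns/docbook/modes"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="m"
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Replace the do-nothing default with a template that emits
     one extra link; css/custom.css is a hypothetical path. -->
<xsl:template match="*" mode="m:html-head-links">
  <link rel="stylesheet" href="css/custom.css"/>
</xsl:template>

</xsl:stylesheet>
```

A template in the m:html-head-last mode would follow the same pattern and is the natural place for anything that must come at the very end of the head.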
3.4. Customizing title pages
All of the titled elements (books, chapters, sections, etc.) have “title pages”. That is, they have a header element that contains the elements from the info that should be presented in the title header. In practice, info is a wrapper for general metadata about the element and often contains many elements that shouldn’t be presented.
There’s so much variation both in what goes in the info elements and in what users need to have in the title header, that there’s no practical way to control it simply with stylesheet parameters.
Instead the stylesheets offer two customization mechanisms: first, each header is formed from a header template. You can change the titlepage template in your customization layer.
For example, the default titlepage template for chapter headers is:
<header>
  <tmp:apply-templates select="db:title">
    <h2><tmp:content/></h2>
  </tmp:apply-templates>
  <tmp:apply-templates select="db:subtitle">
    <h3><tmp:content/></h3>
  </tmp:apply-templates>
</header>
The tmp:apply-templates elements aren’t as sophisticated as XSLT templates, but they let you select parts of the document. If nothing is selected, the content is ignored. When the title page template is evaluated, the context item is the info element.
Inside the tmp:apply-templates, you can decide what HTML markup should appear. The tmp:content element will be replaced by the result of processing the element or elements you selected.
As a consequence of that template, a chapter title page contains the chapter title in an h2 and the subtitle in an h3. No other elements in the info are presented.
Suppose you are writing a book where each chapter has a different author. You can add the authors to the chapter title page by updating the template in your customization layer. The stylesheets contain a $v:templates variable for this purpose. Any templates that you put inside it will be used before the default templates.
<xsl:variable name="v:templates" as="document-node()"
              xmlns:v="http://docbook.org/ns/docbook/variables">
  <xsl:document xmlns:tmp="http://docbook.org/ns/docbook/templates"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns="http://www.w3.org/1999/xhtml">
    <db:chapter context="parent::db:book">
      <header>
        <tmp:apply-templates select="db:title">
          <h2><tmp:content/></h2>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:subtitle">
          <h3><tmp:content/></h3>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:author">
          <h3 class="author">
            <tmp:content/>
          </h3>
        </tmp:apply-templates>
      </header>
    </db:chapter>
  </xsl:document>
</xsl:variable>
With that customization, a chapter title page contains the chapter title in an h2 and the subtitle and authors in h3 elements, in that order. If you change the order of the tmp:apply-templates elements, the order of the elements in the header will change.
Another common requirement is to put a graphic on the title page of a book. Here’s how you might do that. (This example appears in the test suite as test book.014.xml.) Add the cover to your info element:
<book xmlns="http://docbook.org/ns/docbook" version="5.2">
<info>
  <title>Unit Test: book.014</title>
  <cover>
    <mediaobject>
      <imageobject>
        <imagedata fileref="../media/yoyodyne.png"/>
      </imageobject>
    </mediaobject>
  </cover>
  <editor>
    <personname>
      <firstname>Norman</firstname>
      <surname>Walsh</surname>
    </personname>
    <email>ndw@nwalsh.com</email>
  </editor>
</info>
…
</book>
Then decide how you want that graphic in the header with a new book template:
<xsl:variable name="v:templates" as="document-node()"
              xmlns:v="http://docbook.org/ns/docbook/variables">
  <xsl:document xmlns:tmp="http://docbook.org/ns/docbook/templates"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns="http://www.w3.org/1999/xhtml">
    <db:book>
      <header>
        <tmp:apply-templates select="db:cover/db:mediaobject">
          <div class="cover">
            <tmp:content/>
          </div>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:title">
          <h1><tmp:content/></h1>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:editor">
          <div class="editor">
            <h3><tmp:content/></h3>
          </div>
        </tmp:apply-templates>
      </header>
    </db:book>
  </xsl:document>
</xsl:variable>
This template will output the cover image, then the title, then the editor.
If you want to update multiple templates, put them all in the same $v:templates variable as siblings inside its xsl:document.
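For instance, the book and chapter templates shown earlier can be combined as follows. This sketch only rearranges the two preceding examples; nothing in it is new apart from placing db:book and db:chapter side by side.

```xml
<xsl:variable name="v:templates" as="document-node()"
              xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
              xmlns:v="http://docbook.org/ns/docbook/variables">
  <xsl:document xmlns:tmp="http://docbook.org/ns/docbook/templates"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns="http://www.w3.org/1999/xhtml">

    <!-- Title page template for books -->
    <db:book>
      <header>
        <tmp:apply-templates select="db:cover/db:mediaobject">
          <div class="cover"><tmp:content/></div>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:title">
          <h1><tmp:content/></h1>
        </tmp:apply-templates>
      </header>
    </db:book>

    <!-- Sibling title page template for chapters in a book -->
    <db:chapter context="parent::db:book">
      <header>
        <tmp:apply-templates select="db:title">
          <h2><tmp:content/></h2>
        </tmp:apply-templates>
        <tmp:apply-templates select="db:author">
          <h3 class="author"><tmp:content/></h3>
        </tmp:apply-templates>
      </header>
    </db:chapter>

  </xsl:document>
</xsl:variable>
```

The xmlns:xsl declaration appears on the variable only so that the fragment stands alone; inside a customization layer it is inherited from xsl:stylesheet.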
Once you’ve output the elements in the header, you can use CSS to customize their appearance further.
You can get a long way just by updating the title page templates, but not always far enough. If you can’t achieve what you need by changing a template, you can take full control.
Each element generates its title page with the m:generate-titlepage mode. If you add a template in that mode to your customization layer, it has complete freedom to generate a custom title page.
Here, for example, is a book title page template:
<xsl:template match="db:book" mode="m:generate-titlepage">
  <header>
    <h1>
      <xsl:apply-templates select="db:info/db:title" mode="m:titlepage"/>
    </h1>
    <p>Hello, world</p>
  </header>
</xsl:template>
With this XSLT template in your customization layer, the book title page will consist of the title in an h1 and the phrase “Hello, world” in a paragraph.
In this case, the context item for the XSLT template is the main element, not its info child. But it will always have an info child, even if your original document didn’t have a wrapper around the titlepage metadata.
3.5. Managing media
References to external media through imagedata, videodata, audiodata, and even textdata can be tricky to manage. On the one hand, it’s most convenient if the URIs in the source documents point to the actual media files. This allows extensions, like the image properties extension function, to access the files. At the same time, the references generated in the HTML have to point to the locations where they will be published. It is often, but not always, the case that the authoring structures and the publishing structures are the same.
The stylesheets are regularly tested against five possible arrangements: three where the media are stored in locations relative to the XML files and two where the media are stored in a separate hierarchy. These are unimaginatively named “mo-1”, “mo-2”, “mo-3”, “mo-4”, and “mo-5”. You can find them in the src/test/resources/xml hierarchy in the repository.
- mo-1
All of the XML files are in a single directory; the media are in the same hierarchy. Media references in the source use relative URIs to refer to the underlying media: preface.xml refers to the “this is a test” audio clip as media/this-is-a-test.mp3.
- mo-2
The XML files are in different directories (this changes the base URI of the media elements). The media are in the same hierarchy. Media references in the source use relative URIs to refer to the underlying media: front/preface.xml refers to the “this is a test” audio clip as ../media/spinning-top.mp4.
- mo-3
The XML files are in different directories, but the structure is deeper. This scenario represents the case where there might be multiple books, each with their own media, but also a shared media folder “above” the book hierarchies. The media are in the same hierarchy, but some are “above” the book. Media references in the source use relative URIs to refer to the underlying media: book/front/preface.xml refers to the “this is a test” audio clip as ../../media/spinning-top.mp4.
- mo-4
The XML files are still in different directories, but the significant change here is that the media are in their own hierarchy. Media references in the source use URIs relative to the root of that hierarchy: book/front/preface.xml refers to the “this is a test” audio clip as spinning-top.mp4.
- mo-5
The XML files are in different directories and the media are in their own hierarchy. What’s different here is that the media hierarchy is further subdivided by media type. Media references in the source use URIs relative to the root of the media hierarchy without the media type: book/front/preface.xml still refers to the “this is a test” audio clip as spinning-top.mp4, but this time it is found in media/mp4/spinning-top.mp4 rather than directly in media.
For each arrangement, we look at five possible output structures:
1. A single HTML document with the media in the same relative locations as the sources.
2. A single HTML document with the media in a single media subdirectory.
3. “Chunked” HTML output with the media in the same relative locations as the sources.
4. “Chunked” HTML output with the media in custom locations. (This is especially tricky for the “mo-5” case because there are two kinds of customization involved.)
5. “Chunked” HTML output with the media in a single media subdirectory.
The list below gives a brief summary of the parameters used to achieve the desired results for each combination of input and output arrangements.
Remember that in each case, the questions are: can the stylesheets find the media files to query them and are the correct HTML references produced? Actually copying the media files from where they are in the source system to where they need to be in the HTML is “not our problem.”
- mo-1, mo-2, and mo-3 / scenario 1
No parameters are needed; this combination works correctly with the defaults.
- mo-1, mo-2, and mo-3 / scenario 2
mediaobject-output-base-uri = "media/"
mediaobject-output-paths = "false"
The output base URI is relative to the “root” of the HTML result. Setting the output paths to “false” removes intermediate hierarchy from the image references.
- mo-1, mo-2, and mo-3 / scenario 3
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
These parameters aren’t related to media objects; they just tell the stylesheets how and where to “chunk” the output.
- mo-1, mo-2, and mo-3 / scenario 4
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
This combination is really the same as the previous except that it uses a custom stylesheet with a template in the m:mediaobject-output-adjust mode to add an extra level of hierarchy to the output URIs. This is just an example of arbitrary, custom processing.
- mo-1, mo-2, and mo-3 / scenario 5
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
mediaobject-output-base-uri = "media/"
mediaobject-output-paths = "false"
The output base URI is relative to the “root” of the HTML result. Setting the output paths to “false” removes intermediate hierarchy from the image references.
- mo-4 / scenario 1
mediaobject-input-base-uri = "../media/"
The input base URI will be made absolute relative to the base URI of the input document, so it’s often convenient to specify it as a relative URI. It’s equally possible to specify it as an absolute URI.
- mo-4 / scenario 2
mediaobject-input-base-uri = "../media/"
mediaobject-output-base-uri = "media/"
mediaobject-output-paths = "true"
This example has two images with the same name in different directories, so it’s necessary to preserve the output paths.
- mo-4 / scenario 3
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
mediaobject-input-base-uri = "../media/"
This is the combination of chunking and a single media directory.
- mo-4 / scenario 4
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
mediaobject-input-base-uri = "../media/"
This combination is really the same as the previous except that it uses a custom stylesheet with a template in the m:mediaobject-output-adjust mode to add an extra level of hierarchy to the output URIs. This is just an example of arbitrary, custom processing.
- mo-4 / scenario 5
chunk = "index.html"
chunk-output-base-uri = "/path/to/output/location/"
mediaobject-input-base-uri = "../media/"
mediaobject-output-base-uri = "media/"
mediaobject-output-paths = "true"
This is effectively scenario 2 with chunking.
- mo-5 / scenarios 1-5
The “mo-5” scenarios are all the same as the “mo-4” scenarios with the addition of one more parameter:
mediaobject-grouped-by-type = "true"
In each case, this adds the extra “media object type” level to the URI path.
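If you would rather not pass these parameters on every invocation, they can be fixed in a customization layer. The following is a minimal sketch for the “mo-4 / scenario 5” combination. It assumes that the parameters accept the string values listed above, and /path/to/output/location/ remains a placeholder that you must replace.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Chunking: how and where the output is split -->
<xsl:param name="chunk" select="'index.html'"/>
<xsl:param name="chunk-output-base-uri"
           select="'/path/to/output/location/'"/>

<!-- Media: where to find the sources, where the HTML should point -->
<xsl:param name="mediaobject-input-base-uri" select="'../media/'"/>
<xsl:param name="mediaobject-output-base-uri" select="'media/'"/>
<xsl:param name="mediaobject-output-paths" select="'true'"/>

</xsl:stylesheet>
```

For the “mo-5” arrangement, the same layer would additionally set mediaobject-grouped-by-type to 'true'.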
If you download the source repository, you can see these combinations in action with the build targets “mo_number_test_scenario”. For example, run:
./gradlew mo_3_test_2
to see the results of processing “mo-3” in scenario 2. The output will be in the build/actual directory. The build target all_mo_tests will run them all.
3.6. Controlling numeration
Numeration refers to the process(es) by which sets, books, divisions, components, sections, and formal objects are numbered. There are three separate aspects to numeration: what’s numbered, where does numbering begin, and does the number inherit from its ancestors.
Consider this book:
<book>
  <title>Book title</title>
  <part>
    <title>Part title</title>
    <chapter>
      <title>Chapter title</title>
      <para>…</para>
    </chapter>
  </part>
  <part>
    <title>Another part title</title>
    <chapter>
      <title>Another chapter title</title>
      <para>…</para>
    </chapter>
    <chapter>
      <title>Yet another chapter title</title>
      <para>…</para>
    </chapter>
  </part>
</book>
Let’s suppose that parts are numbered “I” and “II”. (The number format is controlled by the localization, see Chapter 4, Localization.) If chapter numbering begins at the book level, those chapters will be numbered “1”, “2”, and “3”. If chapter numbering begins at the division level (the part), those chapters will be numbered “1”, “1”, and “2”. If division numbers are inherited, those numbers will be “I.1”, “II.1”, “II.2”.
In the 1.x versions of these stylesheets, all of the aspects of numeration were controlled by three now obsolete parameters: $component-numbers-inherit, $division-numbers-inherit, and $section-numbers-inherit. In the 2.x stylesheets, the various aspects can be controlled independently and the result is much more consistent, if a bit more complicated.
The default numeration parameters are designed to cover the most common use cases and are specified with strings so that they’re easy to control with parameters. Any numeration scheme can be implemented with a customization layer, but hopefully that will be necessary only rarely and in uncommon cases.
To simplify the problem, we divide the DocBook elements into six categories:
- sets
The set is the only member of this category.
- books
The book is the only member of this category.
- divisions
The division elements are part and reference.
- components
The component elements are acknowledgements, appendix, article, bibliography, chapter, colophon, dedication, glossary, index, partintro, preface, refentry, and setindex.
- sections
The section elements are section, sect1, sect2, sect3, sect4, sect5, and simplesect. The refentry section elements are not included because they are not typically numbered.
- formal objects
The formal objects are figure, table, example, equation, formalgroup, and procedure. There’s a bit of complexity here. A formalgroup that contains figures counts as a figure, a formalgroup that contains tables counts as a table, etc. An equation or procedure only counts as a formal object if it has a title.
Six parameters control where numbering starts (or restarts): $sets-number-from, $books-number-from, $divisions-number-from, $components-number-from, $sections-number-from, and $formal-objects-number-from. In each case, the value of the parameter must be the name of one of the categories. Sets and books can only number from sets, divisions can number from sets or books, components can number from sets, books, or divisions, etc. It is also possible to specify the value root to indicate that elements in the relevant category are numbered sequentially through the whole document.
To assure consistency, “numbering from” resets when the specified category or one of its ancestors is encountered. In other words, if you’re formatting a set of books and numbering components from divisions, the numbering resets when a new division, book, or set begins.
Six parameters control how numbers are inherited: $sets-inherit-from, $books-inherit-from, $divisions-inherit-from, $components-inherit-from, $sections-inherit-from, and $formal-objects-inherit-from. Like the “number from” parameters, each parameter takes the value of the categories above it. In this case, however, you can specify more than one category.
For example, the default value for formal objects is to inherit from “component section”. That means that the first figure in chapter 2 will be labeled “2.1” and the first figure in the first section in chapter 2 will be labeled “2.1.1”, etc. This most closely reproduces the numbering from the 1.x stylesheets.
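To make the interaction between the two groups of parameters concrete, here is a minimal sketch of a customization layer intended to produce the “I.1”, “II.1”, “II.2” chapter labels from the example book above. The singular token 'division' is an assumption inferred from the “component section” value quoted above; check the parameter reference for the exact spelling.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

<xsl:import href="docbook/xslt/docbook.xsl"/>

<!-- Restart chapter (component) numbering in each part (division).
     The token 'division' is assumed, not confirmed. -->
<xsl:param name="components-number-from" select="'division'"/>

<!-- Prefix each chapter number with the number of its part. -->
<xsl:param name="components-inherit-from" select="'division'"/>

</xsl:stylesheet>
```

Setting only the first parameter should give “1”, “1”, “2”; adding the second is what prepends the part number.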
3.6.1. Numeration overrides
Although the numeration parameters give you complete control over numeration, they aren’t simple to use. A few common cases can be handled with simpler settings: $division-numbers, $component-numbers, and $section-numbers. Each of these parameters is “true” by default and numeration of divisions, components, and sections is handled as described above. If these parameters are set to “false”, divisions, components, and sections, respectively, will not be numbered.
Numbering can also be controlled on a per-element basis with the db processing instruction. If the numbered pseudo-attribute is “true”, the division, component, or section in which that processing instruction occurs, and all of its descendants, will be numbered. If it’s “false”, the element and its descendants will not be numbered.
In this way, it would be possible to have a single chapter or article with numbered sections even in a book where sections are not normally numbered.
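A sketch of that case follows: a book processed with $section-numbers set to “false”, in which one chapter opts back in. The placement of the processing instruction as the first child of the chapter is an assumption, and the titles are invented for illustration.

```xml
<book xmlns="http://docbook.org/ns/docbook" version="5.1">
<title>Mostly unnumbered</title>

<chapter>
  <title>Narrative chapter</title>
  <!-- Sections here follow the global setting: not numbered -->
  <section>
    <title>Background</title>
    <para>…</para>
  </section>
</chapter>

<chapter>
  <?db numbered="true"?>
  <title>Reference chapter</title>
  <!-- This chapter and its descendants are numbered -->
  <section>
    <title>Options</title>
    <para>…</para>
  </section>
</chapter>

</book>
```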
3.7. Using glossaries
There are essentially two ways to manage glossaries: you can author them by hand, or you can compose them automatically from a collection of glossary entries.
In a glossary authored by hand, no special processing takes place. The entries appear as they are listed and every entry appears whether there is a corresponding glossary reference in the document or not. The author is free to use glossdiv elements to divide the glossary into sections and the document may have multiple glossary elements.
If you compose them from glossary collections, only the terms used in your document (in glossterm or firstterm elements) will appear in the glossary. The glossary collections can be managed internally or externally. If multiple definitions appear in the glossary collections, only the first definition is included.
The best way to explain automatic glossaries is to use an example. Let’s assume that you have marked the two terms Apple and Pear as glossterms in your document. Your automatic glossary should ultimately contain exactly two entries, one for each of those terms.
Create a glossary in your document and add auto to the role attribute on the glossary element. (If you’re using automatic glossaries, there should only be one glossary element in your document.) This is the internal glossary.
Even if your internal glossary has three entries, one each for Apple, Jackfruit and Pear, you will end up with a glossary in the generated document that has only two entries. There will be no entry for Jackfruit, since there is no corresponding glossterm or firstterm in the main part of your text.
You can also use external glossaries for this task, which can be identified by the $glossary-collection parameter, or by a db processing instruction with a glossary-collection pseudo-attribute in the root element.
If you use external glossaries, you can leave the internal, automatic glossary completely empty. As long as there are entries for Apple and Pear in one of your external glossaries, you will end up with those two entries in the generated glossary, even if the external glossaries contain many more terms.
You can use the internal, automatic glossary in conjunction with external glossaries. In this case, entries from the internal glossary take precedence over entries for the same term from external glossaries. Let’s say you have entries for Apple and Pear among others in your external glossary, and also a glossentry for Apple in your internal glossary. In this case you will end up with a glossary which contains two entries, one for Apple, with the definition taken from the internal glossary, and one for Pear with a definition from the external glossary.
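The Apple, Jackfruit, and Pear case described above might look like this in the source. It is a minimal sketch with an internal glossary only; the document title and the prose are invented, and the definitions are left elided.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.1">
<title>Fruit notes</title>

<!-- Only Apple and Pear are referenced in the text -->
<para>This note compares the <glossterm>Apple</glossterm>
and the <glossterm>Pear</glossterm>.</para>

<!-- role="auto" marks this as the internal, automatic glossary -->
<glossary role="auto">
  <title>Glossary</title>
  <glossentry>
    <glossterm>Apple</glossterm>
    <glossdef><para>…</para></glossdef>
  </glossentry>
  <glossentry>
    <glossterm>Jackfruit</glossterm>
    <glossdef><para>…</para></glossdef>
  </glossentry>
  <glossentry>
    <glossterm>Pear</glossterm>
    <glossdef><para>…</para></glossdef>
  </glossentry>
</glossary>

</article>
```

According to the rules above, the formatted glossary should contain the Apple and Pear entries and omit Jackfruit.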
Entries will appear in the glossary in the same order as they appear in the internal and external glossaries unless they are sorted. Sorting is controlled by the $glossary-sort-entries parameter.
An automatic glossary may have glossary divisions. Those are controlled by the $glossary-automatic-divisions parameter.
3.7.1. Using Schematron to manage the glossary
Schematron rules can help manage the glossary. The f:glossentries() function (defined in standalone-functions.xsl in the xslTNG install directory) has been designed so that it can be integrated into Schematron independently of the xslTNG stylesheets. You can use it to check whether a corresponding glossentry exists for a glossterm or firstterm while you are still writing. Corresponding Schematron schemas are not yet part of the xslTNG framework. Example 3.1, however, shows how you could use Schematron to check whether there is exactly one glossentry for the glossterm and firstterm elements in your document.
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <xsl:include xmlns:xsl="http://www.w3.org/1999/XSL/Transform" ①
               href="standalone-functions.xsl"/>

  <ns uri="http://docbook.org/ns/docbook" prefix="db"/>
  <ns uri="http://docbook.org/ns/docbook/functions" prefix="f"/> ②

  <pattern>
    <rule context="
        db:firstterm
        | db:glossterm[not(ancestor::db:glossary)]">
      <let name="term" value="((@baseform, .)[1])"/>
      <let name="n" value="count(f:glossentries(.))"/> ③

      <report role="error" test="$n eq 0">No entry for
        <value-of select="$term"/> in glossary.</report>

      <report role="warning" test="$n gt 1"><value-of select="$n"/>
        entries for <value-of select="$term"/> in glossary.</report>

    </rule>
  </pattern>

</schema>
- ①
Include the standalone-functions.xsl file. You have to adjust the path accordingly.
- ②
Provide a declaration for the namespace of functions from xslTNG.
- ③
Use the f:glossentries() function to get the number of matching glossentry elements for the given glossterm or firstterm.
If you want to use Schematron rules with external glossaries, it’s most convenient to use the db processing instruction to identify the external glossaries. The f:glossentries() function will load them automatically (see Example 3.2).
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
<?db glossary-collection='resources/glosscollection.xml' ?>
<title>My document</title>
…
</article>
glossary-collection parameter to Schematron
3.8. Using bibliographies
Bibliographies are more complicated than glossaries. Bibliography entries can be “cooked” (bibliomixed) or “raw” (biblioentry) and there’s no obvious way to sort bibliography entries in the general case.
There are also two different cross-referencing mechanisms for bibliographic entries: by ID, using biblioref or xref, or with citation that matches on abbrev elements in the bibliography entry.
Consider this example bibliography:
<bibliography>
<bibliomixed xml:id="bib.xml"><abbrev>XML</abbrev>Tim Bray,
Jean Paoli, …</bibliomixed>

<biblioentry><abbrev>MalerElAndaloussi96</abbrev>
  <title>Developing SGML DTDs</title> …</biblioentry>
</bibliography>
The first entry can be cited in two ways: <biblioref linkend="bib.xml"/> or <citation>XML</citation>. The second can only be cited with a <citation>MalerElAndaloussi96</citation> (it has no xml:id to link to).
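As a minimal sketch of how these citations might look in running text (the surrounding sentence is invented for illustration; only the element names and values come from the example bibliography above):

```xml
<para xmlns="http://docbook.org/ns/docbook">
  <!-- By ID: resolves against xml:id="bib.xml" -->
  The XML Recommendation <biblioref linkend="bib.xml"/> defines the syntax.
  <!-- By abbreviation: matches the abbrev of the first entry -->
  The same entry can be cited as <citation>XML</citation>.
  <!-- The second entry has no xml:id, so only citation works -->
  DTD design is discussed in <citation>MalerElAndaloussi96</citation>.
</para>
```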
Taking all of these variations into account, there are three ways to construct bibliographies:
- Entirely by hand.
- Managed by hand, with empty elements as placeholders.
- Automatically.
There are tradeoffs to each approach.
When external bibliographies are used, they can be identified either with the $bibliography-collection parameter or with db processing instructions with a bibliography-collection pseudo-attribute in the root element.
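By analogy with the glossary-collection processing instruction shown in Example 3.2, such a processing instruction might look like the following sketch. The path resources/bibcollection.xml is a hypothetical location, not one the stylesheets define.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <!-- Hypothetical path; point this at your own external bibliography. -->
  <?db bibliography-collection='resources/bibcollection.xml' ?>
  <title>My document</title>
  …
</article>
```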
3.8.1. Entirely by hand
In a bibliography constructed entirely by hand, no special processing takes place. The entries appear as they are listed and every entry appears whether it is cited or not. The author is free to use bibliodiv elements to divide the bibliography into sections and the document may have multiple bibliography elements.
3.8.2. Managed by hand
Bibliographic entries can be complex and may be shared across multiple documents. One approach to managing this complexity is to store the full entries in an external bibliography and use only placeholders in your actual document.
A placeholder is an empty entry with an xml:id, or an entry that contains only an abbrev (if it has both an xml:id and contains only an abbrev, the ID will be used to search for a matching entry in the external bibliography). It will be replaced by the full entry from the external bibliography when the document is formatted.
For example, if the full entries are available externally, the preceding bibliography example could be shortened to:
<bibliography>
  <bibliomixed xml:id="bib.xml"/>
  <biblioentry><abbrev>MalerElAndaloussi96</abbrev>
  </biblioentry>
</bibliography>
The entries appear in the order listed and every entry appears whether it is cited or not. The author is free to use bibliodiv elements to divide the bibliography into sections and the document may have multiple bibliography elements.
3.8.3. Automatic
An automatic bibliography is selected by using the token auto in the role attribute on a bibliography. Placeholders can still be used, but they are unnecessary when using citation for citations.
Any citations that appear in the text will be matched to entries in the external bibliographies. Those entries will be included automatically. Automatically added entries appear at the end of the bibliography (after any internal ones, if the internal bibliography has entries), in the order that they appear in the external bibliographies. If multiple external entries match, only the first is added.
Any bibliography entries in the internal bibliography that aren’t cited will be removed.
When using the automatic style, there should be only one bibliography in the document and it cannot contain divisions.
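A minimal sketch of a document using the automatic style might look like this. It assumes an external bibliography has already been identified (by parameter or processing instruction) and that it contains entries matching bib.xml and MalerElAndaloussi96; the title and paragraph text are invented for illustration.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>My document</title>
  <para>XML is defined in <biblioref linkend="bib.xml"/>; DTD design is
  covered in <citation>MalerElAndaloussi96</citation>.</para>

  <!-- The token "auto" in role selects the automatic style.
       Only one bibliography, and no bibliodiv divisions. -->
  <bibliography role="auto">
    <title>Bibliography</title>
    <!-- Optional placeholder; it is cited by ID above, so it is kept. -->
    <bibliomixed xml:id="bib.xml"/>
    <!-- No placeholder is needed for MalerElAndaloussi96: the citation
         element is matched against the external bibliographies. -->
  </bibliography>
</article>
```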
3.9. Creating something completely different
Your input documents go through several pre-processing steps before they are rendered into HTML. If you want to produce completely different outputs, the place to start is with the root template in the m:docbook mode.
Consider, for example, the task of creating a JSON version of the Table of Contents. In principle, you could write your own stylesheet to do this, but leveraging the DocBook xslTNG Stylesheets means you can make use of functions like f:generate-id() to create links.
To produce completely different results, override the root template in the m:docbook mode:
<xsl:template match="/" mode="m:docbook">
  <xsl:document>
    <!-- your processing here -->
  </xsl:document>
</xsl:template>
This template must return a document node.
Note that you can mix-and-match your processing with default processing by processing DocBook elements in the m:docbook mode.
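As a rough sketch of such mixing, the following customization layer wraps only the document's abstracts in a custom container and lets the default templates render their content. The choice of db:abstract and the class name abstracts are hypothetical, picked purely for illustration, and the import path must be adjusted to your local copy of the stylesheets. Whether later processing steps accept a result shaped like this is an assumption you should verify.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns:m="http://docbook.org/ns/docbook/modes"
                xmlns="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="db m"
                version="3.0">

  <!-- This href has to point to your local copy of the stylesheets. -->
  <xsl:import href="docbook/xslt/docbook.xsl"/>

  <!-- Custom root processing; the template must return a document node. -->
  <xsl:template match="/" mode="m:docbook">
    <xsl:document>
      <!-- Hypothetical wrapper element and class name. -->
      <div class="abstracts">
        <!-- Default processing: render each abstract with the
             stylesheets' own templates in the m:docbook mode. -->
        <xsl:apply-templates select="//db:abstract" mode="m:docbook"/>
      </div>
    </xsl:document>
  </xsl:template>

</xsl:stylesheet>
```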
Here is a simple example of a stylesheet that produces a JSON version of the Table of Contents for a DocBook document:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns:f="http://docbook.org/ns/docbook/functions"
                xmlns:m="http://docbook.org/ns/docbook/modes"
                xmlns:t="http://docbook.org/ns/docbook/templates"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="db f m t xs"
                version="3.0">

  <!-- This href has to point to your local copy
       of the stylesheets. -->
  <xsl:import href="docbook/xslt/docbook.xsl"/>

  <xsl:output method="text"/>

  <!-- Suppress xslTNG's default HTML output; note that this template
       must return a document node. -->
  <xsl:template match="/" mode="m:docbook">
    <xsl:document>
      <xsl:apply-templates select="." mode="TOC"/>
    </xsl:document>
  </xsl:template>

  <!-- The templates below generate a simple JSON ToC. -->

  <xsl:template match="/" mode="TOC">
    {"toc": [
      <xsl:apply-templates mode="TOC"/>
    ]}
  </xsl:template>

  <xsl:template match="db:part|db:article|db:section|db:chapter" mode="TOC"
                expand-text="yes">
    <xsl:if test="preceding-sibling::db:part
                  | preceding-sibling::db:article
                  | preceding-sibling::db:section
                  | preceding-sibling::db:chapter">, </xsl:if>
    {{
      "ref": "{f:generate-id(.)}",
      "title": "{normalize-space(db:info/db:title)}",
      "subtitle": "{normalize-space(db:info/db:subtitle)}",
      "items": [
        <xsl:apply-templates select="db:part|db:article|db:section|db:chapter" mode="TOC"/>
      ]
    }}
  </xsl:template>

  <xsl:template match="*" mode="TOC">
    <xsl:apply-templates select="*" mode="TOC"/>
  </xsl:template>
</xsl:stylesheet>
This example is meant as a starting point; it’s not robust as it only handles a few of the possible elements that might appear in a Table of Contents.
When processing documents this way, be aware that you are transforming the pre-processed, normalized versions of your input documents. For example, whether or not you put info wrappers around the titles of your sections, in the pre-processed input, titles always appear inside info wrappers. This normalization greatly simplifies processing in many places.
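To illustrate the info wrapper normalization, here is a sketch of a section as authored and as your templates would see it. Only the wrapper is shown; the pre-processed document may differ in other details that are not covered here, and the section content is invented for illustration.

```xml
<!-- As authored: the title is a direct child of section. -->
<section xmlns="http://docbook.org/ns/docbook">
  <title>Introduction</title>
  <para>…</para>
</section>

<!-- As seen in the m:docbook mode: the title sits inside info,
     which is why db:info/db:title is the path to select. -->
<section xmlns="http://docbook.org/ns/docbook">
  <info>
    <title>Introduction</title>
  </info>
  <para>…</para>
</section>
```

This is why the JSON example selects titles with db:info/db:title rather than db:title.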