4. Implementation details
This section sketches out some features of the implementation. It would probably be better to build an annotated Definitive Guide or something, but this will have to do for now.
4.1 Customizing chunking
Chunking is controlled by the $chunk-include and $chunk-exclude parameters. These parameters are both strings that must contain an XPath expression.
For each node in the document, the $chunk-include parameter is evaluated. If it does not return an empty sequence, the element is considered a chunking candidate. In this case, the $chunk-exclude parameter is evaluated. If the exclude expression does return an empty sequence, then the element identified becomes a chunk. (If the exclude expression returns a non-empty value, the element will not become a chunk.)
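The two-step decision described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the stylesheets' implementation: the becomes_chunk function, the dictionary-based nodes, and the two lambda "expressions" are hypothetical stand-ins for real XPath evaluation.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for an XML node: just a name and a parent name.
Node = dict

# An "expression" here is any callable that returns a (possibly empty)
# sequence, mimicking the result of evaluating an XPath expression.
Expression = Callable[[Node], Sequence]


def becomes_chunk(node: Node, include: Expression, exclude: Expression) -> bool:
    """Apply the include test first, then the exclude test."""
    if not include(node):       # empty sequence: not a chunking candidate
        return False
    return not exclude(node)    # a candidate is a chunk only if exclude is empty


# Assumed example rules: chapters and sections are candidates,
# but sections inside an appendix are excluded.
include = lambda n: [n] if n["name"] in ("chapter", "section") else []
exclude = lambda n: [n] if n.get("parent") == "appendix" else []

print(becomes_chunk({"name": "chapter", "parent": "book"}, include, exclude))
print(becomes_chunk({"name": "section", "parent": "appendix"}, include, exclude))
print(becomes_chunk({"name": "para", "parent": "section"}, include, exclude))
```

The exclude test is consulted only for nodes that already passed the include test, which mirrors the order of evaluation described in the text.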
4.2 Lengths and units
Lengths appear in the context of images (width and height) and tables (column widths). Several different units of length are possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*), and percentages (e.g., 25%). In some contexts, these can be combined: a column width of “3*+0.5in” should have a width equal to 3 times the relative width plus ½ inch.
In practice, some of the more complicated forms in TR 9502:1995 have no direct mapping to the units available in HTML and CSS. The stylesheets attempt to specify a mapping that’s close. Broadly, they take the nominal width of the table ($nominal-page-width), subtract out the fixed widths, divide up the remaining width proportionally among the relative widths, and compute final widths. The final widths can be expressed either in absolute terms or as percentages.
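The arithmetic behind that mapping can be sketched as follows. This is a simplified model under stated assumptions, not the stylesheets' code: the function names are hypothetical, the UNIT_INCHES table is an assumed stand-in (it is not the content of $v:unit-scale), percentage column specifications are not handled, and every relative term must carry an explicit number.

```python
import re

# Assumed conversion factors to inches (hypothetical, for illustration only).
UNIT_INCHES = {"in": 1.0, "cm": 1 / 2.54, "mm": 1 / 25.4, "pt": 1 / 72, "pc": 1 / 6}

_TERM = re.compile(r"^(\d+(?:\.\d+)?|\.\d+)(\*|[a-z]+)$")


def parse_width(spec):
    """Return (relative_parts, fixed_inches) for a spec such as '3*+0.5in'."""
    relative, fixed = 0.0, 0.0
    for term in spec.split("+"):
        match = _TERM.match(term.strip())
        if match is None:
            raise ValueError(f"unrecognized width term: {term!r}")
        value, unit = float(match.group(1)), match.group(2)
        if unit == "*":
            relative += value
        elif unit in UNIT_INCHES:
            fixed += value * UNIT_INCHES[unit]
        else:
            raise ValueError(f"unrecognized unit: {unit!r}")
    return relative, fixed


def column_widths(specs, nominal_inches):
    """Subtract the fixed widths, then share the remainder among relative parts."""
    parsed = [parse_width(spec) for spec in specs]
    total_fixed = sum(fixed for _, fixed in parsed)
    total_relative = sum(relative for relative, _ in parsed)
    remaining = max(nominal_inches - total_fixed, 0.0)
    per_part = remaining / total_relative if total_relative else 0.0
    return [fixed + relative * per_part for relative, fixed in parsed]


def as_percentages(widths):
    """Express absolute widths as percentages of their total."""
    total = sum(widths)
    return [100.0 * width / total if total else 0.0 for width in widths]


widths = column_widths(["1in", "3*+0.5in", "1*"], nominal_inches=6.0)
print(widths)                  # widths in inches
print(as_percentages(widths))  # the same widths as percentages
```

With a nominal width of 6in, the fixed parts total 1.5in, leaving 4.5in for four relative parts (1.125in each), so the three columns come out at 1in, 3.875in, and 1.125in.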
In handling the width and height of images, the intrinsic width and height of the image in pixels are converted into lengths by dividing by $pixels-per-inch. Nominal widths are taken into consideration if necessary.
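As a worked example of that division, here is a minimal sketch. The function name is hypothetical, and the value of 96 pixels per inch is an assumption chosen for illustration, not a statement about the parameter's default.

```python
def intrinsic_size_inches(width_px, height_px, pixels_per_inch):
    """Convert an intrinsic pixel size into lengths in inches."""
    return width_px / pixels_per_inch, height_px / pixels_per_inch


# Assumed resolution of 96 pixels per inch.
width_in, height_in = intrinsic_size_inches(192, 96, pixels_per_inch=96.0)
print(width_in, height_in)
```

A 192 by 96 pixel image at that resolution is treated as 2in wide and 1in high.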
Determining the intrinsic size of an image depends on an extension function. See Section 2.5, Extension functions. Many bitmap image formats are supported. The bounding box of EPS images is used, if it’s present. The intrinsic size of SVG images is not available.
The list of recognized units (in, cm, etc.) is taken from $v:unit-scale.
4.3 Verbatim styles
There are three verbatim styles: lines, plain, and raw.
lines
In the lines style, each line of the verbatim environment is marked up individually. In this style, lines can be numbered and callouts can be inserted.
plain
In the plain style, callouts can be inserted, but additional markup is not added (except for the callouts). Consequently, it isn’t possible to do line numbering or syntax highlighting. (It may be possible to provide these features with JavaScript libraries in the browser, however.)
raw
In the raw style, no changes are made to the verbatim content. It’s output as it appears. Inline markup that it contains, emphasis or other elements, will be processed, but you cannot add line numbers, callouts, or syntax highlighting.
Consult Reference I, Parameter reference for a variety of parameters that control aspects of verbatim processing.
4.4 Mediaobject URIs
Media object (images, video, audio) URIs are tricky to handle. It’s most convenient if the URIs in the source documents point to the actual media files. This allows extensions, like the image properties extension function, to access the files. At the same time, the references generated in the HTML have to point to the locations where they will be published.
The stylesheets attempt to handle this by using $mediaobject-input-base-uri to locate the files from the XML sources and $mediaobject-output-base-uri to create the HTML links.
If you need a completely different organization of images in the input and output, you may need to define your own f:resolve-object-uri() function.
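The idea of resolving one media reference against two different bases can be illustrated like this. It is a sketch of the general approach only: the resolve_media function, the example URIs, and the assumption that both bases are combined with the reference by ordinary URI resolution are mine, and the stylesheets' actual rules may differ.

```python
from urllib.parse import urljoin


def resolve_media(fileref, input_base, output_base):
    """Return (source_uri, link_uri) for a relative media reference.

    source_uri is where an extension would read the file;
    link_uri is what would appear in the generated HTML.
    """
    source_uri = urljoin(input_base, fileref)
    link_uri = urljoin(output_base, fileref)
    return source_uri, link_uri


# Hypothetical locations, for illustration only.
source_uri, link_uri = resolve_media(
    "images/fig1.png",
    input_base="file:///docs/src/",
    output_base="https://example.com/book/media/",
)
print(source_uri)
print(link_uri)
```

Keeping the two bases separate lets the source documents point at real files while the published HTML points at the deployed location.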
4.5 Templates
It’s difficult to make title pages for documents easy to customize. There is a lot of variation between documents and the styles can have very precise design constraints. At the end of the day, if you need complete control, you can define a template that matches the element in the m:generate-titlepage mode and generate all of the markup you wish.
The default title page handling attempts to make some declarative customization possible by using templates. A typical header template looks like this:
<db:section>
  <header>
    <tmp:apply-templates select="db:title">
      <h2><tmp:content/></h2>
    </tmp:apply-templates>
    <tmp:apply-templates select="db:subtitle">
      <h3><tmp:content/></h3>
    </tmp:apply-templates>
  </header>
</db:section>
Any HTML element in the template will be copied to the output. The semantics of a “template apply templates” element (tmp:apply-templates) are that it runs the ordinary xsl:apply-templates instruction on the elements that match its select expression. If the result is the empty sequence (e.g., if there is no subtitle), nothing is output. If there is a result, the content of the tmp:apply-templates element is processed. Anywhere that tmp:content appears, the result of applying templates will be output.
In this example, if the title is “H2O” and there is no subtitle, the resulting HTML title page will be:
<header>
<h2>H<sub>2</sub>O</h2>
</header>
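The behavior of the header template can be mimicked with a small Python model. It is a deliberately simplified sketch: the expand_header function, the (select, wrapper) pairs, and the "{content}" placeholder are hypothetical devices standing in for tmp:apply-templates and tmp:content, and the rendered dictionary stands in for the result of applying templates.

```python
def expand_header(template, apply_templates):
    """Expand a list of (select, wrapper) pairs into a header string.

    apply_templates(select) returns the rendered result for the selected
    elements, or an empty string if nothing matched.
    """
    parts = []
    for select, wrapper in template:
        result = apply_templates(select)
        if not result:
            continue  # empty result: the whole wrapper is skipped
        parts.append(wrapper.replace("{content}", result))
    return "<header>" + "".join(parts) + "</header>"


# Model of the template shown above.
HEADER_TEMPLATE = [
    ("db:title", "<h2>{content}</h2>"),
    ("db:subtitle", "<h3>{content}</h3>"),
]

# Assumed rendering results: a title but no subtitle.
rendered = {"db:title": "H<sub>2</sub>O"}

html = expand_header(HEADER_TEMPLATE, lambda select: rendered.get(select, ""))
print(html)
```

Because the subtitle produces no result, its h3 wrapper is omitted entirely rather than being output empty.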
4.6 Annotations
The stylesheets fully support annotations, including a number of presentation styles enabled by JavaScript in the browser. They also support an extension of the documented semantics of annotation.
Annotations are applied to elements with links. Either the element must point to its annotations (with an annotations attribute) or the annotations must point to the elements they annotate (with an annotates attribute). These are documented as ID/IDREF links but they are not IDREFS attributes because annotations may be stored separately.
Starting in version 1.5.1, the DocBook xslTNG Stylesheets support a non-standard extension: if you place the string xpath: in the annotates attribute of an annotation, then the rest of the attribute is assumed to contain an XPath expression that points to the element(s) to which the annotation applies. (If you want to put IDREF values before the xpath: token, that’s fine, but you can’t put them after; the expression continues to the end of the attribute value.)
Suppose, for example, that you wanted to annotate the stylesheet title in the previous paragraph. The standard mechanisms would require that you either put an xml:id attribute on the element or point to the annotation from the element. With the XPath extension, you can do this:
<annotation
    annotates="xpath:preceding-sibling::db:para/db:citetitle"
    xmlns:db="http://docbook.org/ns/docbook">
  <para>This annotation applies to the stylesheet title.
  For a discussion of this annotation, see the
  following paragraphs.</para>
</annotation>
When the XPath expression is evaluated, the annotation element is the context item. Often, this means that you’ll want to start the expression with id() or /.
The namespace context for the expression is also the annotation element; that’s why I’ve added the DocBook namespace binding for the db prefix. In practice, if you’re doing this on several annotations, you can just put all the namespace bindings on a common ancestor. All of the bindings in scope on the annotation element are available in the expression.
Caveats:
There’s no way to have multiple XPath expressions. You can’t put “xpath:” in there twice. If you want an annotation to apply to multiple elements, you’ll have to construct a single expression that selects all the elements, or duplicate the annotation, or use ID/IDREF links. If this turns out to be a serious limitation in practice, additional syntax could be added to support multiple expressions, but it doesn’t seem necessary.
You can only select elements. There’s no way to select the third word in a particular paragraph, for example, unless it already has some markup around it. There’s also no way to select a comment or a processing instruction.
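The attribute syntax described above (optional IDREF values, then a single xpath: token whose expression runs to the end of the value) can be sketched as a small parser. This is an illustrative model only: the parse_annotates function is hypothetical and is not how the stylesheets themselves read the attribute.

```python
def parse_annotates(value):
    """Split an annotates value into (idrefs, xpath_expression_or_None)."""
    marker = "xpath:"
    position = value.find(marker)
    if position < 0:
        # No expression: the whole value is a list of IDREF tokens.
        return value.split(), None
    idrefs = value[:position].split()
    # Everything after the first marker is the expression, so a second
    # "xpath:" would simply be part of that one expression.
    expression = value[position + len(marker):].strip()
    return idrefs, expression


print(parse_annotates("intro ch1"))
print(parse_annotates("xpath:preceding-sibling::db:para/db:citetitle"))
print(parse_annotates("intro xpath:id('ch1')/db:title"))
```

Treating everything after the first marker as one expression is what makes it impossible to list IDREF values after the expression or to supply two expressions.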
The placement of the annotation marker (“⌖” by default) can also be controlled globally and on individual annotations. The $annotation-placement parameter provides global control. To specify the position for an individual annotation, put the token “before” or “after” in the role attribute on the annotation.