4. Implementation details

4.1 Customizing chunking

Chunking is controlled by the $chunk-include and $chunk-exclude parameters. These parameters are both strings that must contain an XPath expression.

For each node in the document, the $chunk-include parameter is evaluated. If it does not return an empty sequence, the element is considered a chunking candidate. In this case, the $chunk-exclude parameter is evaluated. If the exclude expression does return an empty sequence, then the element identified becomes a chunk. (If the exclude expression returns a non-empty value, the element will not become a chunk.)

4.2 Lengths and units

Lengths appear in the context of images (width and height) and tables (column widths). Several different units of length are possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*), and percentages (e.g., 25%). In some contexts, these can be combined: a column width of “3*+0.5in” should have a width equal to 3 times the relative width plus ½ inch.

In practice, some of the more complicated forms in TR 9502:1995 have no direct mapping to the units available in HTML and CSS. The stylesheets attempt to specify a mapping that’s close. Broadly, they take the nominal width of the table ($nominal-page-width, subtract out the fixed widths, divide up the remaining widths proportionally among the relative widths, and compute final widths. The final widths can be expressed either in absolute terms or as percentages.

In handling the width and height of images, the intrinsic width and height of the image in pixels are converted into lengths by dividing by $pixels-per-inch. Nominal widths are taken into consideration if necessary.

🛈︎
Note

Determining the intrinsic size of an image depends on an extension function. See Section 2.5, Extension functions. Many bitmap image formats are supported. The bounding box of EPS images is used, if it’s present. The intrinsic size of SVG images is not available.

The list of recognized units (in, cm, etc.) are taken from $v:unit-scale.

4.3 Verbatim styles

There are three verbatim styles: lines, plain, and raw.

lines

In the lines style, each line of the verbatim environment is marked up individually. In this style, lines can be numbered and callouts can be inserted.

plain

In the plain style, callouts can be inserted, but additional markup is not added (except for the callouts). Consequently, it isn’t possible to do line numbering or syntax highlighting. (It may be possible to provide these features with JavaScript libraries in the browser, however.)

raw

In the raw style, no changes are made to the verbatim content. It’s output as it appears. Inline markup that it contains, emphasis or other elements, will be processed, but you cannot add line numbers, callouts, or syntax highlighting.

Consult Reference I, Parameter reference for a variety of parameters that control aspects of verbatim processing.

4.4 Mediaobject URIs

Media object (images, video, audio) URIs are tricky to handle. It’s most convenient if the URIs in the source documents point to the actual media files. This allows extensions, like the image properties extension function, to access the files. At the same time, the references generated in the HTML have to point to the locations where they will be published.

The stylesheets attempt to handle this by using $mediaobject-input-base-uri to locate the files from the XML sources and $mediaobject-output-base-uri to create the HTML links.

If you need a completely different organization of images in the input and output, you may need to define your own f:resolve-object-uri() function.

4.5 Templates

It’s difficult to make title pages for documents easy to customize. There is a lot of variation between documents and the styles can have very precise design constraints. At the end of the day, if you need complete control, you can define a template that matches the element in the m:generate-titlepage mode and generate all of the markup you wish.

The default title page handling attempts to make some declarative customization possible by using templates. A typical header template looks like this:

1<db:section>
  <header>
    <tmp:apply-templates select="db:title">
      <h2><tmp:content/></h2>
5    </tmp:apply-templates>
    <tmp:apply-templates select="db:subtitle">
      <h3><tmp:content/></h3>
    </tmp:apply-templates>
  </header>
10</db:section>

Any HTML element in the template will be copied to the output. The semantics of a “template apply templates” element (tmp:apply-templates) is that it runs the ordinary xsl:apply-templates instruction on the elements that match its select expression. If they result is the empty sequence (e.g, if there is no subtitle), nothing is output. If there is a result, the content of the tmp:apply-templates element is processed. Anywhere that tmp:content appears, the result of applying templates will be output.

In this example, if the title is “H2O” and there is no subtitle, the resulting HTML title page will be:

<header>
  <h2>H<sub>2</sub>O</h2>
</header>