Tab expansion

Program listings often use tab characters to indent lines of code. When such a listing is imported into DocBook XML and formatted, the tab characters are not expanded as they are in a program editor, because neither the HTML nor the XSL-FO standard describes tab stops or tab expansion. By default, an XSL-FO processor treats a tab character as a single space, which leads to unsatisfactory results.

One solution is to prepare each such program listing by converting each tab character to a suitable number of space characters. The algorithm is not quite as simple as replacing each tab with a fixed number of spaces, because tab stops sit at fixed column positions, so the number of spaces a tab needs depends on where in the line it occurs. If you have only a few listings and you do not have to update them in the future, this is a feasible solution. If you are importing many such program listings, or they need to be updated on a regular basis, then converting the tabs ahead of processing is a tedious chore.
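To illustrate why the conversion depends on column position, here is a minimal sketch of tab expansion in Java. It is not the code of any particular tool; the class name `TabExpander` and the method `expandTabs` are hypothetical names chosen for this example, and it assumes every character other than a tab or line break occupies exactly one column.

```java
/** Illustrative sketch only: expands tabs to spaces using fixed tab stops. */
final class TabExpander {

    /**
     * Replaces each tab with enough spaces to reach the next tab stop.
     * Tab stops are assumed to fall at every multiple of tabWidth columns.
     */
    static String expandTabs(String text, int tabWidth) {
        if (tabWidth <= 0) {
            throw new IllegalArgumentException("tabWidth must be positive");
        }
        StringBuilder out = new StringBuilder();
        int column = 0; // zero-based column within the current line
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\t') {
                // Distance to the next tab stop, always between 1 and tabWidth.
                int spaces = tabWidth - (column % tabWidth);
                for (int j = 0; j < spaces; j++) {
                    out.append(' ');
                }
                column += spaces;
            } else if (c == '\n' || c == '\r') {
                out.append(c);
                column = 0; // a line break resets the column count
            } else {
                out.append(c);
                column++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Spaces are shown as dots so the padding is visible.
        System.out.println(expandTabs("ab\tc", 8).replace(' ', '.'));
    }
}
```

In this sketch a tab that follows two characters becomes six spaces, while a tab at the start of a line becomes eight, which is why a fixed-count search and replace does not give the right result.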

There is an extension function written in Java for the Saxon XSL processor that will expand tab characters so they appear formatted as in a program editor. The extension was written by Tomas Hajek and is available for download from http://www.tomashajek.net/. Here are the steps for using this extension with the Saxon processor.

  1. Download the saxon.tomashajek.net1.0.jar file from his website, install it in a convenient location, and add it to your Java CLASSPATH.

  2. In your customization layer, add the following to your stylesheet root element:

    <xsl:stylesheet  
        xmlns:hajek="http://net.tomashajek.saxon.Tabify"
        exclude-result-prefixes="hajek"
        ...

    If your stylesheet already has an exclude-result-prefixes attribute, then just add the hajek prefix name to the list.

  3. Add the following template that executes the extension function if it is available:

    <xsl:template match="text()">
      <xsl:choose>
        <xsl:when test="function-available('hajek:tabify')">
          <xsl:value-of select="hajek:tabify(.,8)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>
    

    The second argument to the tabify() function (8, in this case) indicates the spacing of tab stops.

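Once this is set up, program listings that contain tab characters should be expanded when you process them. Because the template copies the text unchanged when function-available() returns false, a missing extension produces no error message. If the tabs are not expanded, the most likely cause is that the processor cannot find the function because of a problem with the Java CLASSPATH.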
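One way to investigate a CLASSPATH problem is to ask Java directly whether it can load the extension class. The following is a rough sketch under stated assumptions: the helper class `ClasspathCheck` is hypothetical, and the default class name `net.tomashajek.saxon.Tabify` is only inferred from the namespace URI used above, so confirm it against the contents of the jar file.

```java
/** Illustrative sketch only: reports whether a class can be loaded from the classpath. */
final class ClasspathCheck {

    /** Returns true if the named class can be loaded by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it depends on could not be loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name, inferred from the namespace URI; pass another name to override.
        String name = args.length > 0 ? args[0] : "net.tomashajek.saxon.Tabify";
        System.out.println(name + (isOnClasspath(name)
                ? " was found"
                : " was NOT found on the classpath"));
    }
}
```

Run it with the same CLASSPATH setting that you use for Saxon. If it reports that the class was not found, the jar file is probably missing from the CLASSPATH or its path is misspelled.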