The index page number problems described in the previous section cannot be solved by the DocBook XSL stylesheets because the page number for a given indexterm is not known in the XSLT step. Text is placed on pages by the XSL-FO processor, which does not necessarily recognize that text is an index entry. Also, there are no properties in the XSL-FO standard to consolidate page ranges.
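To make the missing consolidation concrete, the sketch below shows in plain shell and awk what collapsing a list of page numbers involves: duplicates are dropped and runs of consecutive pages become ranges. It is only an illustration. The function name collapse_pages and the sample numbers are invented here, and this is not how any FO processor or the DocBook stylesheets do the work.

```shell
#!/bin/sh
# collapse_pages (hypothetical helper): read page numbers from stdin,
# separated by spaces, commas, or newlines, and print them with
# duplicates removed and consecutive runs folded into ranges.
collapse_pages() {
    tr -s ' ,' '\n\n' | sort -n -u | awk '
        # Append the current run (start..prev) to the output string.
        function flush() {
            if (start == "") return
            out = out (out == "" ? "" : ", ") (start == prev ? start : start "-" prev)
        }
        NF == 0 { next }                      # skip blank lines
        {
            # Extend the current run when the page follows the previous one.
            if (start != "" && $1 == prev + 1) { prev = $1; next }
            flush(); start = $1; prev = $1    # otherwise begin a new run
        }
        END { flush(); print out }
    '
}

# Sample page list for one index entry (made-up numbers).
echo "12 12 13 14 27 30 31" | collapse_pages
# prints: 12-14, 27, 30-31
```

The point of the sketch is that this step needs the final page numbers as input, which is exactly the information the XSLT step does not have.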
Some FO processors, such as XEP and Antenna House, have extension functions that can be used to fix up index page numbers. The DocBook XSL stylesheets output these indexing extensions if the xep.extensions parameter or the axf.extensions parameter, respectively, is set to 1. The FOP processor does not yet have such extensions.
For FOP, one solution to this problem is to extract page number information from the PDF output file, and then use that to fix up the FO file. This method is described briefly on the reference page for the make.index.markup parameter. The following is a summary of the procedure.
You need a utility named pstotext to extract information from PDF files. It is available packaged in an RPM for Linux from http://rpmfind.net.
Process your document containing an empty <index/> element with the fo/docbook.xsl stylesheet with the make.index.markup parameter set to 1. That will generate the index but will insert it as XML markup in the FO file. For example:
xsltproc -o mybook.fo \
    --stringparam make.index.markup 1 \
    fo/docbook.xsl mybook.xml
Convert the FO file to PDF using your favorite XSL-FO processor.
Run the pdf2index Perl script on your PDF file and save the output to a file:
fo/pdf2index mybook.pdf > myindex.xml
The resulting myindex.xml file contains an index marked up with DocBook index elements, with page information inserted as well.
Replace the empty <index/> element in your document with the contents of this generated file. You can do it with a system entity or XInclude.
Process your document again with fo/docbook.xsl and your favorite XSL-FO processor, this time omitting the make.index.markup parameter.
The result of this process is a PDF file for your document that contains an index with page numbers properly collapsed. Duplicate numbers should be removed, and sequences of consecutive pages should appear as page ranges.
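The replacement step in the procedure above (swapping the empty <index/> element for the generated file) can be sketched with XInclude as follows. This is a minimal sketch, not part of the DocBook distribution: the function name insert_index_include is invented here, and it assumes that the empty index element is written literally as <index/> and that myindex.xml has a single root element that is valid at that position.

```shell
#!/bin/sh
# insert_index_include FILE (hypothetical helper): print FILE with the
# empty <index/> element replaced by an XInclude reference to myindex.xml.
# The xi namespace is declared on the include element itself, so the
# document's root element does not need to be edited.
insert_index_include() {
    sed 's|<index */>|<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="myindex.xml"/>|' "$1"
}

# Small demonstration on a throwaway file.
demo=$(mktemp)
printf '<book>\n  <title>Demo</title>\n  <index/>\n</book>\n' > "$demo"
insert_index_include "$demo"
rm -f "$demo"
```

With a real document you might run insert_index_include mybook.xml > mybook-indexed.xml (the output file name is an arbitrary choice). The final stylesheet pass would then need XInclude processing turned on, for example with the --xinclude option of xsltproc, so that the generated index is pulled in before the FO file is produced.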
DocBook XSL: The Complete Guide - 4th Edition
Copyright © 2002-2007 Sagehill Enterprises