Details to watch out for

	Chapter 24. Olinking between documents

Olinks provide the tremendous power of cross referencing between documents, but they have a price. Olinks introduce dependencies between documents that are not an issue with standalone documents. The documents in a collection must "play together", and so they must follow a few rules.

If you change a document, you should always regenerate its target.db data file. Once a collection is set up, this step is most easily done by processing a modified document with the parameter collect.xref.targets set to the value yes. That will make two passes through the document, the first to regenerate the target data file and the second to generate the normal output.
It is a good idea to enter id attributes (use xml:id in DocBook 5) on any element you might want to link to. Without an id attribute, the stylesheet will generate a value, but it will not necessarily be the same value on each processing run. For a stable target value, you must enter an id attribute. In HTML processing, if you set the stylesheet parameter id.warnings to 1, then you will get a warning about any titled elements that do not have an id attribute.
If you change a document, then you may need to reprocess other documents that make cross references to that data file. Such dependencies are most easily tracked using Makefiles or Ant tasks, so the update process can be automated.
The output locations specified in the sitemap element in the target database document must match where the HTML or PDF output actually lands. If they do not match, then the hot links you generate between documents will not reach the actual documents.
Whatever DocBook stylesheet (standard or customized) you use to process a document for output should also be used to process the document for extracting the target data. Only then can you be sure that the style and content of the cross references will match the document.
If you use profiling (conditional text), you need to keep your target data separate for the different profiles. See the section “Olinks with profiling (conditional text)” for how to do that.
If you are generating XHTML with the Saxon processor, then you may have a problem with your target.db data files. That processor always adds a DOCTYPE declaration to the data files, and a system entity cannot contain a DOCTYPE. If you use system entities in your main target database to load the data files, then it will fail when using Saxon. You can use xsltproc instead, or use XIncludes instead of system entities in your target database. See the section “Using XInclude in the database document” for more details on the latter option.
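As a rough illustration of the first two points above, a customization layer could set the relevant parameters once so that every run refreshes the target data. This is only a sketch: the import URI assumes the standard online location of the DocBook XSL stylesheets, and the targets.filename value simply repeats the target.db name used in this section.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock HTML stylesheet; adjust to your installation. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Two passes: first regenerate the target data file, then produce the normal output. -->
  <xsl:param name="collect.xref.targets">yes</xsl:param>

  <!-- Name of the target data file written during the first pass. -->
  <xsl:param name="targets.filename">target.db</xsl:param>

  <!-- Warn about titled elements that lack an id attribute (HTML processing). -->
  <xsl:param name="id.warnings" select="1"/>

</xsl:stylesheet>
```

Because the same customization is used for both passes, the extracted target data and the generated output come from the same stylesheet, as recommended above.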

Target database location

The location of the olink targets database is specified by the stylesheet parameter target.database.document. Note these features:

The target.database.document parameter has no default value, so you must always set this parameter if you are using olinks.
The parameter value can be a full path, but then it should be expressed using URI syntax since that is what the XSLT document() function expects. For example:
```
<xsl:param name="target.database.document">file:///c:/xml/tools/olinkdb.xml</xsl:param>
```
The parameter value can be a relative path, in which case it is taken as relative to the directory containing the document being processed. Since the parameter takes URI syntax, you can use forward slashes even on Windows. You can include ../ in the path to access directory levels above the current document.
You cannot use a relative path if you are using profile-docbook.xsl or profile-chunk.xsl. Those stylesheets create an in-memory copy of your document that does not have a base directory. Use a full path, or use two-pass profiling (see the section “Two-pass processing”).
An XML catalog entry can be used to map the parameter value to a specific location in a filesystem. Note that when using a Java processor, the parameter value must be a full path expression in order for the catalog entry to work, because Java will replace any relative path with an absolute path before the catalog resolver sees it.
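For comparison with the full path shown earlier, a relative path might look like the following sketch. The directory layout is hypothetical: it assumes the document being processed is in /docs/guides/mailuser/ and the shared database is /docs/olinkdb.xml.

```xml
<!-- Hypothetical layout: document in /docs/guides/mailuser/, database in /docs/.
     The path is resolved relative to the directory of the document being processed. -->
<xsl:param name="target.database.document">../../olinkdb.xml</xsl:param>
```

As noted above, this form is not suitable with profile-docbook.xsl or profile-chunk.xsl, where a full path is needed instead.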

If you are sharing a target database among several documents, as is common with olinking, you should put it in a path that is accessible to all documents in the collection. If the relative path from all your documents to the database is the same, then you can just put that path in the parameter. If the relative paths to the shared database are not the same, you have some choices:

Set the target.database.document parameter in the build script for each document directory, using an appropriate relative path for documents in that directory.
Set the parameter to a fixed full path to the database file.
Use a phony full path that is mapped to the actual location using an XML catalog file. See the section “Relative SYSTEM identifiers may not work” for more information on this trick.
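The last choice could be sketched with a catalog like the one below. Both paths are hypothetical: file:///olink/olinkdb.xml is an invented placeholder that every stylesheet would use as the value of target.database.document, and the right-hand location reuses the example path from earlier in this section. Which entry type the resolver consults can depend on the processor, so this sketch includes both a uri and a system entry for the same mapping.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">

  <!-- Map the phony full path (hypothetical) to the real database location (hypothetical). -->
  <uri name="file:///olink/olinkdb.xml"
       uri="file:///c:/xml/tools/olinkdb.xml"/>

  <!-- Same mapping expressed as a system identifier, for resolvers that look it up that way. -->
  <system systemId="file:///olink/olinkdb.xml"
          uri="file:///c:/xml/tools/olinkdb.xml"/>

</catalog>
```

With this approach, moving the database only requires editing the catalog, not each build script.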

Using a sitemap

One of the most powerful features of the olink system is the sitemap in the target database document. The sitemap is an XML structure that parallels the directory structure of your HTML or PDF output tree. By recording the output locations for all the documents in your olink database, the stylesheet can compute relative links between any two documents. The stylesheets compute the correct number of ../ steps to move up, and the right sequence of directory names to move down to locate a file. Relative links make your HTML highly portable, as long as you keep the same directory structure when you move the files.

If you put all your output in one directory, then you do not need to use a sitemap. You can omit the sitemap and dir elements, and just create a flat list of document elements as children of the targetset element in the database file. For PDF output or non-chunked HTML output, the baseuri attribute of each document element must still contain the filename of its PDF or HTML output file, because that name is not available to the stylesheet.
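A flat database of this kind might look like the following sketch, which assumes two non-chunked HTML documents written to the same output directory. The targetdoc identifiers, filenames, entity names, and DTD path are all hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset SYSTEM "file:///tools/docbook-xsl/common/targetdatabase.dtd" [
<!-- Hypothetical locations of each document's collected target data. -->
<!ENTITY ugtargets SYSTEM "userguide/target.db">
<!ENTITY agtargets SYSTEM "adminguide/target.db">
]>
<targetset>
  <!-- No sitemap or dir elements: all output lands in one directory. -->
  <!-- baseuri supplies the output filename, which the stylesheet cannot otherwise know. -->
  <document targetdoc="MailUserGuide" baseuri="userguide.html">
    &ugtargets;
  </document>
  <document targetdoc="MailAdminGuide" baseuri="adminguide.html">
    &agtargets;
  </document>
</targetset>
```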

Keep in mind that the sitemap records the HTML or PDF output hierarchy, not the XML source hierarchy. The location of your XML documents does not matter. Creating an output sitemap requires advance planning for your document collection. You need to decide the name and location of each directory containing output. If you change where you put your HTML or PDF files, be sure to update your sitemap as well.

For the sitemap to work, you have to set the current.docid parameter for each document you process. You set the parameter value to the targetdoc identifier for the current document. That informs the stylesheet of the starting point for computing relative references, since that information is not recorded in the document itself.
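One way to supply the parameter is a small per-document stylesheet, sketched below. The identifier MailUserGuide and the database path are hypothetical, and the import URI assumes the standard online location of the chunking stylesheet. Because the value differs for each document, it may be more convenient to pass it from a build script for each run instead.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock chunking stylesheet. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl"/>

  <!-- Shared olink database (hypothetical relative path). -->
  <xsl:param name="target.database.document">../../olinkdb.xml</xsl:param>

  <!-- Must match the targetdoc attribute of this document's entry in the database. -->
  <xsl:param name="current.docid">MailUserGuide</xsl:param>

</xsl:stylesheet>
```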

Here are some guidelines for understanding the sitemap feature. See Example 24.1, “Target database document” for examples.

The sitemap element itself must contain just a single top-level dir element that serves as a container for the other dir elements. The top-level name attribute is irrelevant, since it is never used in hrefs (it is always represented by ../).
The output directory hierarchy is represented by nested dir elements under the top level dir in the sitemap. Each dir element's name attribute must match the name of its output directory. Thus a sequence of dir descendants can represent part of a pathname.
A dir element can also contain one or more document elements. A typical setup will have terminal dirs containing a single document element, especially if that document is chunked. But a dir element can contain a document element and other dir elements, if that is your directory structure.
Each document element's targetdoc attribute value is the same document identifier used for olinking to that document. This identifier keys the stylesheet to the current document's location in the sitemap so it can compute relative paths from there to other documents.
The content of each document element is the set of target data collected for that document. This is usually inserted as a system entity reference, although XInclude can be used as well (see the section “Using XInclude in the database document”).
Non-chunked documents may need a baseuri attribute on their document element to indicate the HTML filename. This is necessary if the olink.base.uri parameter was not used to write the same filename into each href when collecting the document's target data. Do not use both the parameter and the attribute, or both will appear in the generated hrefs.
A directory can contain the output for more than one document. Expressed in the sitemap, this means a dir element can contain more than one document element. This feature is most useful for putting together several non-chunked documents. Chunked documents run the risk of duplicate filenames that would overwrite each other.
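The guidelines above can be pictured with a small database like the following. It is only a sketch, not a copy of Example 24.1: the directory names, targetdoc identifiers, entity names, and DTD path are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset SYSTEM "file:///tools/docbook-xsl/common/targetdatabase.dtd" [
<!-- Hypothetical locations of each document's collected target data. -->
<!ENTITY ugtargets  SYSTEM "guides/mailuser/target.db">
<!ENTITY agtargets  SYSTEM "guides/mailadmin/target.db">
<!ENTITY reftargets SYSTEM "reference/target.db">
]>
<targetset>
  <sitemap>
    <!-- Single top-level container; its name never appears in generated hrefs. -->
    <dir name="documentation">
      <dir name="guides">
        <!-- Terminal dirs, each holding one chunked document. -->
        <dir name="mailuser">
          <document targetdoc="MailUserGuide">&ugtargets;</document>
        </dir>
        <dir name="mailadmin">
          <document targetdoc="MailAdminGuide">&agtargets;</document>
        </dir>
      </dir>
      <dir name="reference">
        <!-- Non-chunked document: baseuri supplies its output filename. -->
        <document targetdoc="MailReference" baseuri="reference.html">&reftargets;</document>
      </dir>
    </dir>
  </sitemap>
</targetset>
```

Under this layout, with current.docid set to MailUserGuide, an olink into MailAdminGuide would be expected to start with ../mailadmin/, and one into MailReference with ../../reference/reference.html.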

