Entities with DocBook 5

When you switch from DTDs to RelaxNG (or to W3C XML Schema, for that matter), you lose the ability to define XML entities in the schema. Neither RelaxNG nor the XML Schema language provides a mechanism for declaring entities. Although there were probably good reasons for these decisions, the omission comes as a surprise to those who find entities to be a very useful feature of XML.

You can still use entities in DocBook 5, but you cannot declare them in the RelaxNG schema. Instead, you must reference your entity declarations in the DOCTYPE declaration of each document that needs them.

An XML DOCTYPE declaration can have two parts. The external DTD subset is the part that is referenced by the PUBLIC and SYSTEM identifiers of the document's DTD. The internal DTD subset consists of any DTD declarations inside the document itself, enclosed within a set of square brackets within the DOCTYPE. When using RelaxNG as your DocBook schema, you can skip the external subset because you are not using a DTD, and just declare an internal subset. See the following example.

<?xml version="1.0"?>
<!DOCTYPE book [
<!ENTITY company "Acme Widgets, Inc.">
<!ENTITY product "Top Widget">
...
]>
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
...

With these declarations in place, you can use entity references in your document and they will be valid. See the next section for a method of storing these declarations in a separate file.
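
As a minimal sketch of how the declared entities are then used, the following complete document references the company and product entities in its text. The titles and paragraph wording are invented for illustration only.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
<!ENTITY company "Acme Widgets, Inc.">
<!ENTITY product "Top Widget">
]>
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <!-- Illustrative content only. The parser replaces each entity
       reference with the text from the matching declaration above. -->
  <title>&product; User Guide</title>
  <chapter>
    <title>Introduction</title>
    <para>&company; is pleased to introduce &product;.</para>
  </chapter>
</book>
```

Because the parser expands the references before validation, the RelaxNG validator sees only the replacement text.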

Separate DocBook 5 entities file

Keeping a consistent set of entities in every document's DOCTYPE becomes a maintenance headache if you have many documents. Instead, put all your entity declarations in a separate file and reference it using a parameter entity. A parameter entity is an entity that is used only within DTD declarations. In this case, you will use a parameter system entity to reference your external file of entity declarations. The following describes how it is done:

  1. Create a file such as myentities.ent containing your entity declarations:

    <!ENTITY company "Acme Widgets, Inc.">
    <!ENTITY product "Top Widget">
    <!ENTITY productversion "11.3">
    ...
  2. In each document's DOCTYPE, declare a parameter system entity and then immediately reference it:

    <?xml version="1.0"?>
    <!DOCTYPE book [
    <!ENTITY % myent SYSTEM "/path/to/myentities.ent">
    %myent;
    ]>
    <book xmlns="http://docbook.org/ns/docbook" version="5.0">
    ...

If the path to the entities file resolves properly, then the entities in it are available to be used in the document. When you update the entity declarations in myentities.ent, the changes will apply to all documents that reference the file.

If you want to swap in a different entities file, you can use an XML catalog to map the system identifier in the parameter entity declaration to another filename. This can be useful if you maintain several parallel entity sets that you use for different products, for example. You can choose an entity set at runtime by selecting a different catalog file. See Chapter 5, XML catalogs for more information.
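
A catalog entry for such a swap might look like the following sketch. The replacement file name widgetpro-entities.ent and its location are hypothetical, and how a given catalog resolver normalizes the system identifier before matching is an assumption you should check against your own tools.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the system identifier written in each DOCTYPE to a
       different entities file (hypothetical name and path). -->
  <system
      systemId="/path/to/myentities.ent"
      uri="file:///path/to/widgetpro-entities.ent"/>
</catalog>
```

Keeping one such catalog per product lets the documents themselves stay unchanged while the catalog selected at runtime decides which entity set is loaded.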

DocBook character entities

Even if you do not define your own entities, you may want to make use of the predefined entities in DocBook for the various special characters. The named character entities described in the section “Special characters” are normally included in the DocBook DTD that is referenced by a document. If you are using RelaxNG instead of a DTD, those entities are not automatically available.

To use the DocBook character entities, you need to locate a set of files that include entity declarations such as the following:

<!ENTITY bull             "&#x02022;" ><!--BULLET -->
<!ENTITY caret            "&#x02041;" ><!--CARET INSERTION POINT -->
<!ENTITY check            "&#x02713;" ><!--CHECK MARK -->
<!ENTITY cir              "&#x025CB;" ><!--WHITE CIRCLE -->
...

The official source for these entity declarations is the W3C website http://www.w3.org/2003/entities/. The entities commonly used in DocBook are under the heading “ISO 8879” on the website. Recent versions of the DocBook version 4 DTD included copies of those entity declarations for convenience. Starting with DocBook 5, they are no longer included.

To use the W3C entities, you can download the declaration files and reference them with parameter entities as described in the previous section. Or you can directly reference them over the Internet by putting a URL in the system identifier of the parameter system entity. Internet access may slow your processing down, though.
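
The following sketch shows the download approach for a single entity set. It assumes the ISO publishing set was saved locally as isopub.ent under a hypothetical /path/to/entities directory; adjust the file name and path to match what you actually downloaded.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
<!-- Hypothetical local path to a downloaded copy of one entity set -->
<!ENTITY % isopub SYSTEM "/path/to/entities/isopub.ent">
%isopub;
]>
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Entity test</title>
  <chapter>
    <title>Symbols</title>
    <!-- bull and check are among the declarations shown earlier -->
    <para>A bullet (&bull;) and a check mark (&check;).</para>
  </chapter>
</book>
```

To load the file over the Internet instead, the system identifier would hold the file's URL in place of the local path.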

A more convenient way to get these entity declarations for DocBook 5 is to install version 4.5 of the DocBook DTD, which includes the entities. Then you can reference a single file named dbcentx.mod that in turn references all the entity declarations. For example, if you install the DocBook 4.5 DTD in /usr/local/docbook, then your document can reference all the entity declarations this way:

<?xml version="1.0"?>
<!DOCTYPE book [
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">
<!ENTITY % dbcent PUBLIC "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
   "/usr/local/docbook/dbcentx.mod">
%dbcent;
]>
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
...

This is how the DocBook 4.5 DTD references the same entities. First the DOCTYPE declares two parameter entities to turn off the SGML features of the DTD and turn on the XML features (these are needed to access the entity declarations). Then the DOCTYPE declares a parameter system entity with PUBLIC and SYSTEM identifiers for the dbcentx.mod file. Then it references that parameter entity using %dbcent; to load all the entity declarations. All you need to do is change the system identifier to match the path to the dbcentx.mod module as it is installed on your system. Alternatively, use an XML catalog file to locate the module, as described in Chapter 5, XML catalogs.
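
For the catalog approach, an entry along these lines could resolve the public identifier to the installed module. This is a sketch that assumes the same /usr/local/docbook install location as the example above; whether the public or the system identifier takes precedence depends on your resolver's settings.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Resolve the public identifier from the DOCTYPE to the local copy -->
  <public
      publicId="-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
      uri="file:///usr/local/docbook/dbcentx.mod"/>
</catalog>
```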

To avoid putting all this syntax in each document, move it all to a separate entities declaration file as described in the section “Separate DocBook 5 entities file”. Then a single reference in each file's DOCTYPE gets everything.
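
As a sketch of that arrangement, a combined entities file such as myentities.ent could hold both the character entity setup and your own declarations. The install path and the custom entities are carried over from the earlier examples and are assumptions about your setup.

```xml
<!-- myentities.ent: shared declarations for all documents -->

<!-- Load the DocBook character entities from the 4.5 DTD install -->
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">
<!ENTITY % dbcent PUBLIC "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
   "/usr/local/docbook/dbcentx.mod">
%dbcent;

<!-- Your own entities -->
<!ENTITY company "Acme Widgets, Inc.">
<!ENTITY product "Top Widget">
<!ENTITY productversion "11.3">
```

Each document's DOCTYPE then needs only the parameter entity declaration for myentities.ent and the %myent; reference that follows it.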