Chapter 6. DocBook publishing with OpenOffice

A very easy way to work with DocBook is to use OpenOffice as the editor. OpenOffice uses XML as its native file format, so converting OpenOffice Writer files to DocBook XML is an obvious idea, and it is in fact relatively easy. The following steps are necessary:

To make clear what exactly is going on, the procedure is described briefly here.

OpenOffice stores its documents in files with the extension sxw. These files are actually ZIP archives[4] containing several XML files. The individual files can be extracted and rearranged very easily.
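As a minimal sketch of the archive step, the following Python lists and reads the XML members of such a ZIP-based document using only the standard library. The demo archive is built in memory purely for illustration. The helper names are assumptions and are not taken from Eric Bellot's module.

```python
import io
import zipfile


def list_xml_members(archive):
    """Return the names of all XML files stored in a ZIP-based document."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(".xml"))


def read_member(archive, name):
    """Return the raw bytes of one archive member, e.g. 'content.xml'."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)


def build_demo_archive():
    """Build a tiny in-memory stand-in for a Writer file (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("content.xml", "<content/>")
        zf.writestr("styles.xml", "<styles/>")
        zf.writestr("meta.xml", "<meta/>")
    buffer.seek(0)
    return buffer


demo = build_demo_archive()
members = list_xml_members(demo)
content = read_member(demo, "content.xml")
```

With a real document, the path to the Writer file would be passed instead of the in-memory buffer.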

So that an XSLT transformation can be applied at the end, the individual files are merged into one single large XML file, in which the different parts remain separated by XML namespaces.
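The merging idea can be sketched as follows: several XML documents are placed under one wrapper element, and each part keeps its own namespace so that a stylesheet can still tell the parts apart. All namespace URIs and element names here are hypothetical placeholders, not the ones used by OpenOffice or by the framework.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the wrapper element; not taken from the framework.
WRAPPER_NS = "urn:example:merged-document"


def merge_parts(parts):
    """Wrap several XML documents (given as strings) in one root element.

    Each part keeps its own namespace, so an XSLT stylesheet can still
    distinguish the parts after merging.
    """
    root = ET.Element("{%s}document" % WRAPPER_NS)
    for xml_text in parts:
        root.append(ET.fromstring(xml_text))
    return root


# Two stand-in parts with made-up namespaces.
content = '<c:content xmlns:c="urn:example:content"><c:p>Hello</c:p></c:content>'
styles = '<s:styles xmlns:s="urn:example:styles"><s:style name="filename"/></s:styles>'

merged = merge_parts([content, styles])
```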

OpenOffice embeds pictures directly in the XML file, where they are stored as base64. Unfortunately, the usual DocBook stylesheets cannot use these embedded pictures; they expect them as external files.
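A rough sketch of how embedded base64 pictures could be turned into external files is shown below. The element name `binary-data`, the `href` attribute, the file naming scheme and the `.png` extension are all assumptions made for this illustration. The real module may proceed differently.

```python
import base64
import os
import tempfile
import xml.etree.ElementTree as ET


def extract_images(root, out_dir, tag="binary-data", extension=".png"):
    """Write every base64 payload found under `root` to an external file.

    The payload text is removed from the element and replaced by an
    `href` attribute pointing at the written file (hypothetical convention).
    """
    written = []
    for index, element in enumerate(root.iter(tag), start=1):
        data = base64.b64decode(element.text or "")
        path = os.path.join(out_dir, "image%03d%s" % (index, extension))
        with open(path, "wb") as handle:
            handle.write(data)
        element.text = None
        element.set("href", path)
        written.append(path)
    return written


# Stand-in document with one fake embedded picture.
payload = base64.b64encode(b"fake image bytes").decode("ascii")
doc = ET.fromstring(
    "<doc><image><binary-data>%s</binary-data></image></doc>" % payload
)

out_dir = tempfile.mkdtemp()
paths = extract_images(doc, out_dir)
```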

All these steps are handled by a small Python module by Eric Bellot. It can be downloaded here and is also included in the framework.

The subsequent transformation is carried out with a special stylesheet, using an arbitrary XSLT processor (here Saxon). Eric Bellot's stylesheet was improved for this framework, and there are several variants, depending on whether one wants to write a "book" or an "article".

6.1. Interaction between OpenOffice and DocBook

OpenOffice uses paragraph formats, character formats and frame formats for formatting documents. These are named and linked to certain layout rules. The connection between DocBook and OpenOffice is made exactly through these predefined formats. If, for instance, a sentence contains a filename, it is formatted in OpenOffice with the character format "filename" and later converted into <filename>...</filename>.
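The principle of mapping named character formats to DocBook elements can be illustrated with a small Python sketch. Only the "filename" entry comes from the text above. The other entries and the function name are assumptions for illustration. In the framework itself, this mapping is performed by the XSLT stylesheet.

```python
from xml.sax.saxutils import escape

# Illustrative mapping from character format name to DocBook inline element.
# Only "filename" is taken from the text; the other entries are assumptions.
CHARACTER_FORMATS = {
    "filename": "filename",
    "command": "command",
    "emphasis": "emphasis",
}


def to_docbook_inline(runs):
    """Convert (text, character_format) pairs into DocBook inline markup."""
    pieces = []
    for text, style in runs:
        tag = CHARACTER_FORMATS.get(style)
        safe = escape(text)  # protect &, < and > in the running text
        pieces.append("<%s>%s</%s>" % (tag, safe, tag) if tag else safe)
    return "".join(pieces)


runs = [("Edit ", None), ("/etc/hosts", "filename"), (" first.", None)]
result = to_docbook_inline(runs)
```

Text without a known character format is passed through unchanged, which mirrors the idea that only explicitly formatted spans become DocBook tags.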

The same applies to the paragraph formats for bulleted or numbered lists: one can use them in OpenOffice as usual, and they are converted into the corresponding DocBook tags automatically. To make sure this always works correctly, one should use a document template or an empty standard document. Such a document is included in the framework.

Most of the corresponding formats and tags are intuitive; a more detailed description is available on Eric Bellot's website.

Footnotes

[4] Such a file can be opened simply with WinZip, for example.