This is a recipe for a simple site that holds some docbook XML files in a container and presents them as an index of document titles. When a title is clicked, Norm Walsh's docbook stylesheets and FOP are used to generate a PDF on the fly. The recipe also demonstrates a poor man's full-text search using aggressive XPath indexing into RDF.
This recipe requires a release or CVS snapshot of 4Suite more recent than 29 October 2002. You will also need a JDK; I used 1.4 on Linux and Windows.
First of all, download the minor ingredients (dbpdf-0.1.tar.gz). This package contains all the files you need except for the meat and potatoes: Norm Walsh's docbook stylesheets and FOP. I used docbook-xsl 1.52.2 and FOP 0.20.4.
Next, make sure that you have the 4Suite repository set up. The best way is to follow the directions in the Quick start, including setting up a scratch server as suggested there. Unpack dbpdf-0.1.tar.gz, change to the created directory, and unpack the docbook XSL package there. Rename the docbook XSL directory (for example "docbook-xsl-1.52.2") to "docbook-xsl".
Make sure FOP is installed, and check the top-level parameters in view-as-pdf.xslt to be sure the paths are correct for running FOP from the script.
Now make sure the repository is running and run the following commands to install all the scripts and data needed (including parts of the docbook XSL package):
4ss install setup1.xml
4ss install setup2.xml
4ss install setup3.xml
Unfortunately, 4Suite currently has problems with dependency tracking, so for now I had to split the setup across several separate setup files.
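If you would rather script the three installs than type them, the following is a minimal Python sketch. It assumes the files should go in in numeric order (the numbering suggests this, but it is an assumption) and that 4ss returns a non-zero exit status on failure. The function names are made up for illustration, and the sketch does not handle any credential prompts 4ss may issue.

```python
import subprocess

# The three setup files named in the recipe, in the order they are listed.
SETUP_FILES = ["setup1.xml", "setup2.xml", "setup3.xml"]


def install_commands(setup_files=SETUP_FILES):
    """Build one '4ss install <file>' command per setup file, keeping order."""
    return [["4ss", "install", name] for name in setup_files]


def run_installs(runner=subprocess.run, setup_files=SETUP_FILES):
    """Run each install in turn and stop at the first failure.

    `runner` is injectable so the sequencing can be tried without a running
    repository; by default it is subprocess.run.
    Returns the list of commands that completed successfully.
    """
    done = []
    for cmd in install_commands(setup_files):
        result = runner(cmd)
        if result.returncode != 0:
            # Assumption: a later setup file depends on the earlier ones,
            # so there is no point in continuing after a failure.
            break
        done.append(cmd)
    return done
```

Calling run_installs() with no arguments runs the real commands, so the repository must be running first.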
Now, assuming your scratch server is running on port 8080, you can try out the meal on your server:
http://localhost:8080/dbpdf/?xslt
Some quick things to notice (I hope to fill out this section soon with more juicy detail):
- title-index.dd demonstrates a very rough approach to indexing. It indexes all title and para elements in the docbook documents as RDF statements using the http://uche.ogbuji.net/etc/021027/content RDF predicate.
- search-handler.xslt demonstrates the use of this index to drive a search engine over the docbook documents in the demo.
- listing.xslt is a handy general template for XSLT scripts that operate on a container listing. Just import this script and define your own "handle-item" template.
- I could have generated the document index by RDF query rather than plumbing one container's metadata. The following XSLT snippet is an illustration:
<xsl:variable name="items" f:node-set="yes">
  <frdf:versa-query query="distribute(type(d:ruledoc), '.', '.-dc:title->*')"/>
</xsl:variable>
<xsl:for-each select='$items/List/List'>
  <LI>
    <a href="{*[1]}?xslt=/sos/view-as-pdf.xslt"><xsl:value-of select="*[2]"/></a>
  </LI>
</xsl:for-each>
...
- If you want to add documents beyond the basic three provided, you can simply drop in the docbook files and run "4ss install setup3.xml" again. Note, however, that this demo doesn't support proper docbook files, because it does not install the Docbook XML DTD into the repository. I omitted this as a simplification, although I have demonstrated a system based on the full Docbook DTDs without much difficulty.
- view-as-pdf.xslt imports Docbook XSL and uses xsl:apply-imports to have that code process the source document. It uses exslt:document to write the resulting formatting objects document to disk. exslt:document requires a proper URI, not just a regular OS path; f:ospath2uri and f:uri2ospath are useful extensions for going back and forth between these forms. Finally, it launches a sub-shell (f:system) to run FOP on the generated FO file, then reads back the generated PDF as a big string and feeds it to the browser.
- view-as-pdf.xslt reads the PDF file back from the file system as mock binary (ISO-8859-1), and then sends it as is to the browser by using ISO-8859-1 output encoding and the standard PDF media type. This is a naughty hack, as the PDF file could contain characters that are illegal in XML but nevertheless pass through the XSLT engine as is. Then again, one could argue this shouldn't bother anyone but strict XML lawyers.
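To make the indexing idea behind title-index.dd and search-handler.xslt a bit more concrete, here is a rough Python sketch of the same approach outside 4Suite. It is not how the document definition itself is written; it only imitates the effect. Every title and para element becomes a statement whose predicate is the content URI mentioned above, and search is a plain substring match over those statements. The sample document, its path /dbpdf/roast.xml, and the function names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Predicate URI quoted in the recipe; everything else here is illustrative.
CONTENT_PREDICATE = "http://uche.ogbuji.net/etc/021027/content"

# Hypothetical miniature docbook document (no DTD, as in the demo).
SAMPLE = (
    "<article><title>Pot Roast</title>"
    "<para>Brown the <emphasis>meat</emphasis> first.</para></article>"
)


def index_document(doc_uri, xml_text):
    """Return (subject, predicate, object) triples for each title and para."""
    triples = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag in ("title", "para"):
            # Flatten inline markup and normalise whitespace.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                triples.append((doc_uri, CONTENT_PREDICATE, text))
    return triples


def search(triples, term):
    """Return the sorted URIs of documents whose indexed text contains term."""
    needle = term.lower()
    return sorted({s for s, p, o in triples
                   if p == CONTENT_PREDICATE and needle in o.lower()})


if __name__ == "__main__":
    index = index_document("/dbpdf/roast.xml", SAMPLE)
    print(search(index, "meat"))
```

Indexing whole paragraphs as literal objects is what makes the approach "aggressive": it is simple and needs no separate search engine, at the cost of a large number of statements per document.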
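The mock binary trick works because ISO-8859-1 maps each of the 256 byte values to exactly one character, so decoding and re-encoding is lossless. The Python sketch below shows that round trip and counts the characters that XML 1.0 would reject (control characters other than tab, line feed, and carriage return). It illustrates the principle only; it is not the code in view-as-pdf.xslt, and the function names are made up.

```python
import re

# Characters in the 0-255 range that XML 1.0 does not allow in a document:
# the C0 controls other than tab (0x09), LF (0x0A), and CR (0x0D).
XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def bytes_to_mock_text(data: bytes) -> str:
    """Treat raw bytes as ISO-8859-1 text, one character per byte."""
    return data.decode("iso-8859-1")


def mock_text_to_bytes(text: str) -> bytes:
    """Write the text back out as ISO-8859-1, recovering the original bytes."""
    return text.encode("iso-8859-1")


def count_xml_illegal(text: str) -> int:
    """Count characters a strict XML 1.0 processor would refuse."""
    return len(XML_ILLEGAL.findall(text))


if __name__ == "__main__":
    # Hypothetical stand-in for the start of a PDF file.
    pdf_like = b"%PDF-1.3\n\x00\x01\xff binary payload"
    text = bytes_to_mock_text(pdf_like)
    assert mock_text_to_bytes(text) == pdf_like
    print(count_xml_illegal(text), "characters an XML parser would reject")
```

The round trip is safe as long as nothing in between validates or normalises the string; a processor that checked the characters strictly would be entitled to fail on such input.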
If anyone has more cross-platform tips, or tips with using other PDF rendering tools in such fashion, please let me know.
