use dblatex for producing the PDF doc. We could now go full XML

Jean-Francois Dockes 2013-03-30 17:14:40 +01:00
parent fc848f48ff
commit ed91113eab
3 changed files with 66 additions and 56 deletions


@ -1,9 +1,9 @@
= Building the Recoll user manual
The Recoll user manual usually used DocBook SGML and used the FreeBSD doc
toolchain to produce the output formats. This had the advantage of an easy
way to produce all formats including a PDF manual, but presented two
problems:
The Recoll user manual used to be written in DocBook SGML and used the
FreeBSD doc toolchain to produce the output formats. This had the advantage
of an easy way to produce all formats including a PDF manual, but presented
two problems:
- Dependency on the FreeBSD platform.
- No support for UTF-8 (last I looked), only latin1.
@ -17,21 +17,14 @@ made was to make the anchors explicitly upper-case because the SGML
toolchain converts them to upper-case and the XML one does not, so the only
way to have compatibility is to make them upper-case in the first place.
We still have a problem for producing the PDF manual, because few
straightforward approaches seem to exist:
We initially had a problem for producing the PDF manual, which motivated
keeping the SGML version for producing the PDF with the FreeBSD SGML
toolchain. This problem is now solved with dblatex, so that the SGML
version now has little reason to persist and it will go away at some point
in the future.
- http://docbookpublishing.com which even has a programmatic version (cf:
http://docbookpublishing.com/api/), but requires some
configuration.
- FOP but this is Java and complicated.
Asciidoc would also be a candidate as the source format, because it can
easily produce docbook, so the future will probably be:
See also http://www.valdyas.org/linguistics/printing_unicode.html
Does not look simple, but dates from 2002 and seems to imply that FOP is
making progress.
The current conclusion would seem to be that the SGML version should stay
operational to give an easy way to make the PDF one on FreeBSD.
But see also notes about dblatex on the asciidoc page. Actually asciidoc would
be a candidate replacement for the source format.
http://www.methods.co.nz/asciidoc/userguide.html
asciidoc->docbook-xml-> html
-> pdf


@ -1741,7 +1741,8 @@ fvwm
<replaceable>*coll</replaceable>), the expansion can take quite
a long time because the full index term list will have to be
processed. The expansion is currently limited at 10000 results for
wildcards and regular expressions.</para>
wildcards and regular expressions. It is possible to change the
limit in the configuration file.</para>
<para>Double-clicking on a term in the result list will insert
it into the simple search entry field. You can also cut/paste
@ -2504,7 +2505,7 @@ fvwm
<command>konqueror</command>.</para>
<para>This can be done by either explicitly inserting
<literal>&lt;a&nbsp;href="recoll:/..."&gt;</literal> links
<literal><![CDATA[<a href="recoll://...">]]></literal> links
around some document areas, or automatically by adding a
very small <application>javascript</application> program to the
documents, like the following example, which would initiate a search by
@ -3061,30 +3062,36 @@ dir:recoll dir:src -dir:utils -dir:common
</listitem>
</itemizedlist>
<para>You should be aware of a few things before using
<para>You should be aware of a few things when using
wildcards.</para>
<itemizedlist>
<listitem><para>Using a wildcard character at the beginning of
a word can make for a slow search because &RCL; will have to
scan the whole index term list to find the matches.</para>
</listitem>
<listitem><para>When working with a raw index (preserving
character case and diacritics), the literal part of a wildcard
expression will be matched exactly for case and
diacritics.</para>
</listitem>
a word can make for a slow search because &RCL; will have to
scan the whole index term list to find the
matches. However, this is much less of a problem for field
searches, and queries
like <replaceable>author:*@domain.com</replaceable> can
sometimes be very useful.</para></listitem>
<listitem><para>For &RCL; version 18 only, when working with a
raw index (preserving character case and diacritics), the
literal part of a wildcard expression will be matched
exactly for case and diacritics. This is not true any
more for versions 19 and later.</para></listitem>
<listitem><para>Using a <literal>*</literal> at the end of a
word can produce more matches than you would think, and
strange search results. You can use the <link
linkend="RCL.SEARCH.GUI.TERMEXPLORER">term explorer</link> tool to
check what completions exist for a given term. You can also
see exactly what search was performed by clicking on the link
at the top of the result list. In general, for natural
language terms, stem expansion will produce better results
than an ending <literal>*</literal> (stem expansion is turned
off when any wildcard character appears in the term).</para>
</listitem>
word can produce more matches than you would think, and
strange search results. You can use the
<link linkend="RCL.SEARCH.GUI.TERMEXPLORER">term
explorer</link> tool to check what completions exist for
a given term. You can also see exactly what search was
performed by clicking on the link at the top of the result
list. In general, for natural language terms, stem
expansion will produce better results than an
ending <literal>*</literal> (stem expansion is turned off
when any wildcard character appears in the
term).</para></listitem>
</itemizedlist>
</sect2> <!-- wildchars -->
@ -4423,7 +4430,7 @@ except:
<ulink url="mailto:jfd@recoll.org">I would
very much welcome patches</ulink>.</para>
<para>Depending on the <application>Qt&nbsp;3</application>
<para>Depending on the <application>Qt 3</application>
configuration on your system, you may have to set the
<envar>QTDIR</envar> and <envar>QMAKESPECS</envar>
variables in your environment:</para>
@ -4448,7 +4455,8 @@ except:
<para>Neither <envar>QTDIR</envar> nor
<envar>QMAKESPECS</envar> should be needed with
Qt&nbsp;4, configuration details are entirely determined by
<application>Qt 4</application>,
configuration details are entirely determined by
<command>qmake</command> (which is quite often installed as
<command>qmake-qt4</command>).</para>
@ -4769,7 +4777,7 @@ skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
<para>Example of use for skipping text files only in a
specific directory:</para>
<programlisting>
skippedPaths = ~/somedir/&lowast;.txt
skippedPaths = ~/somedir/*.txt
</programlisting>
</listitem>
</varlistentry>


@ -1,16 +1,21 @@
#!/bin/sh
# A script to produce the Recoll manual with an xml toolchain.
# Tools used:
# - xsltproc
# - The docbook-xsl stylesheets
# - dblatex for producing the PDF.
#
# Limitations:
# - Does not produce the links to the whole/chunked versions at the top
# of the document
# - The anchor names from the source text are converted to uppercase by
# the sgml toolchain. This does not happen with the xml toolchain,
# which means that external links like
# usermanual.html#RCL.CONFIG.INDEXING won't work because fragments are
# case-sensitive. This could be solved by converting all ids inside the
# source file to upper-case.
# - No simple way to produce pdf
# - The anchor names from the source text are converted to uppercase
# by the sgml toolchain. This does not happen with the xml
# toolchain, which means that external links like
# usermanual.html#RCL.CONFIG.INDEXING won't work because fragments
# are case-sensitive. This has been solved by converting all ids
# inside the source file to upper-case. DON'T REINTRODUCE
# lower-case IDS
# Wherever docbook.xsl and chunk.xsl live
# Fbsd
@ -23,14 +28,15 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
dochunky=1
test $# -eq 1 && dochunky=0
# Remove the SGML header and uncomment the XML one + convert from iso-8859-1
# to utf-8
# Remove the SGML header and uncomment the XML one. Also used to iconv
# from iso-8859-1 to UTF-8, but the SGML manual is now UTF-8 ? Would
# that work with the sgml toolchain ??
echo '<?xml version="1.0" encoding="UTF-8"?>' > usermanual.xml
sed -e '\!//FreeBSD//DTD!d' \
-e '\!DTD DocBook XML!s/<!--//' \
-e '\!/docbookx.dtd!s/-->//' \
< usermanual.sgml \
| iconv -f iso-8859-1 -t utf-8 \
> usermanual.xml
>> usermanual.xml
# Options common to the single-file and chunked versions
commonoptions="--stringparam section.autolabel 1 \
@ -59,3 +65,6 @@ eval xsltproc $commonoptions \
tidy -indent usermanual-xml.html > tmpfile
mv -f tmpfile usermanual-xml.html
# And the pdf with dblatex
dblatex usermanual.xml
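
The header-rewriting step of the script above can be tried in isolation on a small sample. The following is a minimal sketch, not part of the commit: the function name `sgml_to_xml_header` and the sample DOCTYPE lines are assumptions made for illustration, and the real header in usermanual.sgml may differ. Only the three sed expressions and the XML declaration are taken from the script.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): emit the XML declaration, then
# filter stdin with the same three sed expressions the script uses.
sgml_to_xml_header() {
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    # 1. delete the FreeBSD SGML DOCTYPE line
    # 2. strip the comment opener in front of the XML DOCTYPE
    # 3. strip the comment closer after the docbookx.dtd reference
    sed -e '\!//FreeBSD//DTD!d' \
        -e '\!DTD DocBook XML!s/<!--//' \
        -e '\!/docbookx.dtd!s/-->//'
}

# Assumed sample input: an active SGML DOCTYPE followed by a commented-out
# XML DOCTYPE. The actual manual header is not shown in the diff.
sample='<!DOCTYPE book PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN">
<!-- <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"> -->
<book>'

result=$(printf '%s\n' "$sample" | sgml_to_xml_header)
printf '%s\n' "$result"
```

With this sample, the FreeBSD DOCTYPE line is dropped and the XML DOCTYPE is left uncommented, which is the state xsltproc and dblatex need to see.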