use dblatex for producing the PDF doc. We could now go full XML

Jean-Francois Dockes 2013-03-30 17:14:40 +01:00
parent fc848f48ff
commit ed91113eab
3 changed files with 66 additions and 56 deletions


@@ -1,9 +1,9 @@
= Building the Recoll user manual = Building the Recoll user manual
The Recoll user manual usually used DocBook SGML and used the FreeBSD doc The Recoll user manual used to be written in DocBook SGML and used the
toolchain to produce the output formats. This had the advantage of an easy FreeBSD doc toolchain to produce the output formats. This had the advantage
way to produce all formats including a PDF manual, but presented two of an easy way to produce all formats including a PDF manual, but presented
problems: two problems:
- Dependency on the FreeBSD platform. - Dependency on the FreeBSD platform.
- No support for UTF-8 (last I looked), only latin1. - No support for UTF-8 (last I looked), only latin1.
@@ -17,21 +17,14 @@ made was to make the anchors explicitly upper-case because the SGML
toolchain converts them to upper-case and the XML one does not, so the only toolchain converts them to upper-case and the XML one does not, so the only
way to have compatibility is to make them upper-case in the first place. way to have compatibility is to make them upper-case in the first place.
We still have a problem for producing the PDF manual, because few We initially had a problem for producing the PDF manual, which motivated
straightforward approaches seem to exist: keeping the SGML version for producing the PDF with the FreeBSD SGML
toolchain. This problem is now solved with dblatex, so that the SGML
version now has little reason to persist and it will go away at some point
in the future.
- http://docbookpublishing.com which even has a programmatic version (cf: Asciidoc would also be a candidate as the source format, because it can
http://docbookpublishing.com/api/), but requires a bit of easily produce docbook, so the future will probably be:
configuration.
- FOP but this is Java and complicated.
See also http://www.valdyas.org/linguistics/printing_unicode.html asciidoc->docbook-xml-> html
Does not look simple, but dates from 2002 and seems to imply that FOP is -> pdf
making progress.
The current conclusion would seem to be that the SGML version should stay
operational to give an easy way to make the PDF one on FreeBSD.
But see also notes about dblatex on the asciidoc page. Actually asciidoc would
be a candidate replacement for the source format.
http://www.methods.co.nz/asciidoc/userguide.html


@@ -1741,7 +1741,8 @@ fvwm
<replaceable>*coll</replaceable>), the expansion can take quite <replaceable>*coll</replaceable>), the expansion can take quite
a long time because the full index term list will have to be a long time because the full index term list will have to be
processed. The expansion is currently limited at 10000 results for processed. The expansion is currently limited at 10000 results for
wildcards and regular expressions.</para> wildcards and regular expressions. It is possible to change the
limit in the configuration file.</para>
<para>Double-clicking on a term in the result list will insert <para>Double-clicking on a term in the result list will insert
it into the simple search entry field. You can also cut/paste it into the simple search entry field. You can also cut/paste
@@ -2504,7 +2505,7 @@ fvwm
<command>konqueror</command>.</para> <command>konqueror</command>.</para>
<para>This can be done by either explicitly inserting <para>This can be done by either explicitly inserting
<literal>&lt;a&nbsp;href="recoll:/..."&gt;</literal> links <literal><![CDATA[<a href="recoll://...">]]></literal> links
around some document areas, or automatically by adding a around some document areas, or automatically by adding a
very small <application>javascript</application> program to the very small <application>javascript</application> program to the
documents, like the following example, which would initiate a search by documents, like the following example, which would initiate a search by
@@ -3061,30 +3062,36 @@ dir:recoll dir:src -dir:utils -dir:common
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para>You should be aware of a few things before using <para>You should be aware of a few things when using
wildcards.</para> wildcards.</para>
<itemizedlist> <itemizedlist>
<listitem><para>Using a wildcard character at the beginning of <listitem><para>Using a wildcard character at the beginning of
a word can make for a slow search because &RCL; will have to a word can make for a slow search because &RCL; will have to
scan the whole index term list to find the matches.</para> scan the whole index term list to find the
</listitem> matches. However, this is much less a problem for field
<listitem><para>When working with a raw index (preserving searches, and queries
character case and diacritics), the literal part of a wildcard like <replaceable>author:*@domain.com</replaceable> can
expression will be matched exactly for case and sometimes be very useful.</para></listitem>
diacritics.</para>
</listitem> <listitem><para>For &RCL; version 18 only, when working with a
raw index (preserving character case and diacritics), the
literal part of a wildcard expression will be matched
exactly for case and diacritics. This is not true any
more for versions 19 and later.</para></listitem>
<listitem><para>Using a <literal>*</literal> at the end of a <listitem><para>Using a <literal>*</literal> at the end of a
word can produce more matches than you would think, and word can produce more matches than you would think, and
strange search results. You can use the <link strange search results. You can use the
linkend="RCL.SEARCH.GUI.TERMEXPLORER">term explorer</link> tool to <link linkend="RCL.SEARCH.GUI.TERMEXPLORER">term
check what completions exist for a given term. You can also explorer</link> tool to check what completions exist for
see exactly what search was performed by clicking on the link a given term. You can also see exactly what search was
at the top of the result list. In general, for natural performed by clicking on the link at the top of the result
language terms, stem expansion will produce better results list. In general, for natural language terms, stem
than an ending <literal>*</literal> (stem expansion is turned expansion will produce better results than an
off when any wildcard character appears in the term).</para> ending <literal>*</literal> (stem expansion is turned off
</listitem> when any wildcard character appears in the
term).</para></listitem>
</itemizedlist> </itemizedlist>
</sect2> <!-- wildchars --> </sect2> <!-- wildchars -->
@@ -4423,7 +4430,7 @@ except:
<ulink url="mailto:jfd@recoll.org">I would <ulink url="mailto:jfd@recoll.org">I would
very much welcome patches</ulink>.</para> very much welcome patches</ulink>.</para>
<para>Depending on the <application>Qt&nbsp;3</application> <para>Depending on the <application>Qt 3</application>
configuration on your system, you may have to set the configuration on your system, you may have to set the
<envar>QTDIR</envar> and <envar>QMAKESPECS</envar> <envar>QTDIR</envar> and <envar>QMAKESPECS</envar>
variables in your environment:</para> variables in your environment:</para>
@@ -4448,7 +4455,8 @@ except:
<para>Neither <envar>QTDIR</envar> nor <para>Neither <envar>QTDIR</envar> nor
<envar>QMAKESPECS</envar> should be needed with <envar>QMAKESPECS</envar> should be needed with
Qt&nbsp;4, configuration details are entirely determined by <application>Qt 4</application>,
configuration details are entirely determined by
<command>qmake</command> (which is quite often installed as <command>qmake</command> (which is quite often installed as
<command>qmake-qt4</command>).</para> <command>qmake-qt4</command>).</para>
@@ -4769,7 +4777,7 @@ skippedNames = #* bin CVS Cache cache* caughtspam tmp .thumbnails .svn \
<para>Example of use for skipping text files only in a <para>Example of use for skipping text files only in a
specific directory:</para> specific directory:</para>
<programlisting> <programlisting>
skippedPaths = ~/somedir/&lowast;.txt skippedPaths = ~/somedir/*.txt
</programlisting> </programlisting>
</listitem> </listitem>
</varlistentry> </varlistentry>


@@ -1,16 +1,21 @@
#!/bin/sh #!/bin/sh
# A script to produce the Recoll manual with an xml toolchain. # A script to produce the Recoll manual with an xml toolchain.
# Tools used:
# - xsltproc
# - The docbook-xsl stylesheets
# - dblatex for producing the PDF.
#
# Limitations: # Limitations:
# - Does not produce the links to the whole/chunked versions at the top # - Does not produce the links to the whole/chunked versions at the top
# of the document # of the document
# - The anchor names from the source text are converted to uppercase by # - The anchor names from the source text are converted to uppercase
# the sgml toolchain. This does not happen with the xml toolchain, # by the sgml toolchain. This does not happen with the xml
# which means that external links like # toolchain, which means that external links like
# usermanual.html#RCL.CONFIG.INDEXING won't work because fragments are # usermanual.html#RCL.CONFIG.INDEXING won't work because fragments
# case-sensitive. This could be solved by converting all ids inside the # are case-sensitive. This has been solved by converting all ids
# source file to upper-case. # inside the source file to upper-case. DON'T REINTRODUCE
# - No simple way to produce pdf # lower-case IDS
# Wherever docbook.xsl and chunk.xsl live # Wherever docbook.xsl and chunk.xsl live
# Fbsd # Fbsd
@@ -23,14 +28,15 @@ XSLDIR="/usr/share/xml/docbook/stylesheet/docbook-xsl/"
dochunky=1 dochunky=1
test $# -eq 1 && dochunky=0 test $# -eq 1 && dochunky=0
# Remove the SGML header and uncomment the XML one + convert from iso-8859-1 # Remove the SGML header and uncomment the XML one. Also used to iconv
# to utf-8 # from iso-8859-1 to UTF-8, but the SGML manual is now UTF-8 ? Would
# that work with the sgml toolchain ??
echo '<?xml version="1.0" encoding="UTF-8"?>' > usermanual.xml
sed -e '\!//FreeBSD//DTD!d' \ sed -e '\!//FreeBSD//DTD!d' \
-e '\!DTD DocBook XML!s/<!--//' \ -e '\!DTD DocBook XML!s/<!--//' \
-e '\!/docbookx.dtd!s/-->//' \ -e '\!/docbookx.dtd!s/-->//' \
< usermanual.sgml \ < usermanual.sgml \
| iconv -f iso-8859-1 -t utf-8 \ >> usermanual.xml
> usermanual.xml
# Options common to the single-file and chunked versions # Options common to the single-file and chunked versions
commonoptions="--stringparam section.autolabel 1 \ commonoptions="--stringparam section.autolabel 1 \
@@ -59,3 +65,6 @@ eval xsltproc $commonoptions \
tidy -indent usermanual-xml.html > tmpfile tidy -indent usermanual-xml.html > tmpfile
mv -f tmpfile usermanual-xml.html mv -f tmpfile usermanual-xml.html
# And the pdf with dblatex
dblatex usermanual.xml