doc:added multithreading section

2016-05-26 10:19:46 +02:00
commit a1a2bbf952
parent dadf10d0ea
3 changed files with 732 additions and 366 deletions
--- a/packaging/debian/buildppa.sh
+++ b/packaging/debian/buildppa.sh
@@ -19,7 +19,7 @@ case $RCLVERS in
    1.14*) PPANAME=recoll-ppa;;
    *)     PPANAME=recoll15-ppa;;
 esac
-PPANAME=recollexp-ppa
+#PPANAME=recollexp-ppa
 echo "PPA: $PPANAME. Type CR if Ok, else ^C"
 read rep
@@ -42,7 +42,7 @@ check_recoll_orig()
 debdir=debian
 # Note: no new releases for lucid: no webkit. Or use old debianrclqt4 dir.
 series="precise trusty utopic vivid wily xenial"
-series=trusty
+series=
 if test "X$series" != X ; then
    check_recoll_orig
@@ -141,7 +141,7 @@ done
 ### Unity Scope
 series="trusty utopic vivid wily xenial"
-series=
+series=xenial
 debdir=debianunityscope
 if test ! -d ${debdir}/ ; then
--- a/src/doc/user/usermanual.xml
+++ b/src/doc/user/usermanual.xml
@@ -800,6 +800,103 @@ indexedmimetypes = application/pdf
      </sect2>
      <sect2 id="RCL.INDEXING.CONFIG.THREADS">
        <title>Indexing threads configuration</title>
        <para>The &RCL; indexing process 
          <command>recollindex</command> can use multiple threads to
          speed up indexing on multiprocessor systems. The work done
          to index files is divided into several stages, and some of the
          stages can be executed by multiple threads. The stages are:
          <orderedlist>
            <listitem>File system walking: this is always performed by
              the main thread.</listitem>
            <listitem>File conversion and data extraction.</listitem>
            <listitem>Text processing (splitting, stemming,
            etc.)</listitem>
            <listitem>&XAP; index update.</listitem>
          </orderedlist>
        </para>
        <para>You can also read a 
          <ulink url="http://www.recoll.org/idxthreads/threadingRecoll.html">
            longer document</ulink> about the transformation of
          &RCL; indexing to multithreading.</para>
        <para>The threads configuration is controlled by two
          configuration file parameters.</para>
 	 <variablelist>
          <varlistentry><term><varname>thrQSizes</varname></term>
            <listitem><para>This variable defines the configuration of the
                job input queues. There are three possible queues, for stages
                2, 3 and 4, and this parameter should give the queue depth
                for each stage (three integer values). If a value of -1 is
                used for a given stage, no queue is used, and the thread
                will go on to perform the next stage itself. In practice,
                deep queues have not been shown to increase performance. A
                value of 0 for the first queue tells &RCL; to perform
                autoconfiguration (nothing else needs to be set in this
                case, and <varname>thrTCounts</varname> is not used). This
                is the default configuration.</para>
            </listitem>
          </varlistentry>
          <varlistentry><term><varname>thrTCounts</varname></term>
            <listitem><para>This defines the number of threads used
                for each stage. If a value of -1 is used for one of
                the queue depths, the corresponding thread count is
                ignored. It makes no sense to use a value other than 1
                for the last stage because updating the &XAP; index is
                necessarily single-threaded (and protected by a
                mutex).</para>
            </listitem>
          </varlistentry>
         </variablelist>
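         <para>As a minimal sketch of the autoconfiguration case described
         for <varname>thrQSizes</varname>, the fragment below sets the first
         queue depth to 0. Writing all three values is an assumption made
         here for consistency with the other settings; according to the
         description, only the first value matters in this case and
         <varname>thrTCounts</varname> can be left unset.
 <programlisting>
 thrQSizes = 0 0 0
 </programlisting>
         </para>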
         <para>The following example would use three queues (each of depth 2),
         with 4 threads for converting source documents, 2 for
         processing their text, and one for updating the index. This
         proved to be the best configuration on the test system
         (a quad-processor machine with multiple disks).
 <programlisting>
 thrQSizes = 2 2 2
 thrTCounts =  4 2 1
 </programlisting>
         </para>
         <para>The following example would use a single queue, and the
           complete processing for each document would be performed by
           a single thread (several documents will still be processed
           in parallel in most cases). The threads will use mutual
           exclusion when entering the index update stage. In practice,
           the performance would generally be close to that of the
           previous case, but worse in certain cases (e.g. a Zip archive
           would be processed purely sequentially), so the previous
           approach is preferred. Your mileage may vary. The last two
           values for <varname>thrTCounts</varname> are ignored.
 <programlisting>
 thrQSizes = 2 -1 -1
 thrTCounts =  6 1 1
 </programlisting>
         </para>
         <para>The following example would disable
           multithreading. Indexing will be performed by a single
           thread.
 <programlisting>
 thrQSizes = -1 -1 -1
 </programlisting>
         </para>
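         <para>As a further illustration, which is derived from the rules
         stated for the two parameters and has not been benchmarked, the
         following values should give queues for stages 2 and 3 only. The
         two text processing threads would then go on to update the index
         themselves, under mutual exclusion, and the last value of
         <varname>thrTCounts</varname> would be ignored.
 <programlisting>
 thrQSizes = 2 2 -1
 thrTCounts =  4 2 1
 </programlisting>
         </para>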
         </sect2>
      <sect2 id="RCL.INDEXING.CONFIG.GUI">
        <title>The index configuration GUI</title>
--- a/src/sampleconf/recoll.conf
+++ b/src/sampleconf/recoll.conf