doc

2013-07-09 15:53:29 +02:00 · 2013-07-09 15:53:29 +02:00 · a41a72560b
commit a41a72560b
parent e2f8db63cf
1 changed files with 58 additions and 22 deletions
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@ -686,6 +686,12 @@ recoll
          still young, so that a certain amount of weirdness cannot be
          excluded.</para> 

+        <para>One of the most adverse consequences of using a raw index
+          is that some phrase and proximity searches may become
+          impossible: because each term needs to be expanded and all
+          combinations searched for, the multiplicative expansion may
+          become unmanageable.</para>
+
      </sect2>
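To make the multiplicative expansion mentioned in this hunk concrete, here is a minimal sketch in plain Python. It does not use the Recoll API: the `expansions` table and both helper functions are hypothetical, and it only assumes that each user term expands to a handful of case and diacritics variants, all of which must be combined for a phrase search.

```python
from itertools import product
from math import prod

# Hypothetical expansion table: each user term maps to the index terms
# it could match in a raw (case- and diacritics-sensitive) index.
expansions = {
    "resume": ["resume", "Resume", "résumé", "Résumé"],
    "cafe": ["cafe", "Cafe", "café", "Café"],
    "noel": ["noel", "Noel", "noël", "Noël"],
}


def phrase_variants(phrase_terms, expansions):
    """Return every ordered combination of expanded terms for a phrase."""
    choices = [expansions.get(term, [term]) for term in phrase_terms]
    return list(product(*choices))


def variant_count(phrase_terms, expansions):
    """Count the combinations without materializing them."""
    return prod(len(expansions.get(term, [term])) for term in phrase_terms)


phrase = ["resume", "cafe", "noel"]
variants = phrase_variants(phrase, expansions)
print(len(variants), "phrase variants for", len(phrase), "terms")
```

The count is the product of the per-term expansion sizes, so it grows quickly with phrase length and with the number of variants per term.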


@ -3773,7 +3779,9 @@ or
        <title>Introduction</title>

        <para>&RCL; versions after 1.11 define a Python programming
-          interface, both for searching and indexing.</para> 
+          interface, both for searching and indexing. The indexing
+          portion has seen little use, but the search portion is used
+          by the Recoll Ubuntu Unity Lens and the Recoll Web UI.</para> 

        <para>The API is inspired by the Python database API
          specification, version 1.0 for &RCL; versions up to 1.18,
@ -3797,6 +3805,13 @@ or
          </screen>
        </para> 

+        <para>The normal &RCL; installer installs the Python
+          API along with the main code.</para>
+
+        <para>When installing from a repository, and depending on the
+          distribution, the Python API can sometimes be found in a
+          separate package.</para>
+
      </sect3>

      <sect3 id="RCL.PROGRAM.PYTHON.PACKAGE">
@ -3872,8 +3887,13 @@ or
            </varlistentry>

            <varlistentry>
-              <term>Db.setAbstractParams(maxchars, contextwords)</term>
-              <listitem>Set the parameters used to build snippets.</listitem>
+              <term>Db.setAbstractParams(maxchars,
+              contextwords)</term> <listitem>Set the parameters used
+              to build snippets (text fragments showing the matched
+              keywords in context). <literal>maxchars</literal> defines the
+                maximum total size of the abstract. 
+                <literal>contextwords</literal> defines how many
+                terms are shown around each keyword.</listitem>
            </varlistentry>

          </variablelist>
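The roles of maxchars and contextwords can be illustrated with a small keyword-in-context sketch. This is not how Recoll builds its abstracts: `build_abstract` is a hypothetical helper, and the whitespace tokenization and the way the size budget is enforced are assumptions made for illustration only.

```python
def build_abstract(text, keywords, maxchars=60, contextwords=2):
    """Collect keyword-in-context fragments, up to maxchars in total.

    contextwords is the number of words kept on each side of a keyword.
    """
    words = text.split()
    wanted = {keyword.lower() for keyword in keywords}
    snippets = []
    total = 0
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring trailing punctuation.
        if word.lower().strip(".,;:!?") not in wanted:
            continue
        start = max(0, i - contextwords)
        snippet = " ".join(words[start:i + contextwords + 1])
        if total + len(snippet) > maxchars:
            break  # the size budget for the whole abstract is used up
        snippets.append(snippet)
        total += len(snippet)
    return snippets


sample = ("the quick brown fox jumps over the lazy dog "
          "while another fox sleeps")
for fragment in build_abstract(sample, ["fox"], maxchars=100, contextwords=1):
    print(fragment)
```

A larger contextwords value gives wider fragments, so fewer of them fit within the same maxchars budget.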
@ -3932,7 +3952,7 @@ or

            <varlistentry>
              <term>Query.close()</term>
-              <listitem>Closes the connection. The object is unusable
+              <listitem>Closes the query. The object is unusable
              after the call.</listitem>
            </varlistentry>

@ -3947,12 +3967,12 @@ or
            <varlistentry>
              <term>Query.getgroups()</term>
              <listitem>Retrieves the expanded query terms as a list
-              of pairs. Meaningful only after executexx
-                In each pair, the first entry is a list of user terms,
-                the second a list of query terms as derived from the
-                user terms and used in the Xapian Query. The size of
-                each list is one for simple terms, or more for group
-                and phrase clauses.</listitem>
+                of pairs. Meaningful only after executexx. In each
+                pair, the first entry is a list of user terms (of size
+                one for simple terms, or more for group and phrase
+                clauses), the second a list of query terms as derived
+                from the user terms and used in the Xapian
+                Query.</listitem>
            </varlistentry>
            
            <varlistentry>
@ -4002,7 +4022,9 @@ or
            <varlistentry><term>Query.rownumber</term><listitem>Next index
                to be fetched from results. Normally increments after
                each fetchone() call, but can be set/reset before the
-                call effect seeking. Starts at 0.</listitem>
+                call to effect seeking (equivalent to
+                using <literal>scroll()</literal>). Starts at
+                0.</listitem> 
            </varlistentry>

          </variablelist>
@ -4089,13 +4111,15 @@ or
      <sect3 id="RCL.PROGRAM.PYTHON.RCLEXTRACT">
        <title>The rclextract module</title>

-        <para>Document content is not provided by an index query. To
-        access it, the data extraction part of the indexing process
-        must be performed (subdocument access and format
-        translation). This is not trivial in
-        general. The <literal>rclextract</literal> module currently
-        provides a single class which can be used to access the data
-        content for result documents.</para>
+        <para>Index queries do not provide document content (only a
+          partial and imprecise reconstruction is performed to show the
+          snippet text). In order to access the actual document data, 
+          the data extraction part of the indexing process
+          must be performed (subdocument access and format
+          translation). This is not trivial in
+          general. The <literal>rclextract</literal> module currently
+          provides a single class which can be used to access the data
+          content for result documents.</para>

        <sect4 id="RCL.PROGRAM.PYTHON.RCLEXTRACT.CLASSES">
          <title>Classes</title>
@ -4118,13 +4142,25 @@ or
                by <replaceable>ipath</replaceable> and return
                a <literal>Doc</literal> object. The doc.text field
                has the document text as either text/plain or
-                text/html according to doc.mimetype.</listitem>
+                text/html according to doc.mimetype. The typical use
+                would be as follows:
+                  <programlisting>
+qdoc = query.fetchone()
+extractor = rclextract.Extractor(qdoc)
+doc = extractor.textextract(qdoc.ipath)</programlisting>
+                </listitem>
              </varlistentry>
              <varlistentry>
-                <term>Extractor.idoctofile()</term>
+                <term>Extractor.idoctofile(ipath, targetmtype, outfile='')</term>
                <listitem>Extracts document into an output file,
-                which can be given explicitly or will be created as a
-                temporary file to be deleted by the caller.</listitem>
+                  which can be given explicitly or will be created as a
+                  temporary file to be deleted by the caller. Typical use:
+                  <programlisting>
+qdoc = query.fetchone()
+extractor = rclextract.Extractor(qdoc)
+filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)</programlisting>
+
+                </listitem>
              </varlistentry>

          </variablelist>