doc
parent
e2f8db63cf
commit
a41a72560b
1 changed files with 58 additions and 22 deletions
|
@ -686,6 +686,12 @@ recoll
|
||||||
still young, so that a certain amount of weirdness cannot be
|
still young, so that a certain amount of weirdness cannot be
|
||||||
excluded.</para>
|
excluded.</para>
|
||||||
|
|
||||||
|
<para>One of the most adverse consequences of using a raw index
|
||||||
|
is that some phrase and proximity searches may become
|
||||||
|
impossible: because each term needs to be expanded, and all
|
||||||
|
combinations searched for, the multiplicative expansion may
|
||||||
|
become unmanageable.</para>
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
|
|
||||||
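The paragraph added in this hunk says that phrase and proximity searches can become impossible on a raw index because the per-term expansions multiply. The sketch below only illustrates that arithmetic. It assumes that "expansion" means replacing each user term with the spelling variants found in the index; the variant lists and the function names are hypothetical and are not part of Recoll.

```python
from itertools import product
from math import prod


def count_phrase_variants(expansions):
    """Number of concrete phrases to search: the product of the list sizes."""
    return prod(len(variants) for variants in expansions)


def phrase_variants(expansions):
    """Enumerate every combination of one variant per phrase position."""
    return [" ".join(combo) for combo in product(*expansions)]


# Hypothetical expansions for a two-word phrase (illustrative data only).
expansions = [
    ["resume", "Resume", "résumé"],
    ["cafe", "café"],
]

print(count_phrase_variants(expansions))
for phrase in phrase_variants(expansions):
    print(phrase)
```

With a handful of variants per term the count stays small, but a phrase of four terms with ten variants each already yields 10,000 combinations, which is the growth the paragraph warns about.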
|
@ -3773,7 +3779,9 @@ or
|
||||||
<title>Introduction</title>
|
<title>Introduction</title>
|
||||||
|
|
||||||
<para>&RCL; versions after 1.11 define a Python programming
|
<para>&RCL; versions after 1.11 define a Python programming
|
||||||
interface, both for searching and indexing.</para>
|
interface, both for searching and indexing. The indexing
|
||||||
|
portion has seen little use, but the searching one is used
|
||||||
|
in the Recoll Ubuntu Unity Lens and Recoll Web UI.</para>
|
||||||
|
|
||||||
<para>The API is inspired by the Python database API
|
<para>The API is inspired by the Python database API
|
||||||
specification, version 1.0 for &RCL; versions up to 1.18,
|
specification, version 1.0 for &RCL; versions up to 1.18,
|
||||||
|
@ -3797,6 +3805,13 @@ or
|
||||||
</screen>
|
</screen>
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
<para>The normal &RCL; installer installs the Python
|
||||||
|
API along with the main code.</para>
|
||||||
|
|
||||||
|
<para>When installing from a repository, and depending on the
|
||||||
|
distribution, the Python API can sometimes be found in a
|
||||||
|
separate package.</para>
|
||||||
|
|
||||||
</sect3>
|
</sect3>
|
||||||
|
|
||||||
<sect3 id="RCL.PROGRAM.PYTHON.PACKAGE">
|
<sect3 id="RCL.PROGRAM.PYTHON.PACKAGE">
|
||||||
|
@ -3872,8 +3887,13 @@ or
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
<term>Db.setAbstractParams(maxchars, contextwords)</term>
|
<term>Db.setAbstractParams(maxchars,
|
||||||
<listitem>Set the parameters used to build snippets.</listitem>
|
contextwords)</term> <listitem>Set the parameters used
|
||||||
|
to build snippets (sets of keywords in context text
|
||||||
|
fragments). <literal>maxchars</literal> defines the
|
||||||
|
maximum total size of the abstract.
|
||||||
|
<literal>contextwords</literal> defines how many
|
||||||
|
terms are shown around the keyword.</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
</variablelist>
|
</variablelist>
|
||||||
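To make the roles of <literal>maxchars</literal> and <literal>contextwords</literal> concrete, here is a toy snippet builder. It is a minimal sketch of the idea described in this hunk (keywords shown with surrounding words, total size capped), not Recoll's implementation: the function name, the fragment separator, and the truncation strategy are all assumptions made for illustration.

```python
def build_abstract(text, keyword, maxchars, contextwords):
    """Toy abstract builder (hypothetical, not Recoll code).

    Keeps `contextwords` words on each side of every keyword occurrence
    and truncates the joined fragments to at most `maxchars` characters.
    """
    words = text.split()
    fragments = []
    last_end = 0  # index just past the last word already used
    for i, word in enumerate(words):
        # Skip non-matching words and occurrences already covered.
        if i < last_end or word.strip(".,;:!?").lower() != keyword.lower():
            continue
        start = max(i - contextwords, last_end)
        end = min(i + contextwords + 1, len(words))
        fragments.append(" ".join(words[start:end]))
        last_end = end
    return " ... ".join(fragments)[:maxchars]


text = "The quick brown fox jumps over the lazy dog while another fox sleeps"
print(build_abstract(text, "fox", maxchars=250, contextwords=1))
print(build_abstract(text, "fox", maxchars=15, contextwords=1))
```

Raising <literal>contextwords</literal> widens each fragment, while lowering <literal>maxchars</literal> cuts the overall result short regardless of how many fragments were found.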
|
@ -3932,7 +3952,7 @@ or
|
||||||
|
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
<term>Query.close()</term>
|
<term>Query.close()</term>
|
||||||
<listitem>Closes the connection. The object is unusable
|
<listitem>Closes the query. The object is unusable
|
||||||
after the call.</listitem>
|
after the call.</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
|
@ -3947,12 +3967,12 @@ or
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
<term>Query.getgroups()</term>
|
<term>Query.getgroups()</term>
|
||||||
<listitem>Retrieves the expanded query terms as a list
|
<listitem>Retrieves the expanded query terms as a list
|
||||||
of pairs. Meaningful only after executexx
|
of pairs. Meaningful only after executexx. In each
|
||||||
In each pair, the first entry is a list of user terms,
|
pair, the first entry is a list of user terms (of size
|
||||||
the second a list of query terms as derived from the
|
one for simple terms, or more for group and phrase
|
||||||
user terms and used in the Xapian Query. The size of
|
clauses), the second a list of query terms as derived
|
||||||
each list is one for simple terms, or more for group
|
from the user terms and used in the Xapian
|
||||||
and phrase clauses.</listitem>
|
Query.</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
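The shape of the value described for <literal>Query.getgroups()</literal> can be easier to follow with a small example. The data below is hand-written to match the description in this hunk (a list of pairs, each made of a list of user terms and a list of derived query terms); it is not actual output from a Recoll query, and the helper function is hypothetical.

```python
# Hypothetical value shaped like the documented getgroups() result.
groups = [
    (["house"], ["house", "houses", "housing"]),  # simple term: one user term
    (["black", "cat"], ["black", "cat"]),         # phrase clause: several user terms
]


def summarize_groups(groups):
    """Return one readable line per (user terms, query terms) pair."""
    lines = []
    for user_terms, query_terms in groups:
        kind = "simple term" if len(user_terms) == 1 else "group or phrase"
        lines.append(f"{' '.join(user_terms)} ({kind}): {', '.join(query_terms)}")
    return lines


for line in summarize_groups(groups):
    print(line)
```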
|
@ -4002,7 +4022,9 @@ or
|
||||||
<varlistentry><term>Query.rownumber</term><listitem>Next index
|
<varlistentry><term>Query.rownumber</term><listitem>Next index
|
||||||
to be fetched from results. Normally increments after
|
to be fetched from results. Normally increments after
|
||||||
each fetchone() call, but can be set/reset before the
|
each fetchone() call, but can be set/reset before the
|
||||||
call effect seeking. Starts at 0.</listitem>
|
call to effect seeking (equivalent to
|
||||||
|
using <literal>scroll()</literal>). Starts at
|
||||||
|
0.</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
</variablelist>
|
</variablelist>
|
||||||
|
@ -4089,8 +4111,10 @@ or
|
||||||
<sect3 id="RCL.PROGRAM.PYTHON.RCLEXTRACT">
|
<sect3 id="RCL.PROGRAM.PYTHON.RCLEXTRACT">
|
||||||
<title>The rclextract module</title>
|
<title>The rclextract module</title>
|
||||||
|
|
||||||
<para>Document content is not provided by an index query. To
|
<para>Index queries do not provide document content (only a
|
||||||
access it, the data extraction part of the indexing process
|
partial and imprecise reconstruction is performed to show the
|
||||||
|
snippet text). To access the actual document data,
|
||||||
|
the data extraction part of the indexing process
|
||||||
must be performed (subdocument access and format
|
must be performed (subdocument access and format
|
||||||
translation). This is not trivial in
|
translation). This is not trivial in
|
||||||
general. The <literal>rclextract</literal> module currently
|
general. The <literal>rclextract</literal> module currently
|
||||||
|
@ -4118,13 +4142,25 @@ or
|
||||||
by <replaceable>ipath</replaceable> and return
|
by <replaceable>ipath</replaceable> and return
|
||||||
a <literal>Doc</literal> object. The doc.text field
|
a <literal>Doc</literal> object. The doc.text field
|
||||||
has the document text as either text/plain or
|
has the document text as either text/plain or
|
||||||
text/html according to doc.mimetype.</listitem>
|
text/html according to doc.mimetype. The typical use
|
||||||
|
would be as follows:
|
||||||
|
<programlisting>
|
||||||
|
qdoc = query.fetchone()
|
||||||
|
extractor = recoll.Extractor(qdoc)
|
||||||
|
doc = extractor.textextract(qdoc.ipath)
# The extracted text is in doc.text</programlisting>
|
||||||
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
<term>Extractor.idoctofile()</term>
|
<term>Extractor.idoctofile(ipath, targetmtype, outfile='')</term>
|
||||||
<listitem>Extracts document into an output file,
|
<listitem>Extracts document into an output file,
|
||||||
which can be given explicitly or will be created as a
|
which can be given explicitly or will be created as a
|
||||||
temporary file to be deleted by the caller.</listitem>
|
temporary file to be deleted by the caller. Typical use:
|
||||||
|
<programlisting>
|
||||||
|
qdoc = query.fetchone()
|
||||||
|
extractor = recoll.Extractor(qdoc)
|
||||||
|
filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)</programlisting>
|
||||||
|
|
||||||
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
</variablelist>
|
</variablelist>
|
||||||
|
|