Jean-Francois Dockes 2013-02-21 13:31:51 +01:00
parent 22d21f1685
commit defe1e780d
3 changed files with 145 additions and 45 deletions


@ -21,6 +21,8 @@ src/desktop/unity-lens-recoll/data/recoll.lens
src/desktop/unity-lens-recoll/data/unity-lens-recoll.service
src/doc/user/HTML.manifest
src/doc/user/RCL.INDEXING.CONFIG.html
src/doc/user/RCL.INDEXING.EXTATTR.html
src/doc/user/RCL.INDEXING.EXTTAGS.html
src/doc/user/RCL.INDEXING.MONITOR.html
src/doc/user/RCL.INDEXING.PERIODIC.html
src/doc/user/RCL.INDEXING.STORAGE.html


@ -690,7 +690,7 @@ recoll
</sect1>
<sect1 id="RCL.INDEXING.WEBQUEUE">
<title>Index WEB visited page history</title>
<title>Indexing WEB pages you visit</title>
<para>With the help of a <application>Firefox</application>
extension, &RCL; can index the Internet pages that you visit. The
@ -723,6 +723,58 @@ recoll
</sect1>
<sect1 id="RCL.INDEXING.EXTATTR">
<title>Extended attributes data</title>
<para>User extended attributes are named pieces of information
that most modern file systems can attach to any file.</para>
<para>&RCL; versions 1.19 and later process extended attributes
as document fields by default. For older versions, this has to
be activated at build time.</para>
<para>A
<ulink url="http://www.freedesktop.org/wiki/CommonExtendedAttributes">
freedesktop standard</ulink> defines a few special
attributes, which are handled as such by &RCL;:
<variablelist>
<varlistentry>
<term>mime_type</term>
<listitem><para>If set, this overrides any other
determination of the file mime type.</para></listitem>
</varlistentry>
<varlistentry>
<term>charset</term>
<listitem><para>If set, this defines the file character set
(mostly useful for plain text files).</para></listitem>
</varlistentry>
</variablelist>
</para>
<para>By default, other attributes are handled as &RCL; fields.
On Linux, the <literal>user</literal> prefix is removed from
the name. This can be configured more precisely inside
the <link linkend="RCL.INSTALL.CONFIG.FIELDS">
<filename>fields</filename> configuration file</link>.
</para>
</sect1>
<sect1 id="RCL.INDEXING.EXTTAGS">
<title>Importing external tags</title>
<para>During indexing, it is possible to import metadata for
each file by executing commands. For example, this could
extract user tag data for the file and store it in a field for
indexing.</para>
<para>See the
<link linkend="RCL.INSTALL.CONFIG.RECOLLCONF.METADATACMDS">section
about the <literal>metadatacmds</literal> field</link> in
the main configuration chapter for more detail.</para>
</sect1>
<sect1 id="RCL.INDEXING.PERIODIC">
<title>Periodic indexing</title>
@ -2301,21 +2353,20 @@ fvwm
where <replaceable>docnum</replaceable> (%N) expands to the document
number inside the result page).</para>
<para>In addition to the predefined values above, all strings like
<literal>%(fieldname)</literal> will be replaced by the value of
the field named <literal>fieldname</literal> for this
document. Only stored fields can be accessed in this way, the value
of indexed but not stored fields is not known at this point in the
search process (see <link linkend="RCL.PROGRAM.FIELDS">field
configuration</link>). There are currently very few fields stored
by default, apart from the values above (only
<literal>author</literal> and <literal>filename</literal>), so this
feature will need some custom local configuration to be useful. For
example, you could look at the fields for the document types of
interest (use the right-click menu inside the preview window), and
add what you want to the list of stored fields. A candidate example
would be the <literal>recipient</literal> field which is generated
by the message filters.</para>
<para>In addition to the predefined values above, all strings
like <literal>%(fieldname)</literal> will be replaced by the
value of the field named <literal>fieldname</literal> for this
document. Only stored fields can be accessed in this way: the
value of fields that are indexed but not stored is not known at
this point in the search process
(see <link linkend="RCL.PROGRAM.FIELDS">field
configuration</link>). There are currently very few fields
stored by default, apart from the values above
(only <literal>author</literal>
and <literal>filename</literal>), so this feature will need
some custom local configuration to be useful. An example
candidate would be the <literal>recipient</literal> field
which is generated by the message filters.</para>
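<para>As a rough illustration of the substitution described above,
the following Python sketch expands <literal>%(fieldname)</literal>
references from a dictionary of stored fields. It is not &RCL;'s own
code: the <literal>expand_fields</literal> function, the sample
values, and the choice of replacing unknown fields with an empty
string are assumptions made for the example.</para>
<programlisting>
```python
import re

# Hypothetical stored fields for one result document (illustration only).
stored_fields = {"author": "J. Doe", "filename": "notes.txt"}


def expand_fields(fmt, fields):
    """Replace each %(name) reference with the matching stored value.

    Assumption: a field that is not stored expands to an empty string.
    """
    return re.sub(r"%\(([^)]*)\)", lambda m: fields.get(m.group(1), ""), fmt)


# "recipient" is not in stored_fields, so it expands to nothing here.
line = expand_fields("%(filename) by %(author) to %(recipient)", stored_fields)
print(line)
```
</programlisting>
<para>In this sketch the <literal>recipient</literal> reference
expands to nothing because the field is absent from the stored set,
which mirrors the need to add it to the list of stored fields
first.</para>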
<para>The default value for the paragraph format string is:
<screen><![CDATA[
@ -3338,6 +3389,16 @@ application/x-chm = execm rclchm
<programlisting>
&lt;meta name="somefield" content="Some textual data" /&gt;
</programlisting>
<para>You can embed HTML markup inside the content of custom
fields to improve their display inside result lists. In this
case, add a (wildly non-standard) <literal>markup</literal>
attribute to tell &RCL; that the value is HTML and should not
be escaped for display.</para>
<programlisting>
&lt;meta name="somefield" markup="html" content="Some &lt;i>textual&lt;/i> data" /&gt;
</programlisting>
<para> See the following section for details about configuring
@ -3366,10 +3427,11 @@ application/x-chm = execm rclchm
<literal>author</literal>, <literal>abstract</literal>.</para>
<para>The field values for documents can appear in several ways
during indexing: either output by filters as
<literal>meta</literal> fields in the HTML header section, or
added as attributes of the <literal>Doc</literal> object when
using the API, or again synthesized internally by &RCL;.</para>
during indexing: either output by filters
as <literal>meta</literal> fields in the HTML header section, or
extracted from file extended attributes, or added as attributes
of the <literal>Doc</literal> object when using the API, or
again synthesized internally by &RCL;.</para>
<para>The &RCL; query language allows searching for text in a
specific field.</para>
@ -4661,7 +4723,25 @@ unac_except_trans =
<filename>mimeview</filename>.</para>
</listitem>
</varlistentry>
<varlistentry id="RCL.INSTALL.CONFIG.RECOLLCONF.METADATACMDS">
<term><varname>metadatacmds</varname></term>
<listitem><para>This allows executing external commands
for each file and storing the output in a &RCL;
field. This could be used, for example, to index external
tag data. The value is a list of field names and commands
separated by semicolons; do not forget the initial semicolon. Example:
<programlisting>
[/some/area/of/the/fs]
metadatacmds = ; tags = tmsu tags %f; otherfield = somecmd -xx %f
</programlisting>
</para>
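<para>To make the shape of the value concrete, here is a small
Python sketch that splits such a value into field names and
commands, then builds the argument list for one file. It is only
an illustration, not &RCL;'s parser: the function names are
hypothetical, the commands are assumed to contain no semicolons,
and <literal>%f</literal> is assumed to stand for the path of the
file being indexed.</para>
<programlisting>
```python
import shlex


def parse_metadatacmds(value):
    """Split '; field = command; field = command' into a dict.

    Assumption: commands themselves contain no semicolons.
    """
    cmds = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            # Skips the empty piece produced by the initial semicolon.
            continue
        name, _, command = item.partition("=")
        cmds[name.strip()] = command.strip()
    return cmds


def build_argv(command, path):
    """Return the argument list, with %f replaced by the file path (assumed meaning)."""
    return [path if tok == "%f" else tok for tok in shlex.split(command)]


cmds = parse_metadatacmds("; tags = tmsu tags %f; otherfield = somecmd -xx %f")
argv = build_argv(cmds["tags"], "/some/area/of/the/fs/my file.txt")
print(cmds)
print(argv)
```
</programlisting>
<para>Passing the path as a single list element, rather than
pasting it into a shell string, keeps file names with spaces
intact in this sketch.</para>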
</listitem>
</varlistentry>
</variablelist>
</sect3>
<sect3 id="RCL.INSTALL.CONFIG.RECOLLCONF.STORAGE">
@ -4976,6 +5056,24 @@ x-my-tag = mailmytag
</para>
<sect3 id="RCL.INSTALL.CONFIG.FIELDS.XATTR">
<title>Extended attributes in the fields file</title>
<para>&RCL; versions 1.19 and later process user extended
	  file attributes as document fields by default.</para>
<para>Attributes are processed as fields of the same name,
after removing the <literal>user</literal> prefix on
Linux.</para>
<para>The <literal>[xattrtofields]</literal>
section of the <filename>fields</filename> file allows
	  specifying translations from extended attribute names to
&RCL; field names. An empty translation disables use of the
corresponding attribute data.</para>
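<para>The naming rules above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not
&RCL;'s implementation: the <literal>xattr_to_field</literal>
function and the sample translation table are hypothetical, and the
sketch assumes that translations are looked up by the attribute
name after the <literal>user.</literal> prefix has been
removed.</para>
<programlisting>
```python
# Hypothetical translation table, standing in for an [xattrtofields] section.
# An empty value disables the corresponding attribute.
translations = {"tags": "keywords", "secret": ""}


def xattr_to_field(attr_name, table):
    """Map an extended attribute name to a field name, or None if disabled.

    Assumption: the table is keyed by the name without the 'user.' prefix.
    """
    name = attr_name
    if name.startswith("user."):
        name = name[len("user."):]
    if name in table:
        # An empty translation means the attribute data is not used.
        return table[name] or None
    # No translation: the field takes the attribute's own name.
    return name


for attr in ("user.tags", "user.secret", "user.comment"):
    print(attr, xattr_to_field(attr, translations))
```
</programlisting>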
</sect3>
</sect2>
<sect2 id="RCL.INSTALL.CONFIG.MIMEMAP">