diff --git a/.hgignore b/.hgignore
index eb520974..d7b3277a 100644
--- a/.hgignore
+++ b/.hgignore
@@ -21,6 +21,8 @@ src/desktop/unity-lens-recoll/data/recoll.lens
src/desktop/unity-lens-recoll/data/unity-lens-recoll.service
src/doc/user/HTML.manifest
src/doc/user/RCL.INDEXING.CONFIG.html
+src/doc/user/RCL.INDEXING.EXTATTR.html
+src/doc/user/RCL.INDEXING.EXTTAGS.html
src/doc/user/RCL.INDEXING.MONITOR.html
src/doc/user/RCL.INDEXING.PERIODIC.html
src/doc/user/RCL.INDEXING.STORAGE.html
diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index facf7114..0cbcf8aa 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -690,7 +690,7 @@ recoll
- Index WEB visited page history
+ Indexing WEB pages you visit
With the help of a Firefox
extension, &RCL; can index the Internet pages that you visit. The
@@ -723,6 +723,58 @@ recoll
+
+ Extended attributes data
+
+ User extended attributes are named pieces of information
+ that most modern file systems can attach to any file.
+
+ &RCL; versions 1.19 and later process extended attributes
+ as document fields by default. For older versions, this has to
+ be activated at build time.
+
+ A
+
+ freedesktop standard defines a few special
+ attributes, which are handled as such by &RCL;:
+
+
+ mime_type
+ If set, this overrides any other
+ determination of the file mime type.
+
+
+ charset
+ If set, this defines the file character set
+ (mostly useful for plain text files).
+
+
+
+
+ By default, other attributes are handled as &RCL; fields.
+ On Linux, the user prefix is removed from
+ the name. This can be configured more precisely inside
+ the
+ fields configuration file.
+
+
+
+
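As an illustration of the section added above, here is a minimal Python sketch of attaching user extended attributes to a file on Linux. It is not part of the patch. It assumes the standard `os.setxattr` call, the Linux `user.` namespace prefix mentioned in the text, and a file system that supports user extended attributes. The helper names `user_attr_name` and `set_user_attrs` are hypothetical.

```python
import os

USER_PREFIX = "user."  # Linux namespace for user extended attributes


def user_attr_name(name):
    """Return the full Linux attribute name for a bare name such as 'mime_type'."""
    return name if name.startswith(USER_PREFIX) else USER_PREFIX + name


def set_user_attrs(path, attrs):
    """Attach each key/value pair in attrs to path as a user extended attribute.

    Linux only: os.setxattr raises OSError if the file system does not
    support user extended attributes.
    """
    for name, value in attrs.items():
        os.setxattr(path, user_attr_name(name), value.encode("utf-8"))


# Hypothetical usage (path and values are made up for illustration):
# set_user_attrs("/home/me/notes.txt",
#                {"mime_type": "text/plain", "charset": "utf-8"})
```

With attributes set this way, the text above suggests that `mime_type` and `charset` would be treated specially and any other attribute would become a field named after the attribute with the `user.` prefix removed.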
+
+ Importing external tags
+
+ During indexing, it is possible to import metadata for
+ each file by executing commands. For example, this could
+ extract user tag data for the file and store it in a field for
+ indexing.
+
+ See the
+ section
+ about the metadatacmds field in
+ the main configuration chapter for more detail.
+
+
+
Periodic indexing
@@ -2301,21 +2353,20 @@ fvwm
where docnum (%N) expands to the document
number inside the result page).
- In addition to the predefined values above, all strings like
- %(fieldname) will be replaced by the value of
- the field named fieldname for this
- document. Only stored fields can be accessed in this way, the value
- of indexed but not stored fields is not known at this point in the
- search process (see field
- configuration). There are currently very few fields stored
- by default, apart from the values above (only
- author and filename), so this
- feature will need some custom local configuration to be useful. For
- example, you could look at the fields for the document types of
- interest (use the right-click menu inside the preview window), and
- add what you want to the list of stored fields. A candidate example
- would be the recipient field which is generated
- by the message filters.
+ In addition to the predefined values above, all strings
+ like %(fieldname) will be replaced by the
+ value of the field named fieldname for this
+ document. Only stored fields can be accessed in this way; the
+ value of indexed but not stored fields is not known at this
+ point in the search process
+ (see field
+ configuration). There are currently very few fields
+ stored by default, apart from the values above
+ (only author
+ and filename), so this feature will need
+ some custom local configuration to be useful. An example
+ candidate would be the recipient field
+ which is generated by the message filters.
The default value for the paragraph format string is:
<meta name="somefield" content="Some textual data" />
+
+
+ You can embed HTML markup inside the content of custom
+ fields to improve the display inside result lists. In this
+ case, add a (wildly non-standard) markup
+ attribute to tell &RCL; that the value is HTML and should not
+ be escaped for display.
+
+
+<meta name="somefield" markup="html" content="Some <i>textual</i> data" />
See the following section for details about configuring
@@ -3366,10 +3427,11 @@ application/x-chm = execm rclchm
author, abstract.
The field values for documents can appear in several ways
- during indexing: either output by filters as
- meta fields in the HTML header section, or
- added as attributes of the Doc object when
- using the API, or again synthetized internally by &RCL;.
+ during indexing: either output by filters
+ as meta fields in the HTML header section, or
+ extracted from file extended attributes, or added as attributes
+ of the Doc object when using the API, or
+ again synthesized internally by &RCL;.
The &RCL; query language allows searching for text in a
specific field.
@@ -4661,7 +4723,25 @@ unac_except_trans =
mimeview.
+
+
+ metadatacmds
+ This allows executing external commands
+ for each file and storing the output in a &RCL;
+ field. This could be used, for example, to index external
+ tag data. The value is a list of field names and commands;
+ don't forget the initial semicolon. Example:
+
+[/some/area/of/the/fs]
+metadatacmds = ; tags = tmsu tags %f; otherfield = somecmd -xx %f
+
+
+
+
+
+
+
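To make the shape of the `metadatacmds` value concrete, here is a small Python sketch that splits such a value into (field, command) pairs and substitutes a file path for `%f`. It is not Recoll's parser. The function names are hypothetical, the naive split on `;` does not handle semicolons inside a command, and the quoting rules Recoll actually applies may differ.

```python
import shlex


def parse_metadatacmds(value):
    """Split a metadatacmds value into a list of (field, command) pairs."""
    pairs = []
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            # The initial semicolon yields an empty first chunk; skip it.
            continue
        field, _, command = chunk.partition("=")
        pairs.append((field.strip(), command.strip()))
    return pairs


def build_argv(command, path):
    """Tokenize a command and replace each %f token with the file path."""
    return [path if token == "%f" else token for token in shlex.split(command)]


value = "; tags = tmsu tags %f; otherfield = somecmd -xx %f"
for field, command in parse_metadatacmds(value):
    print(field, build_argv(command, "/some/area/of/the/fs/doc.txt"))
```

Building an argument list rather than a shell string keeps file names with spaces intact, which matters when a command runs once per indexed file.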
@@ -4976,6 +5056,24 @@ x-my-tag = mailmytag
+
+ Extended attributes in the fields file
+
+ &RCL; versions 1.19 and later process user extended
+ file attributes as document fields by default.
+
+ Attributes are processed as fields of the same name,
+ after removing the user prefix on
+ Linux.
+
+ The [xattrtofields]
+ section of the fields file allows
+ specifying translations from extended attribute names to
+ &RCL; field names. An empty translation disables use of the
+ corresponding attribute data.
+
+
+
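The mapping rules described for the [xattrtofields] section can be sketched in Python as follows. This is an illustrative model, not Recoll code. The function name `xattr_to_field` is hypothetical, and it assumes that translation keys are written without the `user.` prefix, which the text above does not specify.

```python
USER_PREFIX = "user."  # prefix removed from attribute names on Linux


def xattr_to_field(attr_name, translations):
    """Map an extended attribute name to a field name.

    Returns None when an empty translation disables the attribute.
    """
    if attr_name.startswith(USER_PREFIX):
        name = attr_name[len(USER_PREFIX):]
    else:
        name = attr_name
    if name in translations:
        # An empty translation means the attribute data is not used.
        return translations[name] or None
    # No translation: the field takes the same name as the attribute.
    return name


# Hypothetical translations, as they might be read from [xattrtofields]:
translations = {"tags": "keywords", "secret": ""}
print(xattr_to_field("user.tags", translations))
print(xattr_to_field("user.comment", translations))
print(xattr_to_field("user.secret", translations))
```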
diff --git a/website/BUGS.html b/website/BUGS.html
index abb73e35..2da9f5b6 100644
--- a/website/BUGS.html
+++ b/website/BUGS.html
@@ -57,34 +57,34 @@
case-insensitive search does not work for them (e.g.:
searching for ds1820 will not find DS1820).
-
On systems such as Debian Stable which use Evince version
- 2.x (not 3.x) as PDF viewer, the default "Open" command for
- PDF files will not work. You need to edit the command:
- in Preferences->GUI configuration,
- uncheck Use desktop preferences..., then
- click Choose editor applications, and for
- application/pdf, application/postscript and text/dvi, change
- the --page-index option to --page-label.
+
On systems such as Debian Stable which use Evince version
+ 2.x (not 3.x) as PDF viewer, the default "Open" command for
+ PDF files will not work. You need to edit the command:
+ in Preferences->GUI configuration,
+ uncheck Use desktop preferences..., then
+ click Choose editor applications, and for
+ application/pdf, application/postscript and text/dvi, change
+ the --page-index option to --page-label.
-
It will sometimes happen that the result list paragraph
- format stored in the Qt preferences file will get garbled,
- causing result lists with no displayed paragraphs (the
- counts and pages are ok, the results can be seen in table
- mode, but not in list mode). The workaround is to go to
-
and erase the result paragraph format string
- (^A DEL in the text area), this will reset the string to the
- default value.
+
It will sometimes happen that the result list paragraph
+ format stored in the Qt preferences file will get garbled,
+ causing result lists with no displayed paragraphs (the
+ counts and pages are ok, the results can be seen in table
+ mode, but not in list mode). The workaround is to go to
+
and erase the result paragraph format string
+ (^A DEL in the text area), this will reset the string to the
+ default value.
-
Real time indexer: when running with gamin on FreeBSD, the
- indexer can deadlock in the gamin dialog in some
- cases.
+
Real time indexer: when running with gamin on FreeBSD, the
+ indexer can deadlock in the gamin dialog in some
+ cases.
-
After an upgrade, the recoll GUI sometimes crashes on
- startup. This is fixed by removing (back it up just in case)
- ~/.config/Recoll.org/recoll.conf, the QSettings storage for
- recoll.
+
After an upgrade, the recoll GUI sometimes crashes on
+ startup. This is fixed by removing (back it up just in case)
+ ~/.config/Recoll.org/recoll.conf, the QSettings storage for
+ recoll.