diff --git a/src/doc/user/usermanual.sgml b/src/doc/user/usermanual.sgml
index bec7b81f..1a77f041 100644
--- a/src/doc/user/usermanual.sgml
+++ b/src/doc/user/usermanual.sgml
@@ -686,6 +686,12 @@ recoll
  still young, so that a certain amount of weirdness cannot be excluded.
+ One of the most adverse consequences of using a raw index
+ is that some phrase and proximity searches may become
+ impossible: because each term needs to be expanded, and all
+ combinations searched for, the multiplicative expansion may
+ become unmanageable.
+
@@ -3773,7 +3779,9 @@ or
  Introduction
  &RCL; versions after 1.11 define a Python programming
- interface, both for searching and indexing.
+ interface, both for searching and indexing. The indexing
+ portion has seen little use, but the searching one is used
+ in the Recoll Ubuntu Unity Lens and Recoll Web UI.
  The API is inspired by the Python database API
  specification, version 1.0 for &RCL; versions up to 1.18,
@@ -3797,6 +3805,13 @@ or
+ The normal &RCL; installer installs the Python
+ API along with the main code.
+
+ When installing from a repository, and depending on the
+ distribution, the Python API can sometimes be found in a
+ separate package.
+
@@ -3872,8 +3887,13 @@ or
- Db.setAbstractParams(maxchars, contextwords)
- Set the parameters used to build snippets.
+ Db.setAbstractParams(maxchars,
+ contextwords) Set the parameters used
+ to build snippets (sets of keywords in context text
+ fragments). maxchars defines the
+ maximum total size of the abstract.
+ contextwords defines how many
+ terms are shown around the keyword.
@@ -3932,7 +3952,7 @@ or
  Query.close()
- Closes the connection. The object is unusable
+ Closes the query. The object is unusable
  after the call.
@@ -3947,12 +3967,12 @@ or
  Query.getgroups()
  Retrieves the expanded query terms as a list
- of pairs. Meaningful only after executexx
- In each pair, the first entry is a list of user terms,
- the second a list of query terms as derived from the
- user terms and used in the Xapian Query. The size of
- each list is one for simple terms, or more for group
- and phrase clauses.
+ of pairs. Meaningful only after executexx. In each
+ pair, the first entry is a list of user terms (of size
+ one for simple terms, or more for group and phrase
+ clauses), the second a list of query terms as derived
+ from the user terms and used in the Xapian
+ Query.
@@ -4002,7 +4022,9 @@ or
  Query.rownumber
  Next index to be fetched from results. Normally
  increments after each fetchone() call, but can be set/reset before the
- call effect seeking. Starts at 0.
+ call to effect seeking (equivalent to
+ using scroll()). Starts at
+ 0.
@@ -4089,13 +4111,15 @@ or
  The rclextract module
- Document content is not provided by an index query. To
- access it, the data extraction part of the indexing process
- must be performed (subdocument access and format
- translation). This is not trivial in
- general. The rclextract module currently
- provides a single class which can be used to access the data
- content for result documents.
+ Index queries do not provide document content (only a
+ partial and imprecise reconstruction is performed to show the
+ snippet text). In order to access the actual document data,
+ the data extraction part of the indexing process
+ must be performed (subdocument access and format
+ translation). This is not trivial in
+ general. The rclextract module currently
+ provides a single class which can be used to access the data
+ content for result documents.
  Classes
@@ -4118,13 +4142,25 @@ or
  by ipath and return a Doc object. The doc.text field
  has the document text as either text/plain or
- text/html according to doc.mimetype.
+ text/html according to doc.mimetype. The typical use
+ would be as follows:
+
+qdoc = query.fetchone()
+extractor = rclextract.Extractor(qdoc)
+text = extractor.textextract(qdoc.ipath)
+
- Extractor.idoctofile()
+ Extractor.idoctofile(ipath, targetmtype, outfile='')
  Extracts document into an output file,
- which can be given explicitly or will be created as a
- temporary file to be deleted by the caller.
+ which can be given explicitly or will be created as a
+ temporary file to be deleted by the caller. Typical use:
+
+qdoc = query.fetchone()
+extractor = rclextract.Extractor(qdoc)
+filename = extractor.idoctofile(qdoc.ipath, qdoc.mimetype)
+
+
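Reviewer note: the list-of-pairs shape described in the Query.getgroups() hunk may be easier to picture with plain Python data. The sketch below is only an illustration under stated assumptions: `sample_groups` and `describe_groups` are hypothetical names, the term values are invented, and nothing here calls the real recoll module. Actual output would come from a Query object after a search has been executed.

```python
# Hypothetical sample shaped like the documented return value of
# Query.getgroups(): a list of pairs (user_terms, query_terms).
# The values are invented for illustration and do not come from a real index.
sample_groups = [
    (["house"], ["house", "houses", "housing"]),
    (["red", "car"], ["red", "car", "cars"]),
]


def describe_groups(groups):
    """Return one readable line per (user_terms, query_terms) pair."""
    lines = []
    for user_terms, query_terms in groups:
        # Per the documentation text, a single user term is a simple term;
        # more than one means a group or phrase clause.
        kind = "simple term" if len(user_terms) == 1 else "group or phrase"
        lines.append(
            "%s [%s] -> %s"
            % (" ".join(user_terms), kind, ", ".join(query_terms))
        )
    return lines


for line in describe_groups(sample_groups):
    print(line)
```

With a real index, the same helper could presumably be fed the value returned by getgroups() directly, provided the return value matches the shape described in the documentation text.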