Using Synonyms (Recoll 1.22 and later)

Synonyms have several distinct uses in text search. They can be used at index time (to either increase or decrease the number of indexed terms), or at query time, either to reduce user terms to a set of canonical ones or to expand queries so that they match texts containing synonyms of the user terms.

Recoll uses only the last approach. You can define synonym groups so that a user query term which belongs to a group is optionally expanded into an OR query over all the synonyms in that group.

What is it good for? The synonyms function is probably not going to help you find your letters to Mr. Smith. It is best used for domain-specific searches. For example, it was initially suggested by a user performing searches among historical documents: the synonyms file would contain nicknames and aliases for each of the persons of interest.

In practice, synonym groups are defined inside ordinary text files. Each line in the file defines a group. Example:

hi hello "good morning"

# not sure about "au revoir" though. Is this English?
bye goodbye "see you" \
  "au revoir" 
    

As usual, lines beginning with a # are comments, empty lines are ignored, and lines can be continued by ending them with a backslash.
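
The file format described above can be read with a few lines of code. The following Python sketch is only an illustration of the rules (comments, empty lines, backslash continuation, quoted multi-word entries); it is not Recoll's own parser, and the function name parse_synonym_groups is hypothetical. It assumes that quoted entries contain no embedded double quotes.

```python
import re

# A term is either a double-quoted string or a run of non-blank characters.
# Assumption: quoted entries never contain an embedded double quote.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def parse_synonym_groups(text):
    """Return the synonym groups found in `text`, one list of terms per group."""
    groups = []
    pending = ""  # logical line being built across backslash continuations
    for raw in text.splitlines():
        line = raw.rstrip()
        # Outside of a continuation, skip empty lines and # comments.
        if not pending and (not line.strip() or line.lstrip().startswith("#")):
            continue
        # A trailing backslash means the group continues on the next line.
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        logical = pending + line
        pending = ""
        terms = [q or w for q, w in TOKEN.findall(logical) if q or w]
        if terms:
            groups.append(terms)
    # Tolerate a file that ends on a continuation line.
    if pending.strip():
        groups.append([q or w for q, w in TOKEN.findall(pending) if q or w])
    return groups
```

Applied to the example file above, this returns two groups: one with hi, hello and good morning, and one with bye, goodbye, see you and au revoir.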

The synonyms are searched for matches with user terms after these are stem-expanded, but the contents of the synonyms file itself are not subjected to stem expansion (as of Recoll 1.22). This means that a match will not be found if the exact form written in the synonyms file does not occur anywhere in the document set.

Multi-word synonyms are supported, but be aware that they generate phrase queries, which may degrade performance and are not stem-expanded.
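
To make the expansion concrete, here is a minimal Python sketch of how a single user term could be turned into an OR query, with multi-word synonyms rendered as quoted phrases. It illustrates the idea only and does not show how Recoll builds its queries internally. The function name expand_term is hypothetical, the groups are given as plain lists, and the sketch compares terms exactly, leaving out the stem expansion of user terms described above.

```python
def expand_term(term, groups):
    """Return an OR expression covering `term` and all of its synonyms.

    `groups` is a list of synonym groups, each a list of strings.
    Assumption: exact string comparison, no stemming or case folding.
    """
    alternatives = [term]
    for group in groups:
        if term in group:
            for synonym in group:
                if synonym not in alternatives:
                    alternatives.append(synonym)
    # Multi-word synonyms are quoted so that they read as phrase searches.
    rendered = ['"%s"' % a if " " in a else a for a in alternatives]
    return " OR ".join(rendered)


groups = [
    ["hi", "hello", "good morning"],
    ["bye", "goodbye", "see you", "au revoir"],
]
print(expand_term("hi", groups))     # → hi OR hello OR "good morning"
print(expand_term("smith", groups))  # → smith
```

A term that belongs to no group is returned unchanged, which mirrors the point that the expansion only happens for terms found in a synonym group.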

A synonyms file can be specified in the GUI preferences, or as an option to recollq.

This feature is new in Recoll 1.22 and will probably need to be refined after some user feedback.