doc

2015-08-06 08:02:47 +02:00 · 2015-08-06 08:02:47 +02:00 · 8b3ea3e763
commit 8b3ea3e763
parent fdfcdbb47a
2 changed files with 19 additions and 17 deletions
--- a/website/download.html
+++ b/website/download.html
@@ -121,10 +121,10 @@ subdirectory, because of all the places they're referred from
 <p><a href="recoll-1.20.6.tar.gz">recoll-1.20.6.tar.gz</a>.</p>
-<h3>Release 1.21.0</h3>
+<h3>Release 1.21.1</h3>
 <p>Not the right choice if you are after complete stability:
-<a href="recoll-1.21.0.tar.gz">recoll-1.21.0.tar.gz</a>. See what's
+<a href="recoll-1.21.1.tar.gz">recoll-1.21.1.tar.gz</a>. See what's
 new in the <a href="release-1.21.html">release notes</a>.</p>
 <!--
--- a/website/idxthreads/forkingRecoll.txt
+++ b/website/idxthreads/forkingRecoll.txt
@@ -7,12 +7,12 @@
 == Introduction
-Recoll is a big process which executes many others, mostly for extracting
-text from documents. Some of the executed processes are quite short-lived,
-and the time used by the process execution machinery can actually dominate
-the time used to translate data. This document explores possible approaches
-to improving performance without adding excessive complexity or damaging
-reliability.
+The Recoll indexer, *recollindex*, is a big process which executes many
+others, mostly for extracting text from documents. Some of the executed
+processes are quite short-lived, and the time used by the process execution
+machinery can actually dominate the time used to translate data. This
+document explores possible approaches to improving performance without
+adding excessive complexity or damaging reliability.
 Studying fork/exec performance is not exactly a new venture, and there are
 many texts which address the subject. While researching, though, I found
@@ -32,9 +32,10 @@ identical processes.
 space initialized from an executable file, inheriting some of the resources
 under various conditions.
-As processes became bigger the copy-before-discard operation wasted
-significant resources, and was optimized using two methods (at very
-different points in time):
+This was all fine with the small processes of the first Unix systems, but
+as time progressed, processes became bigger and the copy-before-discard
+operation was found to waste significant resources. It was optimized using
+two methods (at very different points in time):
 - The first approach was to supplement +fork()+ with the +vfork()+ call, which
   is similar but does not duplicate the address space: the new process
@@ -176,7 +177,7 @@ a single thread, and +fork()+ if it ran multiple ones.
 After another careful look at the code, I could see few issues with
 using +vfork()+ in the multithreaded indexer, so this was committed. 
-The only change necessary was to get rid on an implementation of the
+The only change necessary was to get rid of an implementation of the
 lacking Linux +closefrom()+ call (used to close all open descriptors above a
 given value). The previous Recoll implementation listed the +/proc/self/fd+
 directory to look for open descriptors but this was unsafe because of of
@@ -200,13 +201,14 @@ same times as the +fork()+/+vfork()+ options.
 The tests were performed on an Intel Core i5 750 (4 cores, 4 threads).
-The last line is just for the fun: *recollindex* 1.18 (single-threaded)
-needed almost 6 times as long to process the same files... 
 It would be painful to play it safe and discard the 60% reduction in
-execution time offered by using +vfork()+.
+execution time offered by using +vfork()+, so this was adopted for Recoll
+1.21. To this day, no problems were discovered, but, still crossing
+fingers...
-To this day, no problems were discovered, but, still crossing fingers...
+The last line in the table is just for the fun: *recollindex* 1.18
+(single-threaded) needed almost 6 times as long to process the same
+files...
 ////
 Objections to vfork: