merge
commit
e2e9b116fe
3 changed files with 34 additions and 38 deletions
|
@ -411,9 +411,13 @@
|
||||||
|
|
||||||
<h3>Updated 1.20/21 translations that became available after the release:</h3>
|
<h3>Updated 1.20/21 translations that became available after the release:</h3>
|
||||||
|
|
||||||
|
<p>A new Hungarian translation by Somogyvári Róbert:
|
||||||
|
<a href="translations/recoll_hu.ts">recoll_hu.ts</a>
|
||||||
|
<a href="translations/recoll_hu.qm">recoll_hu.qm</a><br/>
|
||||||
|
</p>
|
||||||
<p>An updated Czech translation by Pavel Fric:
|
<p>An updated Czech translation by Pavel Fric:
|
||||||
<a href="translations/recoll_cs.ts">recoll_da.ts</a>
|
<a href="translations/recoll_cs.ts">recoll_cs.ts</a>
|
||||||
<a href="translations/recoll_cs.qm">recoll_da.qm</a><br/>
|
<a href="translations/recoll_cs.qm">recoll_cs.qm</a><br/>
|
||||||
</p>
|
</p>
|
||||||
<p>A Danish translation by Morten Langlo:
|
<p>A Danish translation by Morten Langlo:
|
||||||
<a href="translations/recoll_da.ts">recoll_da.ts</a>
|
<a href="translations/recoll_da.ts">recoll_da.ts</a>
|
||||||
|
|
|
@ -3,7 +3,7 @@
|
||||||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
|
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
|
||||||
<head>
|
<head>
|
||||||
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
|
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
|
||||||
<meta name="generator" content="AsciiDoc 8.6.7" />
|
<meta name="generator" content="AsciiDoc 8.6.9" />
|
||||||
<title>Converting Recoll indexing to multithreading</title>
|
<title>Converting Recoll indexing to multithreading</title>
|
||||||
<style type="text/css">
|
<style type="text/css">
|
||||||
/* Shared CSS for AsciiDoc xhtml11 and html5 backends */
|
/* Shared CSS for AsciiDoc xhtml11 and html5 backends */
|
||||||
|
@ -87,10 +87,16 @@ ul, ol, li > p {
|
||||||
ul > li { color: #aaa; }
|
ul > li { color: #aaa; }
|
||||||
ul > li > * { color: black; }
|
ul > li > * { color: black; }
|
||||||
|
|
||||||
pre {
|
.monospaced, code, pre {
|
||||||
|
font-family: "Courier New", Courier, monospace;
|
||||||
|
font-size: inherit;
|
||||||
|
color: navy;
|
||||||
padding: 0;
|
padding: 0;
|
||||||
margin: 0;
|
margin: 0;
|
||||||
}
|
}
|
||||||
|
pre {
|
||||||
|
white-space: pre-wrap;
|
||||||
|
}
|
||||||
|
|
||||||
#author {
|
#author {
|
||||||
color: #527bbd;
|
color: #527bbd;
|
||||||
|
@ -219,7 +225,7 @@ div.exampleblock > div.content {
|
||||||
}
|
}
|
||||||
|
|
||||||
div.imageblock div.content { padding-left: 0; }
|
div.imageblock div.content { padding-left: 0; }
|
||||||
span.image img { border-style: none; }
|
span.image img { border-style: none; vertical-align: text-bottom; }
|
||||||
a.image:visited { color: white; }
|
a.image:visited { color: white; }
|
||||||
|
|
||||||
dl {
|
dl {
|
||||||
|
@ -415,12 +421,6 @@ div.unbreakable { page-break-inside: avoid; }
|
||||||
*
|
*
|
||||||
* */
|
* */
|
||||||
|
|
||||||
tt {
|
|
||||||
font-family: "Courier New", Courier, monospace;
|
|
||||||
font-size: inherit;
|
|
||||||
color: navy;
|
|
||||||
}
|
|
||||||
|
|
||||||
div.tableblock {
|
div.tableblock {
|
||||||
margin-top: 1.0em;
|
margin-top: 1.0em;
|
||||||
margin-bottom: 1.5em;
|
margin-bottom: 1.5em;
|
||||||
|
@ -454,12 +454,6 @@ div.tableblock > table[frame="vsides"] {
|
||||||
*
|
*
|
||||||
* */
|
* */
|
||||||
|
|
||||||
.monospaced {
|
|
||||||
font-family: "Courier New", Courier, monospace;
|
|
||||||
font-size: inherit;
|
|
||||||
color: navy;
|
|
||||||
}
|
|
||||||
|
|
||||||
table.tableblock {
|
table.tableblock {
|
||||||
margin-top: 1.0em;
|
margin-top: 1.0em;
|
||||||
margin-bottom: 1.5em;
|
margin-bottom: 1.5em;
|
||||||
|
@ -539,6 +533,8 @@ body.manpage div.sectionbody {
|
||||||
@media print {
|
@media print {
|
||||||
body.manpage div#toc { display: none; }
|
body.manpage div#toc { display: none; }
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
</style>
|
</style>
|
||||||
<script type="text/javascript">
|
<script type="text/javascript">
|
||||||
/*<![CDATA[*/
|
/*<![CDATA[*/
|
||||||
|
@ -739,7 +735,7 @@ asciidoc.install();
|
||||||
<div id="header">
|
<div id="header">
|
||||||
<h1>Converting Recoll indexing to multithreading</h1>
|
<h1>Converting Recoll indexing to multithreading</h1>
|
||||||
<span id="author">Jean-François Dockès</span><br />
|
<span id="author">Jean-François Dockès</span><br />
|
||||||
<span id="email"><tt><<a href="mailto:jfd@recoll.org">jfd@recoll.org</a>></tt></span><br />
|
<span id="email"><code><<a href="mailto:jfd@recoll.org">jfd@recoll.org</a>></code></span><br />
|
||||||
<span id="revdate">2012-12-03</span>
|
<span id="revdate">2012-12-03</span>
|
||||||
</div>
|
</div>
|
||||||
<div id="content">
|
<div id="content">
|
||||||
|
@ -785,7 +781,7 @@ trouble though, and linking the GUI and indexing process lifetimes was a
|
||||||
bad idea, so, in recent versions, the indexing is always performed by an
|
bad idea, so, in recent versions, the indexing is always performed by an
|
||||||
external process. Still, this experience had put in light most of the
|
external process. Still, this experience had put in light most of the
|
||||||
problem areas, and prepared the code for further work.</p></div>
|
problem areas, and prepared the code for further work.</p></div>
|
||||||
<div class="paragraph"><p>It should be noted that, as <tt>recollindex</tt> is both <em>nice</em>'d and <em>ionice</em>'d
|
<div class="paragraph"><p>It should be noted that, as <code>recollindex</code> is both <em>nice</em>'d and <em>ionice</em>'d
|
||||||
as a lowest priority process, it will only use free computing power on the
|
as a lowest priority process, it will only use free computing power on the
|
||||||
machine, and will step down as soon as anything else wants to work.</p></div>
|
machine, and will step down as soon as anything else wants to work.</p></div>
|
||||||
<div class="sidebarblock">
|
<div class="sidebarblock">
|
||||||
|
@ -800,7 +796,7 @@ on the document sizes). May I also suggest in this case that, if your
|
||||||
machine can take more memory, it may be a good idea to procure some, as
|
machine can take more memory, it may be a good idea to procure some, as
|
||||||
memory is nowadays quite cheap, and memory-starved machines are not fun.</p></div>
|
memory is nowadays quite cheap, and memory-starved machines are not fun.</p></div>
|
||||||
</div></div>
|
</div></div>
|
||||||
<div class="paragraph"><p>In general, augmenting the machine utilisation by <tt>recollindex</tt> just does
|
<div class="paragraph"><p>In general, augmenting the machine utilisation by <code>recollindex</code> just does
|
||||||
not change its responsiveness. My PC has a an Intel Pentium Core i5 750 (4
|
not change its responsiveness. My PC has a an Intel Pentium Core i5 750 (4
|
||||||
cores, no hyperthreading), which is far from being a high performance CPU
|
cores, no hyperthreading), which is far from being a high performance CPU
|
||||||
(nowadays…), and I often forget that I am running indexing tests, it is
|
(nowadays…), and I often forget that I am running indexing tests, it is
|
||||||
|
@ -815,7 +811,7 @@ just not noticeable. The machine does have a lot of memory though (12GB).</p></d
|
||||||
<img src="nothreads.png" alt="Basic flow" />
|
<img src="nothreads.png" alt="Basic flow" />
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="paragraph"><p>There are 4 main steps in the <tt>recollindex</tt> processing pipeline:</p></div>
|
<div class="paragraph"><p>There are 4 main steps in the <code>recollindex</code> processing pipeline:</p></div>
|
||||||
<div class="olist arabic"><ol class="arabic">
|
<div class="olist arabic"><ol class="arabic">
|
||||||
<li>
|
<li>
|
||||||
<p>
|
<p>
|
||||||
|
@ -1056,8 +1052,8 @@ experiment. For example, the following data defines the configuration that
|
||||||
was finally found to be best overall on my hardware:</p></div>
|
was finally found to be best overall on my hardware:</p></div>
|
||||||
<div class="literalblock">
|
<div class="literalblock">
|
||||||
<div class="content">
|
<div class="content">
|
||||||
<pre><tt>thrQSizes = 2 2 2
|
<pre><code>thrQSizes = 2 2 2
|
||||||
thrTCounts = 4 2 1</tt></pre>
|
thrTCounts = 4 2 1</code></pre>
|
||||||
</div></div>
|
</div></div>
|
||||||
<div class="paragraph"><p>This is using 3 queues of depth 2, 4 threads working on file conversion, 2
|
<div class="paragraph"><p>This is using 3 queues of depth 2, 4 threads working on file conversion, 2
|
||||||
on text splitting and other document processing, and 1 on Xapian updating
|
on text splitting and other document processing, and 1 on Xapian updating
|
||||||
|
@ -1070,11 +1066,9 @@ on text splitting and other document processing, and 1 on Xapian updating
|
||||||
<div class="sectionbody">
|
<div class="sectionbody">
|
||||||
<div class="paragraph"><p>So the big question after all the work: was it worth it ? I could only get
|
<div class="paragraph"><p>So the big question after all the work: was it worth it ? I could only get
|
||||||
a real answer when the program stopped crashing, so this took some time and
|
a real answer when the program stopped crashing, so this took some time and
|
||||||
a little faith…</p></div>
|
a little faith, but the answer is positive, as far as I’m
|
||||||
<div class="paragraph"><p>The answer is mostly yes, as far as I’m concerned. Indexing tests running
|
concerned. Performance has improved significantly and this was a fun
|
||||||
almost twice as fast are good for my blood pressure and I don’t need a
|
project.</p></div>
|
||||||
faster PC, I’ll buy more red wine instead (good for my health too, or maybe
|
|
||||||
not). And it was a fun project anyway.</p></div>
|
|
||||||
<div class="tableblock">
|
<div class="tableblock">
|
||||||
<table rules="all"
|
<table rules="all"
|
||||||
width="70%"
|
width="70%"
|
||||||
|
@ -1221,8 +1215,8 @@ writable <strong>Xapian</strong> database).</p></div>
|
||||||
parameters (one can also use a deeper front queue, this changes little):</p></div>
|
parameters (one can also use a deeper front queue, this changes little):</p></div>
|
||||||
<div class="literalblock">
|
<div class="literalblock">
|
||||||
<div class="content">
|
<div class="content">
|
||||||
<pre><tt>thrQSizes = 2 -1 -1
|
<pre><code>thrQSizes = 2 -1 -1
|
||||||
thrTCounts = 4 0 0</tt></pre>
|
thrTCounts = 4 0 0</code></pre>
|
||||||
</div></div>
|
</div></div>
|
||||||
<div class="paragraph"><p>In practise, the performance is close to the one for the multistage
|
<div class="paragraph"><p>In practise, the performance is close to the one for the multistage
|
||||||
version.</p></div>
|
version.</p></div>
|
||||||
|
@ -1267,12 +1261,12 @@ was over.</p></div>
|
||||||
<div class="sect2">
|
<div class="sect2">
|
||||||
<h3 id="_fork_performance_issues">Fork performance issues</h3>
|
<h3 id="_fork_performance_issues">Fork performance issues</h3>
|
||||||
<div class="paragraph"><p>On a quite unrelated note, something that I discovered while evaluating the
|
<div class="paragraph"><p>On a quite unrelated note, something that I discovered while evaluating the
|
||||||
program performance is that forking a big process like <tt>recollindex</tt> can be
|
program performance is that forking a big process like <code>recollindex</code> can be
|
||||||
quite expensive. Even if the memory space of the forked process is not
|
quite expensive. Even if the memory space of the forked process is not
|
||||||
copied (it’s Copy On Write, and we write very little before the following
|
copied (it’s Copy On Write, and we write very little before the following
|
||||||
exec), just duplicating the memory maps can be slow when the process uses a
|
exec), just duplicating the memory maps can be slow when the process uses a
|
||||||
few hundred megabytes.</p></div>
|
few hundred megabytes.</p></div>
|
||||||
<div class="paragraph"><p>I modified the single-threaded version of <tt>recollindex</tt> to use <strong>vfork</strong>
|
<div class="paragraph"><p>I modified the single-threaded version of <code>recollindex</code> to use <strong>vfork</strong>
|
||||||
instead of <strong>fork</strong>, but this can’t be used with multiple threads (no
|
instead of <strong>fork</strong>, but this can’t be used with multiple threads (no
|
||||||
modification of the process memory space is allowed in the child between
|
modification of the process memory space is allowed in the child between
|
||||||
<strong>vfork</strong> and <strong>exec</strong>, so we’d have to have a way to suspend all the threads
|
<strong>vfork</strong> and <strong>exec</strong>, so we’d have to have a way to suspend all the threads
|
||||||
|
@ -1289,7 +1283,7 @@ the executing of ephemeral external commands.</p></div>
|
||||||
<div id="footnotes"><hr /></div>
|
<div id="footnotes"><hr /></div>
|
||||||
<div id="footer">
|
<div id="footer">
|
||||||
<div id="footer-text">
|
<div id="footer-text">
|
||||||
Last updated 2012-12-14 15:55:12 CET
|
Last updated 2016-05-08 08:30:29 CEST
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</body>
|
</body>
|
||||||
|
|
|
@ -279,12 +279,10 @@ unfloat::[]
|
||||||
|
|
||||||
So the big question after all the work: was it worth it ? I could only get
|
So the big question after all the work: was it worth it ? I could only get
|
||||||
a real answer when the program stopped crashing, so this took some time and
|
a real answer when the program stopped crashing, so this took some time and
|
||||||
a little faith...
|
a little faith, but the answer is positive, as far as I'm
|
||||||
|
concerned. Performance has improved significantly and this was a fun
|
||||||
|
project.
|
||||||
|
|
||||||
The answer is mostly yes, as far as I'm concerned. Indexing tests running
|
|
||||||
almost twice as fast are good for my blood pressure and I don't need a
|
|
||||||
faster PC, I'll buy more red wine instead (good for my health too, or maybe
|
|
||||||
not). And it was a fun project anyway.
|
|
||||||
|
|
||||||
.Results on a variety of file system areas:
|
.Results on a variety of file system areas:
|
||||||
[options="header", width="70%"]
|
[options="header", width="70%"]
|
||||||
|
|