115 lines
2.6 KiB
Groff
115 lines
2.6 KiB
Groff
'''
|
||
''' Copyright (C) 2000, 2001, 2002 Loic Dachary <loic@senga.org>
|
||
'''
|
||
''' This program is free software; you can redistribute it and/or modify
|
||
''' it under the terms of the GNU General Public License as published by
|
||
''' the Free Software Foundation; either version 2 of the License, or
|
||
''' (at your option) any later version.
|
||
'''
|
||
''' This program is distributed in the hope that it will be useful,
|
||
''' but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||
''' MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||
''' GNU General Public License for more details.
|
||
'''
|
||
''' You should have received a copy of the GNU General Public License
|
||
''' along with this program; if not, write to the Free Software
|
||
''' Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||
'''
|
||
.TH unaccent 1 local
|
||
.SH NAME
|
||
unaccent \- remove accents from input stream or a string
|
||
|
||
.SH SYNOPSIS
|
||
unaccent [--debug_low] [--debug_high] [-h] charset [string] [expected]
|
||
.SH DESCRIPTION
|
||
With a single argument,
|
||
.I unaccent
|
||
reads data from stdin, replaces accented letters by their unaccented
|
||
equivalent and writes the result on stdout.
|
||
If the second argument
|
||
.B ('string')
|
||
is provided
|
||
.I unaccent
|
||
transforms it by replacing accented letters by their unaccented
|
||
equivalent. The
|
||
result is printed on the standard output.
|
||
The charset of the input
|
||
string or the data read from stdin is specified by the
|
||
.B 'charset'
|
||
argument (ISO-8859-15 for instance). The output is printed using the same charset.
|
||
.P
|
||
If the
|
||
.B 'expected'
|
||
argument is provided, the output string is compared to it. If they
|
||
are not equal
|
||
.I unaccent
|
||
exits on error.
|
||
.P
|
||
.I unaccent
|
||
relies on the
|
||
.B iconv(3)
|
||
library to convert from the specified charset to UTF-16BE (or UTF-16
|
||
if UTF-16BE is not available). You should check the manual pages for
|
||
available charsets. On GNU/Linux the command
|
||
|
||
.nf
|
||
.ft CW
|
||
iconv -l
|
||
.ft R
|
||
.fi
|
||
|
||
shows all available charsets.
|
||
|
||
.SH OPTIONS
|
||
.TP
|
||
.B --debug_low
|
||
Prints human readable information about the unaccentuation process. See
|
||
.B unac(3)
|
||
for more information.
|
||
|
||
.TP
|
||
.B --debug_high
|
||
Prints very detailed information about the unaccentuation process.
|
||
See
|
||
.B unac(3)
|
||
for more information.
|
||
|
||
.TP
|
||
.B --help -h
|
||
Prints a short usage and exits.
|
||
|
||
.SH EXAMPLES
|
||
Remove accents from the string
|
||
.B <EFBFBD>t<EFBFBD>
|
||
and check that the result is
|
||
.B ete.
|
||
|
||
.nf
|
||
.ft CW
|
||
unaccent ISO-8859-1 <20>t<EFBFBD> ete
|
||
.ft R
|
||
.fi
|
||
|
||
.P
|
||
Remove accents from file
|
||
.B myfile
|
||
and put the result in file
|
||
.B myfile.unaccent
|
||
|
||
.nf
|
||
.ft CW
|
||
unaccent ISO-8859-1 < myfile > myfile.unaccent
|
||
.ft R
|
||
.fi
|
||
|
||
.SH SEE ALSO
|
||
unac(3), iconv(3)
|
||
|
||
.SH AUTHOR
|
||
Loic Dachary loic@senga.org
|
||
.nf
|
||
.ft CW
|
||
http://www.senga.org/unac/
|
||
.ft R
|
||
.fi
|
||
|