for perl wizards.
Oliver Fromme
olli at lurza.secnetix.de
Fri Oct 9 18:29:45 UTC 2009
Warren Block wrote:
> Oliver Fromme wrote:
> > Warren Block wrote:
> > > Oliver Fromme wrote:
> > > > Gary Kline wrote:
> > > > >
> > > > > Whenever I save a word processor file [OOo, say] into a
> > > > > text file, I get a slew of hex codes to indicate the char to be
> > > > > used. I'm looking for a perl one-liner or script to translate
> > > > > hex back into ', ", -- [that's a dash], and so forth. Why does
> > > > > this fail to translate the hex code to an apostrophe?
> > > > >
> > > > > perl -pi.bak -e 's/\xe2\x80\x99/'/g'
> > > >
> > > > You need to escape the inner quote character, of course.
> > > > I think sed is better suited for this task than perl.
> > >
> > > That's twice now people have suggested sed instead of perl. Why? For
> > > many uses, perl is a better sed than sed. The regex engine is far more
> > > powerful and escapes are much simpler.
> >
> > Neither powerful regexes nor escapes will help in this case.
>
> Certainly \x will not help in sed; sed doesn't have it.
Right, that's an annoying flaw in sed (it doesn't even
support the \0 syntax for octal values, which is more
standard than \x).
Normally I just type such characters literally, which
is accepted fine by sed (it is 8 bit clean).
However, in this particular case I really recommend using
the "recode" tool (ports/converters/recode) to convert
from UTF-8 to some other encoding. Much easier, and more
correct.
E2-80-99 (the UTF-8 encoding of Unicode U+2019) isn't even a
real apostrophe, it's a right single quotation mark. An
apostrophe would be ASCII 0x27 (hex).
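This is easy to check. The following is only a minimal Python
sketch, not something from the original thread; the variable
names are made up for illustration. It decodes the three bytes
from the question and prints what they actually are, next to
the code of the plain ASCII apostrophe.

```python
import unicodedata

# The three bytes from the original question.
raw = bytes([0xE2, 0x80, 0x99])

# Decoding them as UTF-8 yields a single character.
ch = raw.decode("utf-8")

print(len(ch))               # number of characters after decoding
print(hex(ord(ch)))          # code point of that character
print(unicodedata.name(ch))  # official Unicode character name
print(hex(ord("'")))         # code of the plain ASCII apostrophe
```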
Maybe the OP should configure his software to not save the
file with UTF-8 encoding in the first place. I'm not an
OOo user, so I can't tell how to do that. But obviously
the OP doesn't want the file to be stored as UTF-8.
> It's possible "Mastering Regular Expressions" has influenced my thinking
> on this.
This isn't about regular expressions at all. This is
about replacing fixed strings.
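To illustrate that no regex engine is needed, here is a rough
Python sketch of plain fixed-string replacement on raw bytes.
It is an assumption about what the OP wants, not a tested
replacement for recode; the table and the name asciify() are
hypothetical and the list of byte sequences is not exhaustive.

```python
# Fixed UTF-8 byte sequences mapped to plain ASCII substitutes.
# Illustrative table only; extend it as needed.
REPLACEMENTS = {
    b"\xe2\x80\x99": b"'",   # U+2019 right single quotation mark
    b"\xe2\x80\x98": b"'",   # U+2018 left single quotation mark
    b"\xe2\x80\x9c": b'"',   # U+201C left double quotation mark
    b"\xe2\x80\x9d": b'"',   # U+201D right double quotation mark
    b"\xe2\x80\x94": b"--",  # U+2014 em dash
}

def asciify(data: bytes) -> bytes:
    """Replace each known byte sequence with its ASCII substitute."""
    for old, new in REPLACEMENTS.items():
        data = data.replace(old, new)  # plain substring replace, no regex
    return data

# Small self-check with a sample string encoded as UTF-8.
sample = "It\u2019s \u201cfine\u201d \u2014 really".encode("utf-8")
print(asciify(sample).decode("ascii"))
```

Bytes that are not in the table pass through unchanged, so any
remaining non-ASCII characters would still have to be handled
separately.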
Best regards
Oliver
--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
"One of the main causes of the fall of the Roman Empire was that,
lacking zero, they had no way to indicate successful termination
of their C programs."
-- Robert Firth