converting UTF-8 to HTML
Matthias Apitz
guru at unixarea.de
Sat Apr 21 08:13:01 UTC 2012
On Saturday, April 21, 2012 at 07:34:44AM +0100, Matthew Seaman wrote:
> www/tidy-devel
>
> (which is effectively a fork of the original www/tidy project, and has
> quite a lot of new functionality)
>
> If you specify 'ascii' for the output format, it should generate
> appropriate character escapes.
Thanks; it works fine if one specifies utf8 as the input encoding and
ascii as the output encoding in a config file, here named .tidy:
$ cat .tidy
output-xhtml: yes
add-xml-decl: no
doctype: strict
input-encoding: utf8
output-encoding: ascii
indent: auto
wrap: 76
repeated-attributes: keep-last
error-file: errs.txt
Then you can run tidy with that config and get valid, pure-ASCII HTML
with the non-ASCII characters escaped, for example:
$ echo 'ΜΙΣΟ ΛΙΤΡΟ ΑΘΩΣ ΚΟΚΚΙΝΟ ΠΑΡΑΚΑΛΩ' | tidy -config .tidy
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for FreeBSD (vers 7 December 2008), see www.w3.org" />
<title></title>
</head>
<body>
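&Mu;&Iota;&Sigma;&Omicron; &Lambda;&Iota;&Tau;&Rho;&Omicron;
&Alpha;&Theta;&Omega;&Sigma;
&Kappa;&Omicron;&Kappa;&Kappa;&Iota;&Nu;&Omicron;
&Pi;&Alpha;&Rho;&Alpha;&Kappa;&Alpha;&Lambda;&Omega;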
</body>
</html>
This is exactly what I was looking for. Thanks
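
For anyone who wants to cross-check the escaping without tidy, here is a
minimal Python sketch of the same idea: every character above 127 is
replaced by a character reference, so the result is plain ASCII. It is
not how tidy is implemented. The function name to_ascii_html is made up
for this example, and the code does not parse, repair, or re-wrap any
markup; characters below 128 (including '<' and '&') pass through
unchanged. It assumes only the standard library table
html.entities.codepoint2name.

```python
from html.entities import codepoint2name


def to_ascii_html(text):
    """Return text with every non-ASCII character replaced by an HTML
    character reference (hypothetical helper, not part of tidy)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII is copied through untouched.
            out.append(ch)
        elif cp in codepoint2name:
            # Use the HTML 4 named entity when one exists, e.g. &Mu;.
            out.append("&%s;" % codepoint2name[cp])
        else:
            # Otherwise fall back to a decimal numeric reference.
            out.append("&#%d;" % cp)
    return "".join(out)


print(to_ascii_html("ΜΙΣΟ ΛΙΤΡΟ"))
# prints: &Mu;&Iota;&Sigma;&Omicron; &Lambda;&Iota;&Tau;&Rho;&Omicron;
```

Named entities are preferred here only because they are easier to read
in the source; using the numeric fallback for everything would be just
as valid.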
matthias
--
Matthias Apitz
t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211
e <guru at unixarea.de> - w http://www.unixarea.de/
UNIX since V7 on PDP-11 | UNIX on mainframe since ESER 1055 (IBM /370)
UNIX on x86 since SVR4.2 UnixWare 2.1.2 | FreeBSD since 2.2.5
More information about the freebsd-questions mailing list