[PATCH] docproj port needs to use tidy-devel
Gábor Kövesdán
gabor at FreeBSD.org
Sat Jan 26 22:44:44 UTC 2008
Murray Stokely wrote:
> Is there any reason not to update the docproj port to use tidy-devel rather
> than tidy? The released version of tidy is nearly 8 years old and produces
> xhtml that doesn't validate. The newer -devel releases produce more correct
> xhtml.
>
First, sorry for the late answer. Not just the XHTML but also the HTML
output of tidy is incorrect; it does not validate. (I think www/63552 is
related, because without tidy such errors don't appear.)
But the newer tidy versions completely mess up character sets. They
certainly break the Hungarian character set, and I suspect there are
others, too. The only reason we don't disable tidy in the Hungarian
project is that the builder has an ancient version, which works fine.
Besides, different versions of tidy have different sets of command-line
options, which makes our toolchain less portable.
But anyway, why do we really need tidy? I ran some tests without tidy
before, and the only thing I had to do to generate valid pages was to
edit the DTD in place. As sgmlnorm outputs our custom DTD, the web pages
were not valid, but after replacing it with the HTML 4.01 Transitional
DTD, everything validated. I'd prefer to see tidy go away.
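The in-place DTD edit described above could be sketched roughly as
follows. This is only an illustration, not the actual build rule: the
function name replace_doctype, the sample page, and its "EXAMPLE" DTD
identifier are hypothetical, and the sketch assumes the target is the
standard HTML 4.01 Transitional doctype and that the declaration has no
internal subset.

```python
import re

# Assumption: the replacement is the standard HTML 4.01 Transitional doctype.
HTML401_TRANSITIONAL = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Matches a DOCTYPE declaration that has no internal subset.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\b[^>]*>', re.IGNORECASE)


def replace_doctype(page: str) -> str:
    """Swap the first DOCTYPE declaration for HTML 4.01 Transitional."""
    return DOCTYPE_RE.sub(HTML401_TRANSITIONAL, page, count=1)


# Hypothetical normalizer output carrying a made-up custom DTD identifier.
sample = (
    '<!DOCTYPE HTML PUBLIC "-//EXAMPLE//DTD Custom HTML//EN">\n'
    '<HTML><HEAD><TITLE>t</TITLE></HEAD><BODY></BODY></HTML>'
)

print(replace_doctype(sample))
```

Only the declaration is touched; the rest of the page passes through
unchanged, so indenting and line breaking would still have to be handled
separately.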
Yes, I know that one reason for tidy is the indenting and line breaking
in the HTML code; the output of sgmlnorm is not meant for human
consumption. But can't we do that in a simpler way?
One more idea that came to mind. Currently, our web pages are not
uniform. We use HTML 4.01 for pages generated from .sgml and XHTML 1.1
for .xsl output. What do you think about using XHTML 1.1 uniformly?
Obviously, sgmlnorm cannot do that, but there are advantages to using
XML-based technologies. Well, I'm just an enthusiastic newbie with XML,
but I think it would make data sharing between our pages easier. Plus,
we could simplify our infrastructure, as we would only need the XML
tools for building web pages and one DTD, with no more conditional cases
in .ent files, like this one in header.ent:
<![ %xml.features; [
<!ENTITY header1.meta '
<meta http-equiv="Content-Type" content="text/html; charset=&xml.encoding;" />
<meta name="MSSmartTagsPreventParsing" content="TRUE" />
'>
]]>
<!ENTITY header1.meta '
<meta http-equiv="Content-Type" content="text/html; charset=&xml.encoding;">
<meta name="MSSmartTagsPreventParsing" content="TRUE">
'>
Also, XHTML is easier to validate and stricter, yet no more difficult to
edit. It is also supposed to make HTML obsolete (although with the HTML5
draft that is no longer so certain, but this has nothing to do with the
topic or with XHTML's advantages), and it is a newer standard to conform
to.
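As a small illustration of the strictness point, any generic XML parser
can check XHTML-style markup for well-formedness, while the unclosed HTML
form of the same meta tag is rejected. This is a minimal sketch using
Python's standard xml.etree.ElementTree; the helper name is_well_formed
and the two sample fragments are made up for the example, and
well-formedness is of course weaker than full validation against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# XHTML style: the empty element is explicitly closed with "/>".
xhtml_head = (
    '<head><meta name="MSSmartTagsPreventParsing" content="TRUE" /></head>'
)

# HTML style: the meta element is left unclosed, which is not valid XML.
html_head = (
    '<head><meta name="MSSmartTagsPreventParsing" content="TRUE"></head>'
)

print(is_well_formed(xhtml_head))
print(is_well_formed(html_head))
```

The first fragment parses and the second fails with a mismatched-tag
error, which is the kind of mistake an XML-only toolchain would catch
early.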
So I think it would be a good idea. Maybe it would be a good SoC project
for me to polish the pages in this way, as I'm interested, I want to
learn more about XML, and I want to participate in the upcoming SoC
again. Another item would be to bring the doc repo to DocBook 5 / XML.
If all of this XML stuff has been discussed before, please forgive me; I
missed it.
Regards,
--
Gabor Kovesdan
EMAIL: gabor at FreeBSD.org
WWW: http://www.kovesdan.org