Validating docbook articles...
Chuck Swiger
cswiger at mac.com
Mon Feb 23 19:26:32 UTC 2004
Dag-Erling Smørgrav wrote:
> Alex Dupre <ale at FreeBSD.org> writes:
>> [ ...talking about -preserve in tidy... ]
> This reminds me of the many good reasons to convert the doc tree to
> XML. One of these is that xmllint can both validate input files and
> clean up output files, and it does a far better job of it than tidy.
An interesting idea. I took a quick look at converting an existing SGML
document into XML in order to gain some idea as to the work involved.
Given an SGML prologue of:
<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
%man;
<!ENTITY % freebsd PUBLIC "-//FreeBSD//ENTITIES DocBook Miscellaneous FreeBSD Entities//EN">
%freebsd;
<!ENTITY % trademarks PUBLIC "-//FreeBSD//ENTITIES DocBook Trademark Entities//EN">
%trademarks;
]>
...from doc/en_US.ISO8859-1/articles/filtering-bridges (written by ale@, of
course :-), it's easy to add an XML declaration at the top; this could be
done automatically, and "make lint" works just fine with an XML declaration
in place. So far, so good.
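For what it's worth, the "add the declaration automatically" step could be as small as the Python sketch below. The function name is made up for illustration, and the default encoding is only my guess from the en_US.ISO8859-1 directory name:

```python
def add_xml_declaration(text, encoding="iso-8859-1"):
    """Prepend an XML declaration unless the document already has one.

    The default encoding is an assumption based on the directory name,
    not something read from the document itself.
    """
    if text.lstrip().startswith("<?xml"):
        return text  # already declared, leave the file alone
    return '<?xml version="1.0" encoding="%s"?>\n' % encoding + text
```

Running it a second time on its own output changes nothing, so it should be safe to apply across a whole tree.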
The next step is less obvious. How does one generate the SystemLiterals that
the XML specification requires after every public identifier, per:
|4.2.2 External Entities
|
|[Definition: If the entity is not internal, it is an external entity,
|declared as follows:]
|
|External Entity Declaration
|
|[75] ExternalID ::= 'SYSTEM' S SystemLiteral
|                  | 'PUBLIC' S PubidLiteral S SystemLiteral
69-sec% xmllint article.sgml
article.sgml:3: parser error : SystemLiteral " or ' expected
<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
^
article.sgml:3: parser error : SYSTEM or PUBLIC, the URI is missing
<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
^
article.sgml:4: parser error : Space required after the Public Identifier
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:4: parser error : SystemLiteral " or ' expected
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:4: parser error : SYSTEM or PUBLIC, the URI is missing
<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN">
^
article.sgml:5: parser warning : PEReference: %man; not found
%man;
^
[ ... ]
Are these entities published via a URI, or does one need to refer to a local
path? Is there a tool to update (normalize?) these ENTITY declarations
automatically? Using "xmllint --catalogs --loaddtd" didn't seem to help.
Maybe this seems trivial, but there are several hundred SGML source files
which would all need to be updated this way...
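Failing an existing tool, a small script might be enough. Below is a rough Python sketch, assuming someone can supply a table that maps each public identifier to a system identifier. The file names in SYSTEM_IDS are placeholders I made up, not real locations, and the function name is likewise only for illustration:

```python
import re

# Hypothetical mapping from public identifier to system identifier.
# The values are placeholders; the real paths or URIs are exactly
# what I am asking about.
SYSTEM_IDS = {
    "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN":
        "PLACEHOLDER/man.ent",
    "-//FreeBSD//ENTITIES DocBook Trademark Entities//EN":
        "PLACEHOLDER/trademarks.ent",
}

# A PUBLIC identifier followed directly by '>' or '[', i.e. one that
# has no SystemLiteral yet.  The identifier may be wrapped across lines.
PUBLIC_WITHOUT_SYSTEM = re.compile(r'PUBLIC\s+"([^"]*)"(?=\s*[>\[])')


def add_system_literals(text, system_ids):
    """Append a SystemLiteral to each PUBLIC identifier found in system_ids."""
    def fix(match):
        # Collapse line wraps and repeated spaces inside the identifier.
        pubid = " ".join(match.group(1).split())
        sysid = system_ids.get(pubid)
        if sysid is None:
            return match.group(0)  # unknown identifier: leave untouched
        return 'PUBLIC "%s" "%s"' % (pubid, sysid)
    return PUBLIC_WITHOUT_SYSTEM.sub(fix, text)
```

Declarations that already carry a SystemLiteral are skipped, and identifiers missing from the table are left as they are, so it could be run repeatedly while the table is filled in.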
--
-Chuck