Reviewing sgml-lib

I've taken a first look at this by going through sgml.soc and xml.soc
and checking for consistency and omissions.  The next stage is to
regenerate everything from the original sources and diff it against
what we have.

Question: what's the rationale for the directory naming here?
e.g. "REC-html401-19991224" rather than "html401" or just "html" ?
This is surely a working SGML catalogue, not a historical record!



sgml.soc
========

Serious-looking:

HTML 3.0 is missing - is that intentional?
"-//IETF//DTD HTML 3.0//EN"

Missing declaration:
"-//W3C//ENTITIES Latin1//EN//HTML"
This one is referenced in both HTML40 and HTML401, so its absence
looks rather serious.  The entity set exists under other aliases, but
this is the identifier the DTDs actually use.
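This kind of omission can be checked mechanically.  What follows is only
a rough sketch, not how the catalogue is actually maintained: it assumes
every relevant identifier appears as PUBLIC "..." in double quotes, and
the function names (normalise, public_ids, undeclared) are made up for
illustration.

```python
import re

# Assumption: public identifiers are always written as PUBLIC "..." with
# double quotes, both in the DTDs and in the .soc catalogue files.
PUBLIC_ID = re.compile(r'PUBLIC\s+"([^"]+)"')


def normalise(pubid):
    """Collapse runs of whitespace, as is done when public identifiers are compared."""
    return " ".join(pubid.split())


def public_ids(text):
    """Return the set of public identifiers found in a DTD or catalogue text."""
    return {normalise(match) for match in PUBLIC_ID.findall(text)}


def undeclared(dtd_text, catalog_text):
    """Public identifiers that the DTD references but the catalogue does not declare."""
    return sorted(public_ids(dtd_text) - public_ids(catalog_text))
```

Running undeclared() over each DTD in turn against sgml.soc would list
identifiers such as the one above.  It does not skip catalogue comments,
so a commented-out entry would wrongly count as declared.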


Trivial:

"ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML"
"ISO 8879-1986//ENTITIES Added Latin 1//EN"
are in directory IETF, not ISO.  Is that intentional?

(I have no comparison there - Valet puts them all in directory html)


The following aliases for the HTML 2 DTD exist in Valet but not in the
W3C catalogue.  Are they all considered dead?

"-//IETF//DTD HTML Level 1//EN"
"-//IETF//DTD HTML Strict Level 1//EN"
"-//IETF//DTD HTML Strict//EN"
"-//IETF//DTD HTML i18n//EN"


Entities:
"-//W3C//ENTITIES Full Latin 1//EN//HTML"
"-//W3C//ENTITIES Symbolic//EN//HTML"
are listed under a comment that says ISO-HTML.


xml.soc
=======

Hmmm, we have (only) flat versions of XHTML11, XHTML Basic, SVG11 and
MathML20.  Is this really sufficient, or should I propose the modular
versions of those four?

We're also missing
"-//W3C//DTD XHTML Architecture 1.1//EN"



INCONSISTENCIES
===============

There are three directories not referenced anywhere (duplicates):
REC-html40-971218
REC-MathML2-20010221
REC-xhtml1-20000126

There is also one directory referenced in xml.soc that does not exist:
PR-smil20-20010605
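
Both kinds of inconsistency can be found by comparing the directories
named in the catalogues with the directories actually present.  A
minimal sketch, assuming each entry sits on one line in the form
PUBLIC "pubid" "dir/file" and that the directory is the first path
component of the system identifier; referenced_dirs and compare are
hypothetical helper names, not part of any existing tool.

```python
import re

# Assumption: entries look like  PUBLIC "pubid" "dir/file"  on a single line.
# Lines that start with a catalogue comment (--) are not matched.
ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]+)"\s+"?([^"\s]+)"?', re.MULTILINE)


def referenced_dirs(catalog_text):
    """Top-level directories named in the system identifiers of PUBLIC entries."""
    dirs = set()
    for _pubid, sysid in ENTRY.findall(catalog_text):
        if "/" in sysid:
            dirs.add(sysid.split("/", 1)[0])
    return dirs


def compare(catalog_texts, dirs_on_disk):
    """Return (unreferenced directories, referenced but missing directories)."""
    referenced = set()
    for text in catalog_texts:
        referenced |= referenced_dirs(text)
    on_disk = set(dirs_on_disk)
    return sorted(on_disk - referenced), sorted(referenced - on_disk)
```

Feeding it the contents of sgml.soc and xml.soc plus the directory
listing (for instance from os.listdir) should reproduce the two lists
above, provided the catalogues follow the assumed layout.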


-- 
Nick Kew

Received on Monday, 18 October 2004 17:54:03 UTC