Re: C.4 Undeclared entities?
Henry Thompson wrote:
> OK, I'll bite. David is merely the last in a moderately long list
> (i.e. at least three people :-) who have asserted without any argument
> that "users won't include a <!DOCTYPE ...>, so we shouldn't require
> one for well-formedness." I have to say I just don't get it -- why
> ever not? They're going to have to do a lot of other, more
> substantial, things differently from what they are used to, if they
> are hope-to-die HTML mavens, who are the only group I can suppose
> David et al. have in mind.
I won't speak for David, but I for one would like to use XML without a DTD. I can
imagine doing lots of interesting (perhaps impromptu) work that would be
difficult if a DTD were required. The proposition that only valid XML documents
be interchanged precludes this type of work.
Requiring a "dummy" DOCTYPE declaration raises the question - Why? I doubt that
8879 conformance will fly as an answer for all XML documents. Let 8879
conformance apply to *valid* XML documents and have somewhat more relaxed rules
for well-formed and other XML documents. If an XML document contains a DOCTYPE,
the receiving application would be expected to locate, obtain, and use the
referenced DTD. Without the DOCTYPE, a generic XML application should assume
well-formed input. Specific applications might assume something less than
well-formed. What's wrong with this picture?
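To make the picture concrete, here is a minimal sketch of the dispatch I have in mind, written in Python against the expat parser. The function name classify and the three result strings are my own inventions for illustration, not anything from the draft spec, and the sketch only decides what the application should do next; it does not fetch or apply a DTD.

```python
import xml.parsers.expat


def classify(document: str) -> str:
    """Decide how a generic application should treat a document.

    Hypothetical helper: the return strings are illustrative labels only.
    """
    seen = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat only when a <!DOCTYPE ...> declaration is present.
        seen["doctype"] = (name, system_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        # Mismatched tags, unquoted attribute values, and so on.
        return "not well-formed"

    if "doctype" in seen:
        # A DOCTYPE is a claim of validity: locate, obtain, and use the DTD.
        return "validate against DTD"
    # No DOCTYPE: assume nothing beyond well-formedness.
    return "well-formed only"
```

Under this sketch a document with no DOCTYPE is never an error in itself; it is simply processed under the more relaxed rules.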
Finally, I'll take a perhaps unpopular position and state that I doubt we have
written the last chapter on document structure. I doubt we ever will. If we
leave XML open-ended, someone might invent something that goes well beyond what
is possible with a DTD while still remaining, even if tenuously, within the XML
framework. To me, that is a very interesting possibility and I see no reason to
discourage such activity. Requiring DTDs, dummy or otherwise, effectively limits
XML's potential by stating that DTDs are the only mechanism by which document
structure can be specified.
> After all, both SGML fans and total
> newbies won't have any problem with following this rule. Why is it
> likely that HTML fans, who after all have at least HEARD of
> <!DOCTYPE ...>, will ignore this requirement but not, say, the
> requirement to provide explicit end tags? Or the requirement to quote
> all attribute values? Seems modest by comparison, and a small price
> to pay for SGML compatibility.
While some HTML fans have heard of <!DOCTYPE ...>, I suspect that only a small
percentage of HTML documents actually contain a DOCTYPE declaration. I've done a
statistically insignificant, not even close to random, survey of some web sites
to see who uses <!DOCTYPE ...> on their home page. Here are the results:
Site        DOCTYPE?    Site         DOCTYPE?
Sun         no          Spyglass     yes
JavaSoft    no          Ebt          yes
Netscape    no          Softquad     yes
Microsoft   no          Textuality   yes
Ncsa        no          Passage      yes
Adobe       no          Arbortext    yes
Excite      no          W3c          yes
Yahoo       no          Gca          yes
Lycos       no          Isogen       yes
Verity      no          Fulcrum      yes
I'll leave a detailed analysis of the pseudo-results to the reader, but at
the highest level, the SGML literati use DOCTYPE and others don't.
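For anyone who wants to repeat the survey on pages of their own choosing, the check amounts to something like the following Python sketch. The function name has_doctype and the sample strings are hypothetical, and the test is deliberately crude: it only asks whether the page text starts with a DOCTYPE declaration, allowing leading whitespace and comments, and says nothing about whether the declaration is correct.

```python
import re

# Leading whitespace and comments are allowed before the declaration.
DOCTYPE_RE = re.compile(
    r"\s*(?:<!--.*?-->\s*)*<!DOCTYPE\b",
    re.IGNORECASE | re.DOTALL,
)


def has_doctype(page: str) -> bool:
    """Return True if the page text opens with a <!DOCTYPE ...> declaration."""
    return DOCTYPE_RE.match(page) is not None


# Hypothetical sample pages, not taken from the sites listed above.
samples = {
    "with-doctype": '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">\n<html>',
    "without-doctype": "<html><head><title>Home</title></head>",
}
results = {name: has_doctype(text) for name, text in samples.items()}
```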
Explicit end tags will be forgotten or omitted. Attribute values won't be
quoted. Countless "errors" will be found in XML documents, just as they are in
HTML and SGML documents. In at least some applications, these errors will
manifest themselves in obvious ways and authors will take corrective action.
However, the absence of a dummy DOCTYPE in a well-formed XML document probably
won't be obvious in many applications. Why? I doubt that most developers will
put in code to check for a condition that, if encountered, can cause no harm.
In fact, they would normally do something quite different: eliminate the
condition. We can provide that service to all XML developers simply by stating
that DOCTYPE is required only for XML documents that purport to be valid.
Requiring that all XML documents carry <!DOCTYPE foo SYSTEM> in the name of 8879
conformance seems quite the hack to me. I and others have argued that it will be
ignored by XML users. I doubt that existing SGML systems will be able to do much
more than report an error upon encountering <!DOCTYPE foo SYSTEM>. Of course, the
void's entity manager could locate foo and return it, but as I've stated before, I
have doubts about the void.
So what is the practical purpose of requiring a DOCTYPE declaration in