W3C home > Mailing lists > Public > w3c-sgml-wg@w3.org > October 1996

Re: C.4 Undeclared entities?

From: Bill Smith <bill.smith@Eng.Sun.COM>
Date: Mon, 28 Oct 1996 17:25:17 -0800 (PST)
To: w3c-sgml-wg@w3.org
Message-ID: <libSDtMail.9610281725.13944.bsmith@providence>
Charles Goldfarb wrote:

> If we took your point literally, Bill, it would require an XML processor to
> accept any character string and make sense of it. There has to be some
> definition of "input that an XML processor is expected to handle".

In my opinion, a good XML processor would do just that. Other XML processors 
might choose some behavior just short of aborting. This is considerably less 
desirable in my opinion. But these decisions should be left to application 
developers and the market.

I suspect that the XML standard, when written, will specify what an XML 
processor is *required* to handle. If we state that exchange of XML documents is 
limited to those that are valid in ISO 8879 terms, we are effectively stating 
that XML processors are required to handle only valid documents. We can and 
should be more flexible than that.

> Tim has made the distinction between "valid", meaning "conforming to an
> explicit
> DTD", and "well-formed", meaning "processable without reference to an explicit
> DTD". An interchanged XML document has to be one or the other, not just an
> arbitrary string.

Tim made an excellent distinction and it has been most helpful during our 
discussions. However, I've taken Tim's "well-formed" notion to mean without 
reference to *any* DTD - implied or otherwise. Of course, Tim may have meant 
something else but this was my interpretation.

If we insist on a classification system for XML document interchange, we had 
better include a category for those that are neither valid, nor well-formed. 
They will surely exist, they will be interchanged, and we should have a name for 
them. I suggest that we label all of these documents as XML. A subset will be 
well-formed. A further subset will be valid. All can be interchanged.

Some XML applications might accept only valid documents. Others might accept 
only well-formed documents. Yet others might accept any document purporting to 
be XML - an arbitrary string. Some of these applications will be successful, 
others will not. Let the market decide.

> For "valid" documents, there would be a DOCTYPE declaration with an explicit
> external and/or internal subset. Such a document would clearly conform to SGML
> as well.


> As Eliot has pointed out, merely "well-formed" XML documents could announce
> themselves and also be SGML conforming if they were introduced by the
> following
> declaration:
> <!DOCTYPE doctypename SYSTEM>

I haven't decided on this but at present I don't like it. The conformance 
argument alone isn't enough to sway me. I also don't see how we distinguish 
between an XML document that requires a DTD and one that doesn't - if both are 
required to have a DTD declaration.
Received on Monday, 28 October 1996 20:25:34 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 20:25:04 UTC