Re: Moving on (was Re: URIs quack like a duck)

At 12:22 PM 6/2/00 -0400, John Cowan wrote:
>"Simon St.Laurent" wrote:
>> Perhaps the Infoset should define what level of XML processing it
>> represents, and provide enough information (like base URI of elements
>> pulled from external entities) for layers above that to do their own work.
>
>It excludes documents that use ":" in names in ways that violate Namespaces.
>So a document like "<FOO::bar>this is bogus</FOO::bar>",
>though it is well-formed XML 1.0, has no Infoset.

As I've noted with my examples of inclusion and of defaulting via the DTD,
I'm not sure that's sufficient.  (Unless of course relative URIs are barred...)
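
To make John's <FOO::bar> case concrete, here is a rough Python sketch.  It
assumes the stock expat binding with namespace processing left off, and the
helper names (element_names, is_namespace_conformant) are mine, purely for
illustration.  The colon check is only an approximation of the Namespaces
rules: it ignores, for instance, whether a prefix is actually declared.

```python
import xml.parsers.expat


def element_names(xml_text):
    """Parse as plain XML 1.0 (no namespace processing) and collect element names."""
    names = []
    # No namespace_separator argument, so expat treats ':' as an ordinary name character.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


def is_namespace_conformant(name):
    """Approximate check: at most one colon, and not at either end of the name."""
    if name.count(":") > 1:
        return False
    return not (name.startswith(":") or name.endswith(":"))


doc = "<FOO::bar>this is bogus</FOO::bar>"
for name in element_names(doc):
    # The document parses, but the name fails the namespace check.
    print(name, is_namespace_conformant(name))
```

The parse succeeds, so the document is well-formed; it is only the second
check that puts it outside the Infoset.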

>> Right now, it appears to represent post-DTD parser output with some level
>> of namespace awareness.  It doesn't specify much about processing, however,
>> seeming to think that its representation of document structures is somehow
>> independent of such processing.
>
>We had earlier drafts that knew all about processing, and determined that
>the model was untenable.  Now we have only one Infoset per document,
>representing all its parts, and allow Infoset-compliant programs to
>return any subset of the Infoset they want as long as the subset is
>well documented.

It's a nice Platonic vision of a document, but I have to admit it doesn't
feel (to me) like it has very close contact with the reality of 'what is an
XML document?'

'as long as the subset is well documented'?  What does that mean?  The XML
1.0 exemptions for non-validating parsers are documented in the Last Call
(12/20/99) draft, but I don't see any general exemption.

>> (And non-validating parsers and validating parsers seem capable of
>> legitimately returning different Infoset information from the same document
>> - something acknowledged in 2.5, but which feels pretty odd otherwise.)
>
>They can't return *conflicting* information.  In any case they must
>return a subset of *the* infoset for the document.  (The exception is
>reference-to-skipped-entity information items, which don't appear in
>*the* infoset.)

That last exception is pretty frightening.  Is it justifiable?

It sounds like the Infoset is going to get ripped in half by any solution
to this namespace problem that doesn't outright reject relative URI
references, unless the Infoset starts providing a much finer level of
detail about the origins of various pieces in a document, all specified as
URIs.
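
A small sketch of why the origin matters, using Python's urljoin.  The URIs
and the relative namespace reference below are made up for illustration;
the point is only that the same relative reference absolutizes differently
depending on which entity it was declared in.

```python
from urllib.parse import urljoin

# Hypothetical relative reference used as a namespace name.
relative_ns = "schemas/invoice"

# Hypothetical base URIs: the document entity and an external entity it pulls in.
document_base = "http://example.com/docs/main.xml"
entity_base = "http://example.org/shared/part.ent"

# Resolve the same relative reference against each base.
from_document = urljoin(document_base, relative_ns)
from_entity = urljoin(entity_base, relative_ns)

print(from_document)  # http://example.com/docs/schemas/invoice
print(from_entity)    # http://example.org/shared/schemas/invoice
print(from_document == from_entity)  # False
```

Without the base URI of each piece, a layer above the Infoset has no way to
tell these two apart.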

Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Cookies / Sharing Bandwidth
http://www.simonstl.com

Received on Friday, 2 June 2000 12:33:03 UTC