- From: DuCharme, Robert <Robert.DuCharme@moodys.com>
- Date: Fri, 22 Sep 2000 10:13:47 -0400
- To: "'xsl-list@mulberrytech.com'" <xsl-list@mulberrytech.com>
- Cc: xsl-editors@w3.org
>Given the nature of our present discussion, I suggest the claim in the first
>sentence of the Recommendation that XML "is completely described in this
>document" can be seen to be a dubious claim, at best.

It was true enough in February of 1998, when it was published. The W3C has acknowledged the lack of coordination between the various groups who have since developed add-on technologies. (See below for more on that.)

>If, as you suggest, the meaning of "root" depends on the tree representation
>of that document why is that not explained in the XML 1.0 Recommendation?
>...
>Further, if you are correct that the concept of "root" depends on the tree
>representation, why do the editors of the Recommendation use the term "root"
>in two distinct usages in Section 2 and Section 2.1 of the Recommendation
>without adequate explanation?

From 2: "Each XML document has both a logical and a physical structure. Physically, the document is composed of units called entities. An entity may refer to other entities to cause their inclusion in the document. A document begins in a 'root' or 'document entity'."

That first sentence is the explanation about the two structures. Yes, it's terse; I can only respond to complaints that it's too terse with a crass commercial plug (see http://www.snee.com/bob/xmlann/). The second and third sentences go on about the physical structure. I mentioned in my last post that the physical structure doesn't care about the logical structure except to specify how an entity qualifies as being well-formed; that's what 2.1 is about. Item 2 in the second list ("There is exactly one element...") is talking about logical structure, because it's talking about elements.

>The Recommendation goes on to describe the document entity as the root of the
>"entity tree". Is there a physical "entity tree"? I think not. It is a
>logical relationship.
>The "document entity" which Section 2 claims is a
>"physical structure" is also, so it seems, the root of a logical "entity
>tree". Or would you wish to claim that a "physical" entity tree exists?

An XML parser must read in a document entity and resolve external entity references by locating and reading in the referenced external entities. If document entity A refers to external entities B and C, and B refers to D and E, and C refers to F and G, the parser must open each of those files and read them off the disk into its memory. That's what they mean by physical. If you sketch out the relationship between these entities, that's the physical tree they're talking about.

>But the Recommendation later claims the document entity "has no name". In my
>file system the document entity, as a "physical structure" does have a name -
>"sample.xml", for example. So in what sense does Section 4.8 refer to the
>document entity having no name?

If I declare an entity like this,

<!ENTITY foo SYSTEM "bar.xml">

the entity's name is foo. The filename bar.xml is not the entity's name, but its system identifier. A document entity may have a filename (the more generic term "system identifier" is used because it all still works on operating systems that don't use the concept of a "file") but as an entity, it has no entity name. (And remember, a document entity doesn't even have to have a filename or other system identifier--it might be handed to the XML parser in memory from a database manager or from a perl script via a pipe.)

>Section 4.8 is referring to the document
>entity in another usage - when it is being **logically** combined with any
>other entities

I don't see that in 4.8. It says that it's "a starting-point for an XML processor"--it's the A in the A, B, C, D, E, F, G that I described above, which are physically read in and combined in memory.

>Can you see how the Recommendation blurs and confuses the supposed
>distinction between "logical structure" and "physical structure"?

Nope!
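To make the A through G example concrete, here is a minimal Python sketch of how a physical entity tree could be traced. It is an illustration only, not how any real XML parser is implemented: the DECLARATIONS and STORAGE tables, the lowercase entity names, and the entity_tree and show helpers are all hypothetical, and an in-memory dictionary stands in for the files on disk. Note that the document entity ends up with no entity name, and may have no system identifier either.

```python
import re

# Hypothetical declarations, as they might appear in the document entity's
# DTD: entity name -> system identifier.
DECLARATIONS = {
    "b": "B.xml", "c": "C.xml", "d": "D.xml",
    "e": "E.xml", "f": "F.xml", "g": "G.xml",
}

# Stand-in for storage (files on disk, a database, a pipe):
# system identifier -> entity text.
STORAGE = {
    "A.xml": "<doc>&b;&c;</doc>",
    "B.xml": "<b>&d;&e;</b>",
    "C.xml": "<c>&f;&g;</c>",
    "D.xml": "<d/>", "E.xml": "<e/>", "F.xml": "<f/>", "G.xml": "<g/>",
}

# Matches general entity references such as &b; (not character
# references such as &#38;).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def entity_tree(text, name=None, system_id=None):
    """Return a nested dict for this entity and the external entities it references."""
    children = []
    for ref in ENTITY_REF.findall(text):
        if ref not in DECLARATIONS:
            continue  # e.g. predefined entities such as &amp;
        sysid = DECLARATIONS[ref]
        children.append(entity_tree(STORAGE[sysid], ref, sysid))
    return {"name": name, "system_id": system_id, "children": children}


def show(node, depth=0):
    """Render the tree as indented lines, one entity per line."""
    lines = ["  " * depth + f"name={node['name']!r} system_id={node['system_id']!r}"]
    for child in node["children"]:
        lines.extend(show(child, depth + 1))
    return lines


# The document entity has a system identifier here but no entity name.
tree = entity_tree(STORAGE["A.xml"], system_id="A.xml")
print("\n".join(show(tree)))

# A document entity handed over in memory has neither.
in_memory = entity_tree("<doc/>")
```

The root of the printed tree is the document entity, listed with name=None, while B through G each carry both an entity name and a system identifier.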
>I appreciate that the DOM 1 Recommendation came later. It refers back to XML
>1.0 in the definition of "root node"

Where does it do that? I couldn't find it in http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html.

>I admit I could have missed this but does any W3C document adequately explain
>those differences or the practical consequences of them? Should some W3C
>document not actually do so?

A single document that explains the different approaches between five different specs would have to be a five-by-five matrix that has a new column and row added every time a new related spec was written, and that would be impractical. What would have been better would be if the XML spec had laid out, in addition to an explanation of a document's logical and physical structure, the details of its "information" structure--exactly what information a processing program could expect of it.

This failure (and the XML Working Group can't be blamed too much for it--they set out to design a stripped-down version of SGML that could be shipped over the Web more easily than full SGML, and had no idea of the uses that people would put XML to) meant that the Working Groups for additional XML technologies had to make up and assume certain things, and this led to subtle and not-so-subtle conflicts between those additional specs. The Infoset spec is an attempt to make up for this. These conflicts are also a key reason that so many important specs have been held up in the Candidate Recommendation stage lately.

Bob DuCharme www.snee.com/bob <bob@ snee.com>
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii
Received on Friday, 22 September 2000 10:34:37 UTC