- From: Dan Connolly <connolly@w3.org>
- Date: Thu, 20 May 1999 12:38:01 -0500
- To: www-xml-infoset-comments@w3.org
The spec says:

    "There is one processing instruction information item for every processing instruction in the document."
    -- http://www.w3.org/TR/1999/WD-xml-infoset-19990517#infoitem.pi

but I don't see any specification of how to count how many processing instructions there are in an XML document. For example, how many processing instructions are there in the following document?

    <!DOCTYPE foo [
    <!ENTITY piShorthand "<?any pi?>">
    ]>
    <foo>&piShorthand;&piShorthand;&piShorthand;</foo>

Strictly following the XML 1.0 grammar, you won't encounter the PI production *at all*.

In fact, I can't find anything in the XML 1.0 spec that says the following has more than one element:

    <!DOCTYPE foo [
    <!ENTITY elShorthand "<xxx/>">
    ]>
    <foo>&elShorthand;&elShorthand;&elShorthand;</foo>

The section "4.1 Character and Entity References" doesn't say "when you see an entity reference, dereference it and parse the contents inline." It doesn't even have a reference to section 4.4:

    http://www.w3.org/TR/1998/REC-xml-19980210#entproc

There's some stuff in

    http://www.w3.org/TR/1998/REC-xml-19980210#included

    An entity is included when its replacement text is retrieved and processed, in place of the reference itself, as though it were part of the document at the location the reference was recognized.

whatever that means.

More on counting... under "2.1. The Document Information Item":

    "There is always one document information item in the information set, ..."

That makes it sound like there's only one information set in the world, like there's only one set of integers in the world. I suggest:

    "There is always one document information item in the information set of an XML document, ..."

Under "2.1.1. Document: Required Properties":

    "2. An unordered set of notation information items, one for each notation declaration that the XML processor has read."

That says that the information set is not just a function of an XML document, but also a function of the behaviour of a processor used to read it.
Surely that's not what you meant, right? I suggest:

    2. a set of notation information items, one for each notation declaration in the XML document.

modulo the question about counting items in the first place. I think you're going to have to talk about the parse tree resulting from using the productions in the XML 1.0 spec (which means that it matters that the grammar is ambiguous). And you'll have to figure out how entities really interact with those productions.

Another example:

    "Validating processors are required by XML 1.0 to provide this information; non-validating processors may always set this flag to false."

"set this flag"? The information set just is. It doesn't have state that can be flipped on and off. It's a function of an XML document alone, according to

    1. Introduction

    This document specifies an abstract data set called the XML information set (Infoset), a description of the information available in a well-formed XML document [XML].

Also: why mince words so much? Instead of:

    The document information item must have the following properties available is some form:

why not just:

    The document information item consists of:

Under "2.2.2. Elements: Optional Properties":

    6. A reference to the entity information item for the entity in which this element begins and ends.

Why "A reference to"? Why not just:

    6. an entity information item for the entity in which this element begins and ends.

If you're worried about identity, don't. You wouldn't say:

    a reference to the ISO 10646 character code of the character...

would you? Entity information items are identical or not just like integers are identical or not.

(By the way: I think the term is "codepoint" not "character code"; cf. http://www.w3.org/TR/1999/WD-charmod-19990225#CharBytes)

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
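
As a rough illustration of the counting question raised above, here is a minimal sketch that feeds the two entity examples to one concrete processor and counts what it reports. It assumes Python's standard xml.parsers.expat module; count_items, PI_DOC and EL_DOC are hypothetical names introduced only for this sketch. It shows what one processor reports after including entity replacement text, which is exactly the behaviour the grammar alone does not pin down.

```python
import xml.parsers.expat

# The two documents from the message (entity name case made consistent).
PI_DOC = """<!DOCTYPE foo [
<!ENTITY piShorthand "<?any pi?>">
]>
<foo>&piShorthand;&piShorthand;&piShorthand;</foo>"""

EL_DOC = """<!DOCTYPE foo [
<!ENTITY elShorthand "<xxx/>">
]>
<foo>&elShorthand;&elShorthand;&elShorthand;</foo>"""


def count_items(document):
    """Count the PIs and element starts that expat reports for a document."""
    counts = {"pi": 0, "element": 0}

    def on_pi(target, data):
        counts["pi"] += 1

    def on_start(name, attrs):
        counts["element"] += 1

    # No default handler is installed, so expat expands internal
    # entity references and parses their replacement text in place.
    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return counts


if __name__ == "__main__":
    print(count_items(PI_DOC))
    print(count_items(EL_DOC))
```

With expat's default expansion of internal entities, this should report three processing instructions for the first document and four elements (foo plus three xxx) for the second. A processor that did not include the replacement text would presumably report zero and one, which is why the information set needs to say how items are counted.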
Received on Thursday, 20 May 1999 13:37:58 UTC