- From: <noah_mendelsohn@us.ibm.com>
- Date: Thu, 5 Dec 2002 18:16:24 -0500
- To: "Martin Gudgin" <mgudgin@microsoft.com>
- Cc: henrikn@microsoft.com, xml-dist-app@w3.org
Right, that's the question. The quote from section 2.8 is:

------

"If the XML document has a document type declaration, then the information set contains a single document type declaration information item. Note that entities and notations are provided as properties of the document information item, not the document type declaration information item.

A document type declaration information item has the following properties:

* [system identifier] The system identifier of the external subset, as it appears in the DOCTYPE declaration, without any additional URI escaping applied by the processor. If there is no external subset this property has no value.
* [public identifier] The public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML]. If there is no external subset or if it has no public identifier, this property has no value.
* [children] An ordered list of processing instruction information items representing processing instructions appearing in the DTD, in the original document order. Items from the internal DTD subset appear before those in the external subset.
* [parent] The document information item."

---

So, in the case of an internal subset that defines entities, and nothing else, we would get a mandatory document type declaration info item, but with no values for any properties but [parent]? Seems a bit strange to me, but I agree it could be read that way. If so, I suppose there is no issue.

I think what's making me nervous is that I can't find an info item anywhere for parsed entities. That's what leads me to feel that you can't quite tell from the Infoset whether they are there or not, and therefore whether a serialization might not include them after all.

I read you to say: right, you can't tell much about the entities, element declarations, etc., but the absence of any document type declaration info item does let you infer that there were none.
As I say, this seems strange, since the whole drift of the Infoset design seems to be to not tell you whether they were there. On the other hand, if everyone reads it that way, I suppose I can go along. On balance, I would prefer the clarification, if only in a note. If we've had to do this level of reasoning to prove that <!DOCTYPE > can't go in the serialization, I fear that others may not see it that way either.

Related question that I raised before: I think we're all agreed that we intend receipt of <!DOCTYPE to result in an env:SENDER error. Where do we say whether that error is a MAY/MUST/SHOULD? I think it should be a MUST fault, as the message received is incoherent and known to be buggy. Where do we indicate that the error MUST be generated?

Thanks.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

"Martin Gudgin" <mgudgin@microsoft.com>
12/05/02 05:07 PM

To: <noah_mendelsohn@us.ibm.com>
cc: "Henrik Frystyk Nielsen" <henrikn@microsoft.com>, <xml-dist-app@w3.org>
Subject: RE: Closing XML Protocol Last Call issue 395

So the question really is this: Does

<!DOCTYPE soap:Envelope [
<!-- entity decls here -->
]>

(i.e. JUST an internal subset) result in a Document Type Declaration Information Item appearing at the infoset level?

My reading of Section 2.8 of the infoset spec says Yes.

Gudge

> -----Original Message-----
> From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
> Sent: 05 December 2002 11:44
> To: Martin Gudgin
> Cc: Henrik Frystyk Nielsen; xml-dist-app@w3.org
> Subject: RE: Closing XML Protocol Last Call issue 395
>
> Gudge writes:
>
> > I'm not sure why this issue revolves around the
> > internal subset. We explicitly prohibit the Document
> > Type Declaration Information Item from appearing.
>
> So far, so good. We agree.
>
> >> If there is no DTD then there is no internal
> >> or external subset.
>
> Let's be a little careful. Our infosets are synthetic. They come
> before the lexical form is even considered. Clearly we disallow the
> info item. What this means for any possible serialization in any
> possible binding is unclear.
>
> > Lexically one cannot have <!DOCTYPE ... in a SOAP message.
>
> Now we're talking about something binding specific. Assume we're
> talking about >the< SOAP HTTP binding.
>
> >> The only parts of the DTD that are reflected
> >> in the infoset are unparsed entities, notations
> >> and PIs appearing in the DTD.
>
> Right, so if I had a lexical form with an internal subset declaring a
> parsed entity, then that would not show up in the Infoset when I
> parsed the document. I couldn't tell that there had been an internal
> or external subset.
>
> Now, go the other way. We say in the HTTP binding that we want
> (indirectly through RFC 3023) the XML 1.x serialization of the
> infoset. But if what I say in the para above is right (and I'm not
> sure about it), that's ambiguous. There are lexical forms with an
> internal subset that correspond to the Infoset that has no DTD
> information item. That is the source of my concern. If there is even
> a hint of this ambiguity, I think our binding (or the RFC if
> appropriate) needs to say explicitly: "<!DOCTYPE ... > MUST NOT
> appear."
>
> I feel like I may be confused, but in the meantime, I remain
> concerned that there is an ambiguity. If someone sent an instance
> with an internal subset, but that parsed into an Infoset with no
> Doctype Info Item, I'm not sure where I'd point in the spec to say
> "you broke the rules." What am I missing? Thanks.
>
> ------------------------------------------------------------------
> Noah Mendelsohn                              Voice: 1-617-693-4036
> IBM Corporation                                Fax: 1-617-693-8676
> One Rogers Street
> Cambridge, MA 02142
> ------------------------------------------------------------------
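
The point debated in this thread, whether a DOCTYPE that carries only an internal subset is still observable by the receiver, can be checked against a real parser. The following is a minimal Python sketch, assuming expat's event callbacks as a rough stand-in for the Infoset (they are not the Infoset itself). The function name `find_doctype` and the sample `Envelope` documents are hypothetical, introduced only for illustration.

```python
import xml.parsers.expat


def find_doctype(xml_text):
    """Return (name, system_id, public_id, has_internal_subset) for the
    document type declaration in xml_text, or None if there is none."""
    seen = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat fires this callback once when it meets <!DOCTYPE ...>,
        # whether or not an external subset is named.
        seen.append((name, system_id, public_id, bool(has_internal_subset)))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return seen[0] if seen else None


# Hypothetical sample: an internal subset that declares one parsed entity
# and nothing else, with no external subset.
WITH_INTERNAL_SUBSET = (
    '<!DOCTYPE Envelope [ <!ENTITY greeting "hello"> ]>'
    "<Envelope/>"
)

# Hypothetical sample: the same document with no DOCTYPE at all.
WITHOUT_DOCTYPE = "<Envelope/>"

print(find_doctype(WITH_INTERNAL_SUBSET))
print(find_doctype(WITHOUT_DOCTYPE))
```

In this sketch the first document reports a declaration with no system or public identifier, which mirrors the reading that the item exists but has no values for those properties. A receiver in the spirit of the proposal could treat any non-None result as grounds for the env:SENDER error; whether that error is a MUST remains the open question raised above.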
Received on Thursday, 5 December 2002 18:18:35 UTC