- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 2 Apr 2003 09:58:59 -0500
- To: robin.berjon@expway.fr
- Cc: tbray@textuality.com, www-tag@w3.org
Robin Berjon writes:

> Imho it only looks, walks, and quacks like a subset if
> sending some of the excluded tokens generates an error,
> ie if general-purpose XML has a chance of blowing up
> when it reaches the other side.

Just to set the record straight on SOAP: no correct implementation of the SOAP HTTP binding will ever send a PI, DTD, etc. This is because the original message is modeled as an infoset, and such SOAP infosets by definition do not contain PIs, DTDs, etc. (just as they by definition don't contain zoo:animal attributes on the envelope element).

At the sending end: it's assumed that your software allows you to faithfully send such an Infoset. I suspect this is one source of Tim Bray's concern: one can certainly imagine middleware software that would take the liberty to stick in DTDs or PIs that were in some sense not specifically suggested by a sending application. I don't see the XML recommendation as weighing on such software one way or the other. Such software would indeed be inappropriate at a SOAP sender: you need software that lets you prepare an infoset and serialize it as XML 1.0. If that means we've defined a subset, I suppose we have, but I'm not at this point convinced.

At the receiving end: unlike XMPP, SOAP considers PIs and DTDs as errors, because they are prima facie evidence that you are talking to a buggy sender. Again, SOAP is silent on how you build the software to detect such errors. You can use a general-purpose parser and put above it a layer that checks for PIs and DTDs (and zoo:animal attributes), or you can build a special-purpose SOAP scanner. In the former case, you must use a parser that accurately reflects (at least) the received Infoset and also the presence of any DTD information that is not reflected in the Infoset. These seem to be no more rigorous than the requirements for a parser used in an XML editor... indeed, some of those must accurately reflect single and double quotes, and other serialization details.
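As a rough illustration of the first receiver option (a general-purpose parser with a checking layer above it), here is a minimal Python sketch. It is not taken from the SOAP specification, which is silent on how such detection is built. It assumes the standard-library expat bindings as the general-purpose parser; the names `SoapInfosetViolation` and `check_soap_message` are hypothetical, and the element names in the usage lines are stand-ins rather than real SOAP envelopes. The check for unexpected attributes such as zoo:animal is left out.

```python
import xml.parsers.expat


class SoapInfosetViolation(Exception):
    """Hypothetical error type: the message contains a construct
    that a SOAP infoset by definition does not contain."""


def check_soap_message(data: bytes) -> None:
    """Parse `data` with a general-purpose XML parser and raise
    SoapInfosetViolation if a DTD or processing instruction is seen.

    Input that is not well-formed still raises the parser's own
    xml.parsers.expat.ExpatError.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called when the parser reaches a <!DOCTYPE ...> declaration.
        raise SoapInfosetViolation("document type declaration found")

    def on_pi(target, pi_data):
        # Called for <?target ...?>; the XML declaration itself is
        # reported through a different handler and does not land here.
        raise SoapInfosetViolation(
            f"processing instruction found: {target}"
        )

    parser.StartDoctypeDeclHandler = on_doctype
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(data, True)  # True marks this as the final chunk
```

A call such as `check_soap_message(b"<Envelope><Body/></Envelope>")` returns quietly, while `check_soap_message(b"<!DOCTYPE Envelope><Envelope/>")` raises `SoapInfosetViolation`. The sketch only works because the underlying parser reports DTDs and PIs to the layer above it; a parser that silently dropped them could not support this check.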
Again, I suspect Tim's preference would be that the presence of DTDs, PIs, etc. be viewed as details that need not in all cases be reflected by a parser... as with an editor, SOAP is an application of XML for which such a parser would be inappropriate.

> On the other hand if it is defined so that the
> receiving end MUST parse the XML correctly, but MUST
> ignore it (ie MUST NOT pass it on to the application so
> that no semantic value whatsoever can ever be attached
> to those tokens) then we have a usage convention. It
> reads general-purpose XML, it just doesn't extract the
> same information out of it. Given that we have no data
> model, a parser that exposes less data than another is
> not a subset parser.

Again, SOAP is different in this respect, for the reasons described above. All of that said, I see nothing that would break if we switched to the XMPP receiver rules, and quietly flushed buggy input, thereby defining it as meaningless but not erroneous. We could also go with a SHOULD fault or MAY fault, which would allow discretion to detect it as an error.

I respect and understand the reasons that Tim believes we have, however unintentionally, defined a subset of XML. To some degree it's a matter of terminology. Not speaking officially for the WG, I would reiterate that I don't think we thought we were doing a subset. I don't think we ever asked: should others use this same subset? As I suspect is the case with XMPP, we just used XML in a way that seemed appropriate to our needs.

BTW: I think I've now made clear my understanding of what SOAP has done and how it compares to XMPP. In the interest of avoiding list overload, I tentatively plan to remain quiet on this thread for the foreseeable future, unless new information shows up or specific questions are raised for which I might have the answer. Thank you!

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Wednesday, 2 April 2003 10:06:36 UTC