- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 12 Feb 2008 14:16:46 -0500
- To: xmlp-comments@w3.org
- Cc: chrisfer@us.ibm.com, public-xml-core-wg@w3.org
The XML Core working group has published a Proposed Edited Recommendation (PER) Extensible Markup Language (XML) 1.0 (Fifth Edition) [1]. The major change in that edition is the proposal to expand the set of legal XML element and attribute names. Without commenting either for myself or for IBM on the merits of this proposal, I note that there appears to be an interdependency with the SOAP 1.2 Recommendation. Specifically, the way that SOAP 1.2 guarantees that all nodes agree on what's legal and what's not in a SOAP envelope is by reference to XML 1.0 serialization rules. From SOAP 1.2 Part 1 Chapter 5 "Message Construct" [2]:

"A SOAP message is specified as an XML infoset whose comment, element, attribute, namespace and character information items are able to be serialized as XML 1.0. Note, requiring that the specified information items in SOAP message infosets be serializable as XML 1.0 does NOT require that they be serialized using XML 1.0. [...] The Infoset Recommendation [XML InfoSet] allows for content not directly serializable using XML; for example, the character #x0 is not prohibited in the Infoset, but is disallowed in XML. The XML Infoset of a SOAP Message MUST correspond to an XML 1.0 serialization [XML 1.0]."

In other words, all SOAP nodes must follow the same rules for what's a legal envelope, and those rules depend heavily on the well-formedness rules for XML 1.0. Hop by hop, some bindings will actually use the obvious XML 1.0 serialization while others may use compressed, encrypted, etc. alternatives, but either way there must be nothing in the envelope infoset that could not be sent using XML 1.0.

But which edition of XML 1.0? The last reference in that paragraph is a hyperlink to the bibliography. I think most readers would take that as applying to the first sentence as well, but it's a bit unclear. Anyway, it gets a bit worse.
When you follow the hyperlink to the bibliography you get [3]:

"[XML 1.0] Extensible Markup Language (XML) 1.0 (Fourth Edition), Jean Paoli, Eve Maler, Tim Bray, et. al., Editors. World Wide Web Consortium, 16 August 2006. This version is http://www.w3.org/TR/2006/REC-xml-20060816. The latest version is available at http://www.w3.org/TR/REC-xml."

So, SOAP 1.2 explicitly references XML 1.0 4th edition, but then it also tells you to go looking for a new one too! If you believe it's 4th edition only, then the new XML 1.0 PER has no impact, except insofar as you might sometime decide to update the Recommendation to explicitly point to 5th, should that be your wish (that will, of course, raise some interoperability concerns, since for the first time SOAP nodes won't all agree on what's legal). Conversely, if one believes the bit about the "latest version", then one can read the SOAP Recommendation as requiring support for the new characters as soon as http://www.w3.org/TR/REC-xml is updated to point to 5th edition.

For those reasons, I request that the XML Protocols WG:

1) Figure out what SOAP behavior is desired should it come to pass that XML 1.0 5th edition comes out as planned. In particular, is it the case that conforming nodes MAY, MUST, SHOULD, SHOULD NOT, or MUST NOT accept the new characters in tag names in SOAP envelopes? I believe it's clear that as long as 4th edition is current, the answer is MUST NOT. Does that change if XML 1.0 5th edition reaches Recommendation?

2) Coordinate with the Core WG to ensure that publications are properly synchronized (or instead, if appropriate, provide feedback that XML 1.0 5th edition is a problem for SOAP and should not be published, if that is what you believe).

3) Consider a bit the impact on bindings, faults, and errors, should you decide to allow for the new content. Presumably, some nodes will be trying to send new content, perhaps to old nodes that aren't expecting it.
The outbound end of the binding implementation may or may not notice. Is that a binding-level error or something else? Is there a standard SOAP fault to be defined to indicate that the wrong edition of XML has been used? Maybe the outbound binding implementation is happy with the new chars, but the receiving node is old. If an XML 1.0 serialization is being used, then by far the most likely failure mode is just that the receiving binding (if it's checking well-formedness and not trusting the sender) will reject the message as not well formed. I'm not sure if there are more subtle issues with bindings that use non-XML 1.0 forms on the wire.

4) In any case, I suggest you clarify the ambiguity as to whether the text at [2] and [3] is to be read as referring to the latest Recommendation-level edition of XML 1.0, or else specifically to 4th edition.

Thank you.

Noah

P.S. In case some of those on the cc: list are not aware, I have not been a member of the Protocols WG for some time. I am just commenting as an interested member of the W3C community.

[1] http://www.w3.org/TR/2008/PER-xml-20080205/
[2] http://www.w3.org/TR/soap12-part1/#soapenv
[3] http://www.w3.org/TR/soap12-part1/#XML

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 12 February 2008 19:16:31 UTC