- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 3 Mar 2004 09:09:56 -0500
- To: xmlp-comments@w3.org
- Cc: Richard Tobin <richard@cogsci.ed.ac.uk>
SOAP 1.2 specifically depends on the October 24, 2001 version of Infoset [1]. At the time, the latest published version of XML was XML 1.0 Second Edition [2]. As an aside, that version of Infoset has a seemingly misleading reference which says:

"XML Extensible Markup Language (XML) 1.0 (Second Edition), W3C, eds. Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, Eve Maler. 6 October 2000. Available at http://www.w3.org/TR/REC-xml."

The supplied link now resolves to the Third Edition.

Anyway, I had always assumed that the version of Infoset to which SOAP refers would limit character children, in synthetic Infosets (typical of SOAP) as well as others, to the then-legal XML characters [3]. Richard Tobin was kind enough to point out to me that Infoset in fact has no such limitation on the contents of character children. Since we don't explicitly enforce such a restriction in SOAP either, for example in the body child element [4], we have what I take to be the bizarre situation that SOAP envelope infosets can, per our recommendation, contain non-XML characters. An envelope could, for example, contain nulls or the XML-forbidden control characters below x20. I don't believe this was intentional.

Also: I think this means that our HTTP binding contradicts the requirements of the binding framework [5], which states that:

"the minimum responsibility of a binding in transmitting a message is to specify the means by which the SOAP message infoset is transferred to and reconstituted by the binding at the receiving SOAP node and to specify the manner in which the transmission of the envelope is effected using the facilities of the underlying protocol."

So, our specification is self-contradictory. I think this is good news of a sort, as it means we can consider fixing this with an erratum. I believe that an example of a specific fix would be to state in the description of the body child element [4]:

<recommendationText>
MAY have any number of character information item children. Child character information items whose character code is amongst the white space characters as defined by XML 1.0 [XML 1.0] are considered significant.
</recommendationText>

<proposedRevision>
MAY have any number of character information item children. >Each such child must have as its [character code] a value which matches the {char} production of XML 1.0 [XML 1.0].< Child character information items whose character code is amongst the white space characters as defined by XML 1.0 [XML 1.0] are considered significant.
</proposedRevision>

Obviously we should look through all parts of the rec to see if there are other similar slip-ups. FWIW: most of our other character children are actually schema typed, and therefore don't have this problem. The mustUnderstand attribute, for example, is a boolean. The schema for SOAP constrains its characters to {true, false, 0, 1}.

Noah

[1] http://www.w3.org/TR/2001/REC-xml-infoset-20011024/
[2] http://www.w3.org/TR/2000/REC-xml-20001006
[3] http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char
[4] http://www.w3.org/TR/soap12-part1/#soapbodyel
[5] http://www.w3.org/TR/soap12-part1/#bindfw

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
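
As an illustration of the constraint in the proposed revision above, here is a minimal Python sketch of checking character codes against the Char production of XML 1.0 [3]. The function names is_xml10_char and first_illegal_char are hypothetical, chosen for this example only; nothing like them appears in the SOAP or Infoset recommendations, and a real SOAP node might well enforce the rule differently.

```python
from typing import Optional


def is_xml10_char(code_point: int) -> bool:
    """Return True if code_point matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def first_illegal_char(text: str) -> Optional[int]:
    """Return the index of the first character in text whose code point
    falls outside the Char production, or None if every character is legal.
    (Hypothetical helper for illustration only.)"""
    for index, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return index
    return None
```

Under this sketch, a null or any control character below x20 other than tab, line feed, and carriage return would be rejected, while ordinary text and white space would pass.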
Received on Wednesday, 3 March 2004 09:32:10 UTC