Does/should SOAP define an XML subset?

Restating what I said on the telcon:

For a number of reasons that Noah has outlined in our response to the TAG,
SOAP does not use all the features of XML: it excludes DTDs (and thus things
like entity declarations) and processing instructions from "legal" SOAP
messages.  As a consequence, people using "generic" XML tools (editors,
APIs, parsers, schema validators, etc.) can build an XML instance that is
well-formed and valid with respect to the SOAP schema and a schema for the
SOAP body, but that contains XML constructs that are not legal in SOAP.

Granted, the SOAP 1.1 definers and the XMLP WG did not *intend* to define a
subset of XML, and (as best as anyone can remember) didn't think of this as
any more significant than, for example, not defining any attributes in an
XML application. Also granted, this is not a typical use case, since SOAP
messages tend to be generated by SOAP-specific tools.  Still, it is a bit of
a wart on the overall coherence of the W3C specs.

So, one way forward is to formally define a subset of XML 1.x (or perhaps a
profile of the various XML-related specs that SOAP normatively depends on)
that "blesses" SOAP's de facto practice.  That would encourage tool vendors
to offer features such as a "validate against SOAP profile" option that
would detect the use of DTDs and PIs in XML instances and issue a warning.
This needs to come from the XML Core WG, who are charged with maintaining
XML, and not the XMLP WG.  So, the issue under discussion is whether to ask
them to do this, and how to phrase the arguments.
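
To make the "validate against SOAP profile" idea concrete, here is a rough
sketch of what such a check might look like, using Python's standard expat
binding.  The function name soap_profile_violations is purely illustrative,
and the check covers only the two constructs mentioned above (DOCTYPE
declarations and PIs); a real profile would have to be defined by the
XML Core WG, not by this snippet.

```python
import xml.parsers.expat


def soap_profile_violations(xml_text):
    """Return (line, description) pairs for DTD and PI constructs.

    Hypothetical helper: it only flags DOCTYPE declarations and
    processing instructions, and raises ExpatError if the input is
    not well-formed XML in the first place.
    """
    violations = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        violations.append(
            (parser.CurrentLineNumber, "DOCTYPE declaration for %r" % name)
        )

    def on_pi(target, data):
        # The XML declaration (<?xml ...?>) is not reported here;
        # expat delivers it through a separate handler.
        violations.append(
            (parser.CurrentLineNumber, "processing instruction %r" % target)
        )

    parser.StartDoctypeDeclHandler = on_doctype
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(xml_text, True)
    return violations


# A well-formed document that a generic tool would accept as XML,
# but that uses both constructs SOAP excludes.
suspect = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE Envelope [<!ENTITY x "y">]>\n'
    '<?app hint?>\n'
    '<Envelope><Body/></Envelope>'
)

for line, description in soap_profile_violations(suspect):
    print("warning: line %d: %s" % (line, description))
```

A tool would presumably surface these as warnings rather than errors, since
the instance is still perfectly good XML.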

While this seems like a demand for the rest of the world to bend over
backwards to accommodate SOAP, in fact the issue is much broader.  In a
nutshell, the same reasons for forbidding DTDs and PIs in SOAP messages
apply in other domains -- DTDs make it difficult to embed XML documents in
other XML documents, since the DTD can only appear at the beginning of a
document, and PIs have been a notorious source of non-interoperability since
SGML days.  Also, approximately half of the XML 1.0 spec is devoted to the
definition of DTD-related syntax, so "XML" implementations in
footprint-limited environments often explicitly or implicitly deprecate DTD
processing.  Thus, there are reasons beyond SOAP to consider defining the
"SOAP subset" as an official profile, or conformance level, or however this
can be fit into XML with the least disruption.
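
The embedding problem is easy to demonstrate.  The sketch below (again
Python with the standard expat binding; the helper name and the sample
documents are made up for illustration) wraps a small document that carries
its own DOCTYPE inside an envelope element, and then tries the naive fix of
dropping the DOCTYPE.

```python
import xml.parsers.expat


def is_well_formed(xml_text):
    """Return True if expat parses xml_text without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


# A standalone document whose content relies on its internal DTD subset.
inner = '<!DOCTYPE note [<!ENTITY who "world">]><note>hello &who;</note>'

# Embedding it verbatim puts the DOCTYPE in the middle of element content.
wrapped = '<Envelope><Body>' + inner + '</Body></Envelope>'

# Dropping the DOCTYPE leaves the entity reference undeclared.
stripped = '<Envelope><Body><note>hello &who;</note></Body></Envelope>'

for label, text in [("inner", inner), ("wrapped", wrapped), ("stripped", stripped)]:
    print(label, is_well_formed(text))
```

Only the standalone document parses; the wrapped form fails because a
DOCTYPE is not allowed inside element content, and the stripped form fails
because the entity it uses is no longer declared.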

On the other hand, there are good counterarguments ... I believe others on
the call today were tasked with presenting them.

Received on Wednesday, 7 May 2003 15:03:15 UTC