Re: text/xml for SOAP is incorrect

So, let's imagine an intermediary that modifies XHTML in-flight (not
pleasant, I know, but bear with me). 

If SOAP and XHTML share application/xml, the intermediary can't use
the content-type to find XHTML messages for processing, which it can
scan for very efficiently. Instead, to behave properly, it has to look
for application/xml, and then parse the XML (perhaps with SAX, so
that it can stream) to figure out what the root namespace is.

The cost of doing this is high, considering that someone writing
XHTML modification code may be only vaguely aware of, or care little
about, the other XML applications that may cross its doorstep. More
to the point, an application that does operate correctly (by deriving
the namespace) needs to buffer and parse *every* message with
content-type: application/xml until it determines the namespace in use.
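
To make the difference concrete, here is a rough sketch in Python of
the two code paths. It is illustrative only, not how any particular
intermediary works: the function names are made up, the message body
is assumed to arrive as an iterable of byte chunks, and
application/xhtml+xml stands in for whatever dedicated type XHTML
would use.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

XHTML_NS = "http://www.w3.org/1999/xhtml"
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption for illustration: a dedicated media type for XHTML.
DEDICATED_XHTML_TYPE = "application/xhtml+xml"


class _RootFound(Exception):
    """Raised to abandon parsing as soon as the root element is seen."""


class _RootHandler(ContentHandler):
    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) tuple; the first
        # element reported is the document root.
        raise _RootFound(name[0])


def root_namespace(chunks):
    """Feed body chunks to a SAX parser until the root namespace is known.

    Returns the namespace URI, or None if the root has no namespace or
    the body is not well-formed XML.
    """
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(_RootHandler())
    try:
        for chunk in chunks:
            parser.feed(chunk)
        parser.close()
    except _RootFound as found:
        return found.args[0]
    except xml.sax.SAXParseException:
        return None
    return None


def needs_xhtml_processing(content_type, chunks):
    """Decide whether a message is XHTML (hypothetical helper)."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == DEDICATED_XHTML_TYPE:
        # Cheap path: a string comparison on a header.
        return True
    if media_type == "application/xml":
        # Expensive path: start parsing the body of every such message.
        return root_namespace(chunks) == XHTML_NS
    return False
```

With a dedicated type the decision is made from the header alone; with
a shared application/xml every such message has to go through
root_namespace() before the intermediary knows whether it cares.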

Other configurations (a SOAP intermediary interposed on an HTTP
intermediary, for instance) have similar behaviours; all XHTML
messages will be buffered, to make sure that they're not SOAP. The
more XML formats that use application/xml, the more of a bottleneck
this has the potential to become.

This may seem trivial, but intermediaries are some of the most
performance-sensitive devices out there. Imposing a high processing
cost on a large chunk of traffic in order to identify a small portion
of it isn't appealing to intermediary vendors. For better or worse,
they have a history of creative work-arounds to specified behaviours
that have large performance penalties.

Most of the larger companies represented in the WG have HTTP
intermediary products of some kind, and some have direct interest in
intermediary processing models; I'd encourage discussing this issue
with those teams.

Cheers,



On Wed, Sep 19, 2001 at 05:23:21PM -0400, Mark Baker wrote:
> > I'd reiterate that other W3C XML-based formats have chosen to define
> > their own content-type. Perhaps we should explore the reasoning of
> > those groups (SVG and SMIL, to start with).
> 
> FWIW, XHTML 1.0 was held up for quite a while because of two issues;
> one, the "three namespaces vs. one" debate, and the other, that XHTML
> should not be sent as text/xml or application/xml[1].  The concern
> expressed by Sun and others was that because XML namespaces weren't well
> deployed (though that was in late '99), "img", "h1", and other well known
> HTML elements (or perhaps all of HTML) would somehow find special status
> in a "root namespace" such that they would be usable as-is in other XML
> formats that didn't use namespaces.
> 
>  [1] http://www.w3.org/TR/1999/PR-xhtml1-19990824/#media
> 
> MB

-- 
Mark Nottingham
http://www.mnot.net/
 

Received on Wednesday, 19 September 2001 18:10:19 UTC