- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 27 Apr 2005 16:49:39 -0400
- To: Mark Baker <distobj@acm.org>
- Cc: xml-dist-app@w3.org
Mark,

This is very helpful, thank you. I'm tempted to ask why SOAP is any more broken or at risk than XML itself? In the case of XML, we have a media type "application/xml". That tells you to expect perhaps an XML declaration, perhaps an internal subset, and then a root element the nature of which is completely unknown. How are these disambiguated in practice? How do I recognize a purchase order from an invoice? Answer: if the root is namespace qualified you infer from the QName. You might reasonably say that I could also register application/purchaseOrder+xml, and that would indeed add additional out of stream information, but surely the creation of such a media type is optional. There is surely no rule that a new media type is to be registered for each XML root element QName.

Now consider SOAP documents, such as those exchanged by the SOAP HTTP binding. The binding serializes the Infoset to a stream of type application/soap+xml, which is a specialization of application/xml. The specialization tells you some additional things, such as that the root type will be soap:Envelope. Crucially, it also tells you that there will be a body containing an XML element. I fail to see how you know any more or less about the body element than you generally do about the root of an application/xml stream. In both cases, you know that it's an XML element and that you need to recognize the QName to infer its type.

SOAP looks to me no more or less broken than XML itself. In both cases, you need to look at the QName to see what's going on. For application/xml, it's the root QName, for application/soap+xml it's the body child element. By the way, if media types were enhanced to get rid of the 1-level + sign kludge, then you might do something like registering: application/purchaseorder+soap+xml, the same option you have at the root level today.

Also, I think one can make the case that for each application/*+xml media type there should be a dual for the typing of Infosets. The media type covers the serialized form, and the dual types the corresponding infoset. Whether the duals are given separate names, or whether the use of the media type name is licensed for both I'm not sure I care. I do think that the two types are different, as one is streams and one in general isn't.

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Mark Baker <distobj@acm.org>
04/27/2005 03:34 PM

To: noah_mendelsohn@us.ibm.com
cc: xml-dist-app@w3.org
Subject: Re: soap:body and media types (fwd)

Hi Noah,

On Mon, Apr 25, 2005 at 04:09:03PM -0400, noah_mendelsohn@us.ibm.com wrote:
> Mark Baker writes:
>
> > Something Noah Mendelsohn said at the technical
> > plenary week about SOAP & media types, made me
> > realize that the SOAP envelope currently has a
> > problem; that it cannot communicate the media type
> > of the document encapsulated within the SOAP body.
>
> There are some subtleties here, I think, some of which were obliquely
> touched upon at the TAG F2F in Boston. As far as I know, media types
> apply to octet streams. SOAP envelopes are not in general octet streams,
> but are instead Infosets.

Fair enough. I've never really bought the whole Infoset thing, but I respect that SOAP is defined in those terms.

> The content of the body is an Element Info
> Item. Consider, for example, an implementation that uses SOAP for
> communication between processes on a single machine. It would be quite
> reasonable to have a SOAP implementation that communicates using DOM or
> SAX, without ever serializing to an octet stream.

Depends what you mean by "octet stream", I guess. I just think of it as the message payload, and, at least by my definition of "message", messages are still exchanged in-process, including in your example.

Of course, the information normally conveyed by a media type need not be conveyed explicitly in that message, but may instead be established out of band. For example, imagine that we copy RFC 2616, except we mandate that the Content-Type header must always be text/html. If we then register port 55555 with IANA, and associate it with this new spec, then we know that all messages on port 55555 are declaring that their payloads have text/html semantics.

So the media type needn't be a part of every exchange. But in the absence of any out of band information, I think it's needed, otherwise you have a loss of self-description and resulting ambiguity.

I'm not sure about the intricacies of EIIs, but if you mean effectively an Infoset with a single EII, then I could well imagine exchange scenarios involving no out of band information and therefore the need for a data semantic indication mechanism like a media type.

> It is true that SOAP
> envelopes as serialized by the normal HTTP binding are octet streams,
> typically of media type application/soap+xml. As I recall you are not a
> particular fan of protocol independence, Mark, but SOAP has it, and SOAP
> envelopes are Infosets.

SOAP "has" protocol independence in that it supports multiple underlying protocols, and I'm very supportive of that. FWIW, I'm just not supportive of the kind of protocol independence where developers are isolated from the semantics of underlying application protocols; something SOAP doesn't, and shouldn't, say much about.

Hmm, I'm not sure that was relevant to my point, but oh well.

> Thus, I think there are at least two questions implicitly raised by your
> note:
>
> 1. Is it appropriate to apply a media type to something other than an
> octet stream, e.g. to an element information item? I have considered
> raising this as a TAG issue, but it seems to me that it is not in any
> case appropriately a decision for the XMLP WG.

Ok.

> 2. I suspect the answer at the moment is "no", but let's assume for sake
> of discussion it's actually "yes": then we can ask whether the subtrees
> carried within SOAP bodies in particular should be media typed? Note
> that, in part due to limitations of XML itself, these are not in general
> XML documents. They cannot have their own XML declarations, internal
> subsets, etc. They are XML fragments, or more specifically element info
> items. Furthermore, it's not clear to me that there is an obligation to
> carry the media type even if there were one.

As above, I think there should be an obligation to use a media type when there's no out of band mechanism to accomplish the task. To not do so would make it impossible to distinguish, for example, between a SOAP message carrying an XHTML document, and one carrying a shortform XSLT stylesheet, since they'd be bytewise identical;

http://www.markbaker.ca/Talks/2004-media-types-and-compdocs/slide4-0.html

> I think the main architectural question is #1. If that is resolved in
> favor of typing infoset subtrees, then it would be straightforward to
> define a SOAP header that would be usable to carry the type.

I think both are important. But even if the answer to #1 was "No", it still seems to me that XML SOAP messages transferred via application protocols which provide no out of band indication of the semantics of the data (e.g. SMTP, HTTP, but not FTP), should use a media type.

Hmm, I see I repeated my main point a couple of times. Sorry, that's what I get for writing a message over several sessions! 8-/

Mark.
-- 
Mark Baker. Ottawa, Ontario, CANADA. http://www.markbaker.ca
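
For readers who want the QName argument in concrete form, here is a minimal Python sketch of the lookup Noah describes: for application/xml the identifying QName is that of the root element, and for application/soap+xml it is that of the first child of the SOAP Body. The function name payload_qname, the purchase order namespace http://example.org/po and the sample documents are hypothetical, chosen only for illustration; the sketch also assumes SOAP 1.2 and its envelope namespace.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace (assumption: the thread is about SOAP 1.2).
SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"

# Hypothetical application namespace, used only for this illustration.
PO_NS = "http://example.org/po"


def payload_qname(media_type: str, data: bytes) -> str:
    """Return the QName that identifies the payload, in ElementTree's
    '{namespace}localname' notation."""
    root = ET.fromstring(data)
    if media_type == "application/xml":
        # Plain XML: the root element's QName is all there is to go on.
        return root.tag
    if media_type == "application/soap+xml":
        # SOAP: the root must be the Envelope; the QName of interest
        # is that of the first child element of the Body.
        if root.tag != f"{{{SOAP12_NS}}}Envelope":
            raise ValueError("root element is not a SOAP 1.2 Envelope")
        body = root.find(f"{{{SOAP12_NS}}}Body")
        if body is None or len(body) == 0:
            raise ValueError("SOAP Body is missing or empty")
        return body[0].tag
    raise ValueError(f"unsupported media type: {media_type}")


bare = f'<po:purchaseOrder xmlns:po="{PO_NS}"/>'.encode()
wrapped = (
    f'<env:Envelope xmlns:env="{SOAP12_NS}"><env:Body>'
    f'<po:purchaseOrder xmlns:po="{PO_NS}"/>'
    f'</env:Body></env:Envelope>'
).encode()

# Both lookups yield the same QName; only where it is found differs.
print(payload_qname("application/xml", bare))
print(payload_qname("application/soap+xml", wrapped))
```

The sketch only shows where the QName is found in each case. It does not settle Mark's objection: where the same QName can stand for two different kinds of document, as in his XHTML versus shortform XSLT example, this lookup returns the same answer for both.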
Received on Wednesday, 27 April 2005 20:49:49 UTC