Re: First Draft of an MTOM Formulation based on the Query Data Model from noah_mendelsohn@us.ibm.com on 2003-09-02 (xml-dist-app@w3.org from September 2003)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 2 Sep 2003 17:18:03 -0400
To: "Ugo Corda" <UCorda@SeeBeyond.com>
Cc: xml-dist-app@w3.org
Message-ID: <OFF0EE043D.E5BDE89D-ON85256D95.00748036@lotus.com>
Ugo Corda writes:

> Noah,
> 
> The more I think about this the more it seems to me
> that all we are doing here is specifying a particular
> instance of Infoset serialization. This is not
> intended, of course, to diminish the value of this
> activity, but to put it in the regular context of
> Infoset serialization. In other words, instead of
> looking at this work as a way of going beyond the
> Infoset by using "a typed superset of the Infoset", we
> could look at it as just a particular type of Infoset
> serialization. In that respect, all we are doing here
> is exactly part of what SOAP normally prescribes,
> i.e. choosing a particular Infoset serialization of the
> SOAP envelope and sending it over the wire.

I can see that view, but this is not just any serialization.  It is one in 
which all the critical choices are based on knowledge of type (or, and 
this is indeed an interesting distinction, recognition that the lexical 
form is compatible with certain types.)
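
To make that distinction concrete, here is a minimal sketch (not anything 
from the MTOM draft) of what "recognizing that the lexical form is 
compatible" with base64 might look like. The function name is hypothetical, 
and the round-trip test is just one assumed way to define compatibility: 
the characters qualify only if decoding and re-encoding reproduces them 
exactly.

```python
import base64
import binascii


def is_canonical_base64(lexical: str) -> bool:
    """Return True if `lexical` survives a decode/re-encode round trip.

    Hypothetical helper: it treats a string as "compatible" with the
    base64 type only when the binary value regenerates the very same
    characters, so nothing is lost by sending the bytes instead.
    """
    try:
        # validate=True rejects characters outside the base64 alphabet,
        # including embedded whitespace.
        raw = base64.b64decode(lexical, validate=True)
    except (binascii.Error, ValueError):
        return False
    # Non-canonical forms (e.g. stray bits in the final character) decode
    # to some value but do not re-encode to the same string.
    return base64.b64encode(raw).decode("ascii") == lexical
```

Under this assumption, "aGVsbG8=" would qualify, while the same characters 
with an embedded space would not, even though a type-aware processor might 
consider both to denote the same value.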
 
> By definition, an Infoset serialization consists of a
> concrete representation of the Infoset (e.g. a
> traditional character-based angle-brackets
> representation, an in-memory DOM-based representation,
> etc.) plus some rules that allow us to map from the
> abstract Infoset to the concrete representation. In our
> case, we choose the MIME Multipart/Related packaging as
> the base for our concrete Infoset representation.

Yes.
 
> We still have to decide how to concretely represent
> Character Information Items. Instead of using the usual
> approach of representing them as characters in a
> string, we want something more compact (at least in
> some cases). Here comes the trick (i.e. the
> serialization rule) we use for achieving this more
> compact representation. The rule is:
 
> - associate a type with particular Infoset string
> values (based on previous Schema validation, or
> whatever else)
> 
> - use that type to binary encode the Infoset string
> value (i.e. if the type is base64 get the corresponding
> binary value, if the type is integer get some binary
> representation of the corresponding integer value,
> etc.)
> 
> At this point we have a complete serialization
> mechanism for our Infoset, which allows us to go from
> abstract Infoset to concrete representation and vice
> versa.
> 
> From this perspective, reference to the XQuery Data
> Model is useful but not strictly necessary. In
> particular, we don't need to think of the SOAP envelope
> Infoset being sent as a typed augmentation of the
> traditional Infoset. The "typing trick" is below the
> cover, part of the Infoset serialization machinery, and
> does not need to appear either in the SOAP envelope
> Infoset being sent or in the SOAP envelope Infoset
> being received.

I agree we don't have to use the Data Model, but I'm suggesting it may be 
desirable.  I think we have a tradition in W3C of trying to build on 
existing specifications and abstractions.  MTOM by its nature requires 
making statements about the type of the nodes to be optimized.  We have an 
emerging W3C Recommendation for how to discuss typed nodes, and that's the 
data model.  We don't have to use it, but we probably should.  In addition 
to avoiding the need to write our own prose describing the lexical/value 
correspondence, we make it easier for others to use MTOM to describe the 
efficient transmission of Query results, or to use MTOM as a building 
block for a future spec that would send the full typed Infoset.
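
As a rough illustration of the lexical/value correspondence at issue, the 
sketch below maps a (type, lexical form) pair to bytes and back. It is only 
an assumption-laden example: the function names are hypothetical, only two 
schema types are handled, and the big-endian two's-complement layout for 
integers is an arbitrary choice made here, not something either of us has 
proposed for MTOM.

```python
import base64


def to_binary(type_name: str, lexical: str) -> bytes:
    """Map a lexical form to a binary value, driven by its type."""
    if type_name == "xs:base64Binary":
        return base64.b64decode(lexical, validate=True)
    if type_name == "xs:integer":
        n = int(lexical)
        # Smallest-ish big-endian two's-complement size with a sign bit
        # (an arbitrary layout chosen for this sketch).
        length = (n.bit_length() + 8) // 8
        return n.to_bytes(length, "big", signed=True)
    raise ValueError(f"no binary mapping for type {type_name!r}")


def to_lexical(type_name: str, data: bytes) -> str:
    """Map a binary value back to a lexical form for the same type."""
    if type_name == "xs:base64Binary":
        return base64.b64encode(data).decode("ascii")
    if type_name == "xs:integer":
        return str(int.from_bytes(data, "big", signed=True))
    raise ValueError(f"no lexical mapping for type {type_name!r}")
```

Note that the mapping goes from lexical forms to values, so it is not 
always reversible character for character: in this sketch "+42" and "42" 
produce the same bytes, and only "42" comes back. That is the kind of 
prose about lexical versus value space we would otherwise have to write 
ourselves.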
 
> Ugo 

Noah

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Tuesday, 2 September 2003 17:17:09 UTC