Re: Proposal for multi-reference support in MTOM

Marc,

I, too, finally got to read the actual proposal. 

Personally, I don't think we should encourage the creation of XML
formats that duplicate large parts of data in a significant portion of
their usage.

I believe formats where big data may be used in multiple places will
already have a reference mechanism. Your usual scenario is the
serialization of Java data, for example in JAX-RPC. But since JAX-RPC
will want to serialize multiple references to any Java object as such
(multiple references to one chunk of data), it will have to use some
referencing mechanism akin to SOAP 1.1 Section 5 Encoding. This
mechanism would naturally apply to large binary objects, too.

What you're proposing is that your application know about a SOAP binding
optimization and rely on it. I suggest that you extract the relevant
part of the optimization - called referencing - and incorporate it into
your application. If you're looking for a standard in this area, there's
XML Linking [1]. 8-) I don't think the Representation header is
necessarily what you want.

I see how the referencing part could be folded into the binding layer,
possibly eliminating some cases of references to other references on the
wire, but in doing so you limit your application to one optimization
technique and mandate that optimization (a mandatory optimization sounds
totally wrong). I think we should separate the concerns here into
different layers.

Hope it helps,

                   Jacek Kopecky

                   Systinet Corporation
                   http://www.systinet.com/

[1] http://www.w3.org/XML/Linking



On Wed, 2003-11-12 at 21:47, Marc Hadley wrote:
> Here's a proposal for an extension to the current MTOM formulation to 
> offer better support for multiple inclusion of the same data. The 
> proposed extension has the following properties:
> 
> - Preserves MTOM semantics of attachment inclusion in SOAP message 
> infoset
> - Supports existing 'Include' and 'Representation' semantics, use of 
> extension is optional
> - Supports multiple inclusion of attachments without replication of 
> data in the serialized form
> - Multiply included data replicated in message infoset, signatures over 
> elements containing such data include attachment data rather than a 
> reference to the data as would be the case when using a Representation 
> approach.
> 
> 
> Infoset Form
> ============
> 
> This section shows via an example the infoset of a message after the 
> binding has performed the MTOM deserialization (described later). XML 
> 1.0 is used as the most convenient syntax to express the infoset but 
> this should be considered a purely abstract model of the message 
> content.
> 
> <env:Envelope xmlns:env="..." xmlns:mtom="...">
>    <env:Body>
>      <app:Stuff xmlns:app="...">
>        <app:Thing1 mtom:ContentID="someURI">
>          some base64 text
>        </app:Thing1>
>        <app:Thing2 mtom:ContentID="someURI">
>          some base64 text
>        </app:Thing2>
>        <app:Thing3>
>          some base64 text
>        </app:Thing3>
>      </app:Stuff>
>    </env:Body>
> </env:Envelope>
> 
> Note that the same base64 data is included as the content of the Thing1 
> and Thing2 EIIs, this is indicated by the value of the mtom:ContentID 
> attribute being the same for both. Thing3 has no mtom:ContentID 
> indicating that the optional multi-reference extension is not being 
> used for the content of this EII.
> 
> 
> Optimized (MIME) Wire Form
> ==========================
> 
> This section shows via an example the serialized form of a message 
> using the MIME based MTOM.
> 
> Content-type: multipart/related; boundary="someBoundaryString"
> 
> --someBoundaryString
> Content-Type: application/soap+xml
> 
> <env:Envelope xmlns:env="..." xmlns:mtom="...">
>    <env:Body>
>      <app:Stuff xmlns:app="...">
>        <app:Thing1 mtom:ContentID="someURI">
>          <mtom:Include href="someURI"/>
>          <!-- depending on how mtom:ContentID is defined, the Include/@href may be redundant -->
>        </app:Thing1>
>        <app:Thing2 mtom:ContentID="someURI">
>          <mtom:Include href="someURI"/>
>        </app:Thing2>
>        <app:Thing3>
>          <mtom:Include href="someOtherURI"/>
>        </app:Thing3>
>      </app:Stuff>
>    </env:Body>
> </env:Envelope>
> 
> --someBoundaryString
> Content-Type: image/png
> Content-ID: someURI
> 
> binary picture data
> 
> --someBoundaryString
> Content-Type: image/png
> Content-ID: someOtherURI
> 
> binary picture data
> 
> --someBoundaryString--
> 
> 
> Schema Types
> ============
> 
> <complexType name="OptimizationCandidate">
>    <simpleContent>
>      <extension base="xsd:base64Binary">
>        <attribute name="ContentID" type="xsd:anyURI"/>
>        <attribute name="MediaType" type="xsd:string"/>
>        <!-- other attributes we define -->
>      </extension>
>    </simpleContent>
> </complexType>
> 
> Terminology
> ===========
> 
> The following terminology is used in the description of the 
> serialization and deserialization algorithms:
> 
> Optimization candidate:
>    EII of type xsd:base64Binary or mtom:OptimizationCandidate.
> 
> Matching MIME part:
>    MIME part whose content-id and/or content-location headers (TBD 
> specify exact matching criteria) match an 
> OptimizationCandidate/@ContentID.
> 
> Content:
>    base64Binary child CIIs of an optimization candidate (excludes AII 
> children)
> 
> 
> Infoset to Wire Serialization
> =============================
> 
> For each optimization candidate in the SOAP message
>      - if no matching MIME part exists then create a matching MIME part 
> from the optimization candidate's decoded content and AIIs
>      - replace the content of the optimization candidate with a child 
> mtom:Include EII
> 
> 
> Wire to Infoset Deserialization
> ===============================
> 
> For each mtom:Include EII
>      - replace the mtom:Include EII with base64 encoded attachment 
> content
> 
> --
> Marc Hadley <marc.hadley@sun.com>
> Web Technologies and Standards, Sun Microsystems.
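
For illustration, the two algorithms quoted above ("Infoset to Wire
Serialization" and "Wire to Infoset Deserialization") can be sketched in
Python. This is only a minimal sketch under stated assumptions, not part
of either message: the Candidate class, the serialize and deserialize
functions, and the generated part names are hypothetical. A "matching
MIME part" is taken to mean an exact Content-ID string match (the
proposal leaves the matching criteria TBD), and MediaType and other AIIs
are ignored.

```python
import base64
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """Hypothetical stand-in for an optimization candidate EII."""
    name: str
    content: Optional[str] = None       # base64 text (infoset form)
    content_id: Optional[str] = None    # mtom:ContentID, if the extension is used
    include_href: Optional[str] = None  # mtom:Include/@href (wire form)


def serialize(candidates: List[Candidate], parts: Dict[str, bytes]) -> Dict[str, bytes]:
    """Infoset to wire: one MIME part per distinct ContentID."""
    for c in candidates:
        href = c.content_id
        if href is None:
            # Assumption: candidates without a ContentID get a fresh part name.
            n = 1
            while f"generated-{n}" in parts:
                n += 1
            href = f"generated-{n}"
        # Create a part only if no matching part exists yet.
        if href not in parts:
            parts[href] = base64.b64decode(c.content)
        # Replace the content with a reference (the mtom:Include child).
        c.content = None
        c.include_href = href
    return parts


def deserialize(candidates: List[Candidate], parts: Dict[str, bytes]) -> None:
    """Wire to infoset: replace each Include with base64-encoded part content."""
    for c in candidates:
        if c.include_href is not None:
            c.content = base64.b64encode(parts[c.include_href]).decode("ascii")
            c.include_href = None


if __name__ == "__main__":
    picture = base64.b64encode(b"binary picture data").decode("ascii")
    things = [
        Candidate("Thing1", picture, "someURI"),
        Candidate("Thing2", picture, "someURI"),
        Candidate("Thing3", picture),
    ]
    parts = serialize(things, {})
    print(sorted(parts))  # part names on the wire
    deserialize(things, parts)
    print(all(t.content == picture for t in things))
```

With the three Thing elements from the example, Thing1 and Thing2 share
a single part on the wire while Thing3 gets its own, and deserialization
restores the replicated base64 content in each element.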

Received on Friday, 28 November 2003 11:00:52 UTC