Issue 501 is closed again. from Kodichath, Suresh on 2004-10-12 (xmlp-comments@w3.org from October 2004)

From: Kodichath, Suresh <SKODICHA@iona.com>
Date: Tue, 12 Oct 2004 20:09:27 +0000
To: <xmlp-comments@w3.org>, "A.Vine" <andrea.vine@Sun.COM>, <public-i18n-ws@w3.org>
Cc: <w3c-xml-protoco-wg@w3.org>
Message-ID: <244F5835C09CB641AE1D928BB2B0B9D83349AC@amereast-ems2.boston.amer.iona.com>

Dear Andrea and I18nWSTF,
   We received your note [1] which raises a number of questions about our
issue 501 [2].  This note deals primarily with your concern that:

"We also believe that forcing base64 encoding on readable text is a mistake
which will introduce a number of problems, not the least of which is
masking the fact that the inline data needs to be tagged.  We feel it
should be strongly discouraged, if not disallowed."

We think that perhaps the design point of XOP and MTOM may have been
misunderstood.  The intended purpose of XOP and MTOM is to allow binary
octet streams to be carried in XML documents in a manner that allows for
good optimization of storage and networking formats.  Stated differently,
the purpose of XOP and MTOM is to provide for tunnelling of binary data in
an optimized way.    Like most filesystems and other systems that manage
octet streams, we specifically avoid "spying on" the contents of that data
or providing for special behavior according to the contents.  We don't
provide special facilities for the case where the stream happens to be
"image/jpeg" and we don't for "text/*" either.  While it's true that any
system that supports octet streams can be used in the special case where
the content happens to be encoded text, that is not the focus of XOP or
MTOM, and we are reluctant to provide special mechanisms for dealing with
text.   Indeed, where such special handling is desired, XOP and MTOM should
not be used.  We think that users can make that decision according to their
needs.  When XOP and MTOM are used, they should be viewed as a completely
opaque tunnel;  the data should be treated as characters only before it is
encoded or after it has been "extracted" from its encapsulated form.

You ask:  "why base64Binary" for text input streams?  As discussed above,
we strongly believe that we should not treat text differently from other
octet streams.  To reiterate why we use base64Binary for all such streams:
XOP and MTOM do their jobs by establishing a correspondence between binary
data stored in its "native" form as a "part" in MIME, and a corresponding
character representation in an Infoset.  For this, base64Binary provides
a natural and standardized character representation.  It is a byproduct of
this design that if a user chooses to tunnel character data as if it were
binary, the representation will indeed seem somewhat more appropriate to
binary content than to text.  In situations where this is not the desired
behavior,  XOP and MTOM should not be used.
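
A minimal sketch of the correspondence described above, assuming Python and
its standard base64 module.  The sample string and variable names are
illustrative only and do not come from the XOP or MTOM specifications.

```python
import base64

# Hypothetical sample: character data that a user chooses to tunnel as binary.
text = "中文"
octets = text.encode("big5")  # the "native" octet stream, as held in a MIME part

# Infoset view: the base64Binary character representation of those octets.
infoset_chars = base64.b64encode(octets).decode("ascii")

# The mapping is lossless in both directions and never inspects the content.
assert base64.b64decode(infoset_chars) == octets
assert base64.b64decode(infoset_chars).decode("big5") == text
```

The encoding step treats the BIG5 octets exactly as it would treat the octets
of an image; nothing in it knows or cares that the content is text.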

We note as an aside that if someone did decide to use MTOM with, say, an
XHTML document encoded in BIG5, and if you looked at the corresponding XOP
MIME part, you would find exactly the BIG5 stream, not the base64Binary
characters;  the base64Binary characters are an artifact defined by the
specification, to be used only in the case where the application
specifically needs a view of the data in the Infoset.  For example, one
might compute a digital signature for the entire containing infoset,
including the base64 characters corresponding to the nested document.  It
is anticipated that realistic implementations will not in fact surface the
base64 character form on the wire, in memory, or through APIs unless
requested for some such purpose.  So, there is emphasis in practice on
dealing directly with the BIG5, in this example, as opposed to the
base64Binary encoding of the BIG5.
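
The signature example can be sketched roughly as follows, assuming Python.
The helper infoset_view is a hypothetical name, and a plain SHA-256 digest
stands in for a real signature computation, which would also involve
canonicalization of the containing Infoset.

```python
import base64
import hashlib


def infoset_view(part_octets: bytes) -> str:
    """Return the base64Binary characters corresponding to a raw MIME part."""
    return base64.b64encode(part_octets).decode("ascii")


# Hypothetical nested document, carried in the MIME part as raw BIG5 octets.
part_octets = "<p>中文</p>".encode("big5")

# The base64 characters are derived only when something asks for them, here a
# digest over the character form (a simplified stand-in for a signature).
digest = hashlib.sha256(infoset_view(part_octets).encode("ascii")).hexdigest()
```

Code that only needs the document itself would read part_octets directly and
never call infoset_view at all.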

A related issue about which you ask is the means by which metadata about
the nested or tunneled octet stream can be conveyed.  This is
architecturally orthogonal to XOP and MTOM in our design.  For example, we
recommend the use of the xmime:content-type attribute [3] with
base64-encoded octet streams in any case where they correspond to
MIME-typed documents, and regardless of whether such content is to be
optimized with XOP or conveyed in a normal XML 1.0 or XML 1.1 character
stream.  Conversely, other forms of description could be used without
changing MTOM or XOP.   We have taken the trouble to define the one
attribute, xmime:content-type that we feel will be of particularly general
utility;  we invite you and other members of the XML community to define
additional such attributes that may be necessary for purposes such as i18n.
We also note that the working draft at [3] says of the xmime:content-type
attribute:

"The [normalized value] of the contentType attribute information item MUST
be the name of a IANA media type token, e.g., "image/png", "text/xml;
charset=utf-16""

which specifically illustrates the use of charset.   We generally decline
to provide normative information in two places;  in this case, we think
that it's appropriate that use of the charset specification is indeed
documented with the normative recommendation for the xmime:content-type
attribute and not in XOP or MTOM themselves.
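
As a rough sketch of how a receiver might use such a media type value once
the octets have been extracted, assuming Python: the function name
decode_tunneled_text is hypothetical, and the standard email.message module
is used here only as a convenient parser for the charset parameter.

```python
import base64
from email.message import Message


def decode_tunneled_text(content_type: str, base64_chars: str) -> str:
    """Decode base64Binary characters to text using the declared charset."""
    header = Message()
    header["Content-Type"] = content_type  # e.g. the attribute's value
    charset = header.get_content_charset()
    if charset is None:
        raise ValueError("no charset parameter in " + content_type)
    return base64.b64decode(base64_chars).decode(charset)


# Hypothetical payload: a tiny XML document encoded as UTF-16, then base64.
payload = base64.b64encode("<a/>".encode("utf-16")).decode("ascii")
result = decode_tunneled_text("text/xml; charset=utf-16", payload)
```

The metadata travels alongside the octet stream; the base64 or XOP layer is
unchanged whether or not a charset is declared.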

We hope that this note clarifies the reasons for the decisions we have
made.

Regards,
Suresh 
On behalf of Noah Mendelsohn 
XMLP Working Group.

[1] http://lists.w3.org/Archives/Public/xmlp-comments/2004Sep/0018.html
[2] http://www.w3.org/2000/xp/Group/xmlp-cr-issues.html#x501
[3] http://www.w3.org/TR/2004/WD-xml-media-types-20040608/#contentType
Received on Tuesday, 12 October 2004 21:20:12 UTC