- From: Kodichath, Suresh <SKODICHA@iona.com>
- Date: Tue, 12 Oct 2004 16:09:23 -0400
- To: <xmlp-comments@w3.org>, "A.Vine" <andrea.vine@Sun.COM>, <public-i18n-ws@w3.org>
- Cc: <w3c-xml-protoco-wg@w3.org>
- Message-ID: <244F5835C09CB641AE1D928BB2B0B9D83349AC@amereast-ems2.boston.amer.iona.com>
Dear Andrea and I18nWSTF,

We received your note [1] which raises a number of questions about our issue 501 [2]. This note deals primarily with your concern that:

"We also believe that forcing base64 encoding on readable text is a mistake which will introduce a number of problems, not the least of which is masking the fact that the inline data needs to be tagged. We feel it should be strongly discouraged, if not disallowed."

We think that perhaps the design point of XOP and MTOM may have been misunderstood. The intended purpose of XOP and MTOM is to allow binary octet streams to be carried in XML documents in a manner that allows for good optimization of storage and networking formats. Stated differently, the purpose of XOP and MTOM is to provide for tunnelling of binary data in an optimized way. Like most filesystems and other systems that manage octet streams, we specifically avoid "spying on" the contents of that data or providing for special behavior according to the contents. We don't provide special facilities for the case where the stream happens to be "image/jpeg" and we don't for "text/*" either.

While it's true that any system that supports octet streams can be used in the special case where the content happens to be encoded text, that is not the focus of XOP or MTOM, and we are reluctant to provide special mechanisms for dealing with text. Indeed, where such special handling is desired, XOP and MTOM should not be used. We think that users can make that decision according to their needs. When XOP and MTOM are used, they should be viewed as a completely opaque tunnel; the data should be treated as characters only before it is encoded or after it has been "extracted" from its encapsulated form.

You ask: "why base64Binary" for text input streams? As discussed above, we strongly believe that we should not treat text differently from other octet streams.
To reiterate why we use base64Binary for all such streams: XOP and MTOM do their jobs by establishing a correspondence between binary data stored in its "native" form as a "part" in MIME, and a corresponding character representation in an Infoset. For this, base64Binary provides a natural and standardized character representation. It is a byproduct of this design that if a user chooses to tunnel character data as if it were binary, the representation will indeed seem somewhat more appropriate to binary content than to text. In situations where this is not the desired behavior, XOP and MTOM should not be used.

We note as an aside that if someone did decide to use MTOM with, say, an XHTML document encoded in BIG5, and if you looked at the corresponding XOP MIME part, you would find exactly the BIG5 stream, not the base64Binary characters; the base64Binary characters are an artifact defined by the specification, to be used only in the case where the application specifically needs a view of the data in the Infoset. For example, one might compute a digital signature for the entire containing infoset, including the base64 characters corresponding to the nested document. It is anticipated that realistic implementations will not in fact surface the base64 character form on the wire, in memory, or through APIs unless requested for some such purpose. So, there is emphasis in practice on dealing directly with the BIG5, in this example, as opposed to the base64Binary encoding of the BIG5.

A related issue about which you ask is the means by which metadata about the nested or tunneled octet stream can be conveyed. This is architecturally orthogonal to XOP and MTOM in our design. For example, we recommend the use of the xmime:content-type attribute [3] with base64-encoded octet streams in any case where they correspond to MIME-typed documents, and regardless of whether such content is to be optimized with XOP or conveyed in a normal XML 1.0 or XML 1.1 character stream.
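
As a rough illustration only (it is not taken from the XOP, MTOM, or media-types specifications), the correspondence described above can be sketched in Python. The sketch assumes a short BIG5-encoded text as the tunneled octet stream and a media type string of the kind that would be carried alongside it; the helper names `to_infoset_chars`, `from_infoset_chars`, and `charset_of` are hypothetical and exist only for this example.

```python
import base64
from email.message import Message
from typing import Optional


def to_infoset_chars(octets: bytes) -> str:
    """Return the base64 character view of an opaque octet stream."""
    return base64.b64encode(octets).decode("ascii")


def from_infoset_chars(chars: str) -> bytes:
    """Recover the original octets from their base64 character view."""
    return base64.b64decode(chars)


def charset_of(media_type: str) -> Optional[str]:
    """Extract the charset parameter from a media type string, if any."""
    msg = Message()
    msg["Content-Type"] = media_type
    return msg.get_content_charset()


# Hypothetical payload: text that the sender chose to tunnel as binary.
media_type = "application/xhtml+xml; charset=big5"
original_text = "中文"

# What the optimized MIME part would hold: the raw BIG5 octets.
octets = original_text.encode("big5")

# What the Infoset view holds: base64 characters for the same octets.
chars = to_infoset_chars(octets)

# The tunnel is opaque: the octets survive the round trip unchanged.
assert from_infoset_chars(chars) == octets

# Only after extraction, and only with the separately conveyed metadata,
# is the data treated as characters again.
recovered_text = from_infoset_chars(chars).decode(charset_of(media_type))
assert recovered_text == original_text
```

The point of the sketch is that the base64 step never inspects the payload; interpreting the octets as text depends entirely on the media type metadata that travels with them.
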
Conversely, other forms of description could be used without changing MTOM or XOP. We have taken the trouble to define the one attribute, xmime:content-type, that we feel will be of particularly general utility; we invite you and other members of the XML community to define additional such attributes that may be necessary for purposes such as i18n.

We also note that the working draft at [3] says of the xmime:content-type attribute:

"The [normalized value] of the contentType attribute information item MUST be the name of a IANA media type token, e.g., "image/png", "text/xml; charset=utf-16""

which specifically illustrates the use of charset. We generally decline to provide normative information in two places; in this case, we think that it's appropriate that use of the charset specification is indeed documented with the normative recommendation for the xmime:content-type attribute and not in XOP or MTOM themselves.

We hope that this note clarifies the reasons for the decisions we have made.

Regards,
Suresh
On behalf of Noah Mendelsohn
XMLP Working Group.

[1] http://lists.w3.org/Archives/Public/xmlp-comments/2004Sep/0018.html
[2] http://www.w3.org/2000/xp/Group/xmlp-cr-issues.html#x501
[3] http://www.w3.org/TR/2004/WD-xml-media-types-20040608/#contentType
Received on Tuesday, 12 October 2004 20:10:49 UTC