Re: Proposed new issue: variability of encoding in Miffy

Reviewing this thread, and remembering our discussion on the phone last 
week, I think I see two proposals being discussed:

* [Mark Nottingham et al.] The original data is binary, represented in 
the Infoset in the xsd:base64Binary lexical form.  The most optimized 
representation in the Miffy multipart is encoded as "binary", but some 
users of Miffy may not be capable of handling this.  Furthermore, a need 
is asserted to convert from one encoding to another based only on the 
Miffy representation, i.e. with no knowledge of the XML or the Infoset. 
Accordingly, the proposal is to allow >>those encodings suitable for the 
representation of binary data<<, and to state that Miffys that have the 
same data, differing only in the encoding, are semantically identical.

* [Anish Karmarkar, possibly et al.]  Some binary data is known to be, 
for example, 7 bit clean at the source.  Consider, for example, a user 
who created an XML element from a text file known to be 7 bit clean.  If 
one is willing to claim at the xsd typing level that the element is in 
fact base64Binary, then the octet stream comprising the 7 bit text is 
represented in the Infoset in the base64Binary lexical form, which is 
roughly one-third larger, but per the Miffy spec is promptly converted 
back to its original octet stream for transmission in Miffy.  I believe 
the suggestion is that in such situations, where you know that the 
"binary" is in fact 7 bit text, you should be able to set the encoding 
accordingly.  I suppose there is also a question of whether all this 
should require you to claim in the data model that xsd:base64Binary is 
being used, or whether some other type should be allowed.
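
To make the difference between the two proposals a bit more concrete, 
here is a rough Python sketch.  It is an illustration only: the helper 
names (is_7bit_clean, decode_part, same_data) are made up for this note 
and do not come from the Miffy draft, and the 7 bit check looks only at 
the high bit of each octet.

```python
import base64

def is_7bit_clean(octets: bytes) -> bool:
    # Hypothetical helper: checks only that no octet has the high bit set.
    # MIME's "7bit" also limits line length and forbids NUL; not checked here.
    return all(b < 0x80 for b in octets)

def decode_part(body: bytes, encoding: str) -> bytes:
    # Recover the original octets of a part from its transfer encoding.
    enc = encoding.lower()
    if enc in ("binary", "8bit", "7bit"):
        return body  # identity encodings: the octets are sent as-is
    if enc == "base64":
        return base64.b64decode(body)
    raise ValueError(f"unsupported encoding: {encoding}")

def same_data(part_a, part_b) -> bool:
    # Each part is an (encoding, body) pair; compare the decoded octets.
    return decode_part(part_a[1], part_a[0]) == decode_part(part_b[1], part_b[0])

original = b"plain ASCII text from a file known to be 7bit clean\r\n" * 3
lexical = base64.b64encode(original)  # what the Infoset would carry

print(len(original), len(lexical))    # base64 grows the data by about 4/3
print(is_7bit_clean(original))
print(same_data(("binary", original), ("base64", lexical)))
```

The same_data comparison is what the first proposal would treat as 
semantic identity; the is_7bit_clean check is the extra knowledge about 
the source that the second proposal relies on.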

My main point here is to suggest that both of these have been discussed, 
that they are different, and that if so we need to keep straight which 
proposal we are discussing at any point in time.  Also, each of these has 
a variation that says:  "Variability allowed by Miffy, but not by the new 
HTTP binding."

I think both of these go somewhat beyond the original mandate of MTOM.  I 
can more easily see the rationale for Mark's proposal, but have a fairly 
strong opinion that if we go there it should be for Miffy but not for our 
HTTP binding.  I really want to maximize interop of the HTTP bindings, and 
minimize the code that's required to achieve such interop.  Since HTTP is 
binary-clean in any case, allowing variability there would just require 
extra code in conforming implementations, more interop testing, etc.  In 
practice, I would expect interop to be reduced.  So, I would either 
disallow variability everywhere, or allow a choice of binary, not text, 
representations in Miffy and (a) state in Miffy that each application of 
Miffy must specify the allowed encodings and (b) allow only binary in the 
HTTP binding.
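
As a sketch of what (a) and (b) might look like to an implementation, 
assume each application of Miffy publishes its own set of allowed 
encodings.  The table keys, the sets, and the check_encoding function 
below are hypothetical names chosen for this note, not anything taken 
from a spec.

```python
# Hypothetical policy table: neither the keys nor the sets come from a spec.
ALLOWED_ENCODINGS = {
    # (a) a generic application of Miffy names the encodings it accepts
    "miffy-generic": frozenset({"binary", "base64"}),
    # (b) the HTTP binding accepts only "binary"
    "http-binding": frozenset({"binary"}),
}

def check_encoding(application: str, encoding: str) -> None:
    # Raise if the application does not allow this transfer encoding.
    allowed = ALLOWED_ENCODINGS.get(application)
    if allowed is None:
        raise KeyError(f"unknown application of Miffy: {application}")
    if encoding.lower() not in allowed:
        raise ValueError(
            f"{encoding!r} not allowed for {application}; "
            f"allowed: {sorted(allowed)}"
        )

check_encoding("miffy-generic", "base64")  # accepted
check_encoding("http-binding", "binary")   # accepted
```

With a table like this, a conforming HTTP binding implementation has 
only one code path to write and test, which is the interop argument 
made above.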

I guess I'm not yet comfortable with Anish's proposal, but I may be 
missing something.  It seems to me that Miffy and MTOM are mostly about 
binary data, and 7bit is about text.

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Tuesday, 13 January 2004 18:39:08 UTC