Re: Summary of Issue 194 - encodingStyle from Ray Whitmer on 2002-04-23 (xml-dist-app@w3.org from April 2002)

From: Ray Whitmer <rayw@netscape.com>
Date: Tue, 23 Apr 2002 10:47:52 -0700
To: xml-dist-app@w3.org
Message-ID: <3CC59E48.1010205@netscape.com>
Jacek Kopecky wrote:

> Ray, 
> I'm sure Henrik will reply, too, but I have a few things to say.
>
> 1) We cannot compare the namespace declarations (or xml:base) 
>with encodingStyle attribute because both namespace declarations 
>and xml:base are only used where requested (by a prefix or by a 
>relative URI), whereas this "where requested" is not at all clear 
>on encoded data; it's clear where you need to know the binding of 
>a namespace prefix, but how do you know from some data that it's 
>supposed to be encoded or not?
>
Actually, some places you need to know a namespace binding are clear, 
and others are not.  I am not sure that it is ever clear that you need 
xml:base.  Especially not at the raw infoset level.

These may affect the interpretation of data, which may be described by 
DTDs, XML Schema, etc., or understood natively by the application.  Where 
they affect the interpretation of data (xml:base never affects anything 
else), that is clearly outside the scope of the infoset, so it is quite 
possible that one application will choose to honor them in some places 
while others will not.  As has been decided for the SOAP specification, 
whether there is any real schema type there and, hence, how something 
will be interpreted, may vary from one point to the next.

Our implementation, at the low level, has two modes: raw and encoded, 
and the application may choose to supply or request data in either form.  
If raw, it deals in DOM (almost infoset) nodes; if encoded, it deals in 
objects native to the binding, which can only be supplied if there is an 
encodingStyle attribute in scope on the node in question.
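
The two modes described above might be sketched roughly as follows.  This 
is an illustration only, not the implementation referred to in this 
message: the function names (in_scope_encoding, read_node, 
decode_native) are hypothetical, the SOAP 1.1 envelope namespace is 
assumed, and decode_native is a stand-in that simply returns the 
element text.

```python
import xml.etree.ElementTree as ET

# Assumption: SOAP 1.1 envelope namespace; other SOAP versions use a different URI.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ENCODING_STYLE = f"{{{SOAP_ENV}}}encodingStyle"


def in_scope_encoding(root, node):
    """Return the encodingStyle value in scope on node, or None if there is none."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    current = node
    while current is not None:
        value = current.get(ENCODING_STYLE)
        if value is not None:
            # In SOAP 1.1 an empty value turns off any inherited encoding claim.
            return value or None
        current = parents.get(current)
    return None


def decode_native(node, style):
    """Hypothetical stand-in for a real decoder: returns the element text."""
    return node.text


def read_node(root, node, mode):
    """Return the node itself in raw mode, or a native object in encoded mode."""
    if mode == "raw":
        return node
    style = in_scope_encoding(root, node)
    if style is None:
        raise ValueError("no encodingStyle in scope; only raw access is possible")
    return decode_native(node, style)
```

In this sketch a raw request always succeeds, while an encoded request 
fails unless some ancestor (or the node itself) carries a non-empty 
encodingStyle attribute.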

At the higher level, it is assumed that you are only dealing with 
objects native to the binding, as in automatic high-level mappings 
guided by WSDL, so you have to have an explicit encoding.

>2) Changing encoding in the middle of a graph does indeed seem 
>useful, but I have yet to see a clear case where it's necessary 
>_and_ it cannot be done without the encodingStyle attribute. This 
>is related to the following.
>
The encodingStyle attribute was introduced for that purpose.  It seems 
like the proper way to do it.

If we had ever had some actual architecture design work to guide us 
here, we clearly wouldn't ask at this late date why it is there, but we 
have designed under the assumption that this is how it is intended to 
work, and it works well enough.  There would be a vacuum without it.

>3) Any implementation has to know what encoding is used if it 
>wants to help the application by not handing it the raw XML data, 
>that's true. But the usual case is that the application knows 
>what kind of encoding (well, data model in fact, but I don't 
>expect multiple similar encodings for a single data model) the 
>data should be in in the various places where data is expected. I 
>don't know of any application that does care about encodings 
>_and_ does not know what kind of data it's going to get.
> So instead of the encodingStyle information coming from the 
>message, it can come from the application. Moreover, in the 
>strongly typed languages you cannot deserialize any kind of 
>object into any kind of holder, like when switching to a "raw 
>xml" encoding from SOAP Encoding, you'd deserialize into a DOM 
>tree but it wouldn't fit into the holder for a HashMap, for 
>example. In this approach, errors in data (encodings) are 
>discovered when deserializing, not long after that or even never.
>
You cannot expect any arbitrary model to substitute for any other 
arbitrary model.  But there are a number of models that are likely to be 
interchangeable for specific classes of applications or to interconnect.

If the encoding should come from the application, then what business has 
any SOAP specification ever had talking about anything beyond the 
application?  But everyone has argued with me that we need standard 
architectures and encodings so that this type of job could be taken care 
of in the pipe, not by the application.

You can, for example, easily contemplate encoding/decoding SOAP 1.1 or 
SOAP 1.2, which will give you reasonably similar results -- close enough 
to accept or emit either.
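
One way to picture accepting either encoding is a table of decoders 
keyed by encodingStyle URI.  The sketch below is an assumption-laden 
illustration, not a description of any actual implementation: 
pick_decoder and decode_soap_value are hypothetical names, the decoder 
is a trivial stand-in, and the SOAP 1.2 URI shown is the one from the 
later W3C Recommendation rather than any draft current at the time of 
this message.

```python
# SOAP 1.1 encoding URI.
SOAP11_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
# Assumption: SOAP 1.2 encoding URI as later published in the Recommendation.
SOAP12_ENC = "http://www.w3.org/2003/05/soap-encoding"


def decode_soap_value(text):
    """Hypothetical stand-in decoder: trims whitespace and returns the string."""
    return text.strip()


# Both encodings map to the same decoder because their results are close
# enough for the application to accept either.
DECODERS = {
    SOAP11_ENC: decode_soap_value,
    SOAP12_ENC: decode_soap_value,
}


def pick_decoder(encoding_style):
    """Return the decoder for the first recognized URI in an encodingStyle value."""
    # SOAP 1.1 allows a space-separated list of URIs, most specific first.
    for uri in encoding_style.split():
        if uri in DECODERS:
            return DECODERS[uri]
    raise LookupError(f"no decoder registered for {encoding_style!r}")
```

An unrecognized encodingStyle raises LookupError here, leaving the 
caller to fall back to the raw nodes.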

I suspect it may also be reasonable to contemplate encoding/decoding RDF 
as a substitute in some bindings or applications for the default SOAP 
binding.  This seems to be in the spirit of the description of the 
encoding in SOAP 1.1.

It also seems reasonable to have a mixed model with edges between 
different types of encodings.  Something of this sort will be required 
for SOAP with attachments, as well as anything else that doesn't fit 
into the default encoding.  If the application controls this, then let's 
get it all out of the specification, and do not pretend that 
applications will be interoperable by some meta-architecture.  It will 
be by arrangement and schema, as it has always been for applications 
exchanging XML.

Ray Whitmer
rayw@netscape.com
Received on Tuesday, 23 April 2002 13:47:45 UTC