Re: Issue 231 options from Ray Whitmer on 2002-09-05 (xml-dist-app@w3.org from September 2002)

From: Ray Whitmer <rayw@netscape.com>
Date: Thu, 05 Sep 2002 10:46:17 -0700
To: XMLP Dist App <xml-dist-app@w3.org>
Message-ID: <3D779869.8020806@netscape.com>

Jacek Kopecky wrote:

> Hi all, 8-)
> this email is meant as a summary of the options we have for resolving
>LC Issue 231 [1].
>  
>
Thanks.  Below is my own point of view on these alternatives.

> Now the issue 231 is mainly about how the receiver distinguishes that
>something is an array. We have the following options here:
>
>        1. Doing nothing, relying on the receiver to know what the
>           incoming data really is (as most implementations do). In this
>           case we should clarify whether array members' accessor names
>           have to be the same or not. The spec currently says these
>           names are irrelevant, so it seems like they need not be the
>           same.
>  
>
I believe this is my preferred option.  This is based upon my belief 
that it is impossible to reliably interpret messages or even bring them 
in and out of the data model without some sort of schema describing 
them.  There are plenty of types beyond arrays for which this is true. 
 The call itself cannot even begin to be properly interpreted -- despite 
efforts to identify the return.  Those who do not believe this probably 
should not accept this answer.
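
A minimal sketch of this point in Python, assuming a toy "schema" that is 
just a dictionary from element name to compound kind.  The names SCHEMA 
and decode are hypothetical and come from no SOAP implementation; the 
sketch only shows that the receiver's external knowledge, not the wire 
format, decides how the children are read.

```python
import xml.etree.ElementTree as ET

# Hypothetical out-of-band schema: element name -> compound kind.
SCHEMA = {"scores": "array", "point": "struct"}

def decode(xml_text, schema=SCHEMA):
    """Decode one compound value using only external schema knowledge."""
    root = ET.fromstring(xml_text)
    kind = schema.get(root.tag)
    if kind == "array":
        # Accessor names are irrelevant; only position matters.
        return [child.text for child in root]
    if kind == "struct":
        # Order is irrelevant; only accessor names matter.
        return {child.tag: child.text for child in root}
    raise ValueError("no schema entry for element %r" % root.tag)
```

With the schema in hand, even an empty <scores/> element decodes as an 
empty array, which the instance document alone could not tell the receiver.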

>        2. Mandating arraySize (or itemType, but the former is
>           preferable because it has a default value now) like SOAP 1.1
>           mandated arrayType - that way a receiver will always know an
>           array when it sees one.
>  
>
Mandating itemType would introduce yet another mandatory reliance on an 
XML Schema type to specify anyType, which I think is especially not 
desirable in this case -- I have fought to keep cases that mandated 
xsi:type out of the spec and this is similar.  The schema may have its 
own implicit typing that makes it unnecessary to explicitly have the item 
type.  So I agree we should choose arraySize instead of itemType if we 
find it necessary to enforce the ability to distinguish, and I would not 
complain much.

>        3. Mandating that array members' accessors all have the same
>           name. This would unify arrays, structs and generics but then
>           the array attributes would be sticking out; also the receiver
>           would have to see the whole data to be able to differentiate
>           between an array and a struct or a generic, and this
>           distinction would still be impossible when the given compound
>           type is empty.
>  
>
This is not even a working solution for arrays, structs, or generics 
with fewer than two entries.  This defeats the ability to distinguish 
types by element names that are associated in a schema with types.  It 
mandates use of xsi:type, which is bad.  I would also argue that the 
fundamental breakage this causes is far greater than what can be 
legitimately considered in last call.
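
The failure with fewer than two entries can be sketched in Python.  The 
function guess_kind below is a hypothetical heuristic that classifies a 
compound value purely from its child element names, as option 3 would 
require; it is an illustration under that assumption, not something taken 
from the spec.

```python
import xml.etree.ElementTree as ET

def guess_kind(xml_text):
    """Guess the compound kind from accessor names alone (option 3)."""
    children = list(ET.fromstring(xml_text))
    if len(children) < 2:
        # With zero or one child, "all names equal" and "all names
        # distinct" are both trivially true, so nothing can be decided.
        return "ambiguous"
    names = {child.tag for child in children}
    if len(names) == 1:
        return "array"      # every accessor has the same name
    if len(names) == len(children):
        return "struct"     # every accessor has a distinct name
    return "generic"        # mixture of repeated and distinct names
```

For example, guess_kind("<a><item>1</item></a>") and guess_kind("<a/>") 
both fall into the ambiguous branch, and the receiver must also read every 
child before it can answer in the other cases.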

> There is a related issue of distinguishing an empty compound type from
>a simple type (like an empty string).
>
> I think this all boils down to two orthogonal questions that cover all
>the issues above:
>
>        A. Should SOAP Encoding serialization produce self-describing
>           XML? (self-describing in terms of the data structure)
>
It should be permitted by the serializer, but not required of the 
serializer, as I think was the original idea in SOAP.  Otherwise, the 
SOAP spec has to change for other user-introduced types as well such as 
simple types.

You could argue that there is currently a hole in that you cannot 
distinguish between generics and structs, and the application might 
erroneously ascribe importance to the order of the children when there 
was none (or vice versa) if it chose the wrong representation.  This is 
certainly not the only problem if you operate without a schema.

>        B. What is the relationship of generics, structs and arrays?
>  
>
See below.

> If the first answer is "no" then we may need a schema language to help
>the deserializer (see for example my message [2]) or we can rely on
>external means, but that should be said explicitly.
>  
>
I think my use cases clearly need a schema language, as is reflected in 
my implementation, but the serializers should be free to produce 
self-describing where it is useful.

> Now the second answer has (at least) two variants:
>
>   a. there are compound types, some of them structs, some of them arrays, 
>      all of them can be treated as generics
>
Not in the case of using an XML Schema.  In the case of an XML Schema, I 
think it must be a subtype of the array or struct type, or it must be a 
generic, even if it would have made a proper array or struct.  As 
written, the SOAP spec cannot directly address this, but it seems like 
the right way to apply XML schema to the spec.

>   b. there are compound types of three distinct kinds: structs, arrays and
>      generics
>  
>
Structs and arrays are, IMO, at the schema level defined as 
mutually-exclusive specializations of generics (which are the general 
complex type), which will be special cased in language bindings to 
implement those special types in languages.  Each sub-type loses one 
access method (indexed or named) of the generic supertype, so the 
subtype is not a true instance of the supertype.  For example, if a 
struct were represented as a generic (or worse, as an array) the 
application might erroneously ascribe importance to the order of the 
elements inside.
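
A rough Python sketch of this relationship, with hypothetical class names 
(Generic, Struct, Array) that stand in for whatever a language binding 
would provide.  The generic offers both indexed and named access; each 
specialization deliberately offers only one, which is why neither can be 
substituted for the generic.

```python
class Generic:
    """General compound: ordered (name, value) pairs, names may repeat."""
    def __init__(self, pairs):
        self._pairs = list(pairs)
    def by_index(self, i):
        return self._pairs[i][1]
    def by_name(self, name):
        return [value for n, value in self._pairs if n == name]

class Struct:
    """Named access only: order carries no meaning."""
    def __init__(self, pairs):
        self._fields = dict(pairs)
    def by_name(self, name):
        return self._fields[name]

class Array:
    """Indexed access only: accessor names carry no meaning."""
    def __init__(self, pairs):
        self._items = [value for _, value in pairs]
    def by_index(self, i):
        return self._items[i]
```

Code that receives a Struct cannot ask for by_index at all, so it has no 
way to read meaning into the order of the elements.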

> Because this is undecided, we have an issue about removing generics.
>IMO the presence of array attributes (arraySize, itemType) indicates
>that arrays and structs cannot be treated the same way, therefore
>pointing to answer b) above. Anyway, answering question B affects how we
>resolve all these issues.
>  
>
I might favor removing generics.  We had one good use case for them, 
the RPC call itself, as was pointed out in last-call comments, but that 
is not on the table.  But I think there are those who may have good use 
cases for them, and I think it is too late to pull them as long as we 
get interoperable implementations to get us past CR phase.

Ray Whitmer
rayw@netscape.com
Received on Thursday, 5 September 2002 13:46:50 UTC