Re: SOAP Encoding multistructs from Jacek Kopecky on 2002-02-20 (xml-dist-app@w3.org from February 2002)

From: Jacek Kopecky <jacek@systinet.com>
Date: Thu, 21 Feb 2002 00:38:54 +0100 (CET)
To: Asir S Vedamuthu <asirv@webmethods.com>
cc: <xml-dist-app@w3.org>
Message-ID: <Pine.LNX.4.33.0202210006290.3513-100000@mail.idoox.com>
 Asir,
 thanks for your comments. Please see my responses inline.

                   Jacek Kopecky

                   Senior Architect, Systinet (formerly Idoox)
                   http://www.systinet.com/



On Wed, 20 Feb 2002, Asir S Vedamuthu wrote:

 > The name multi-struct is a misnomer. Per the current specs, it is the
 > generic compound type.

There may be specific meanings of the term multistruct defined
somewhere, but I'd like to use it as shorthand for the longer
term "generic compound type".

 > > I claim that B3 together with B4 cover all
 > > the useful cases of A3 and that there
 > I don't believe they cover all the useful cases. 'Cos the 'useless' case
 > might be useful to other applications.

Asir, I'm very interested in any case not covered by either B3 or
B4 that still falls into A3. I won't be persuaded that in
practice B3+B4!=A3 until I see such a case. I do admit my 
experience in the field of data structures is limited.

My further comments in this email will still be based on the
assumption that B3 and B4 are everything there is above structs
and arrays in our data model.

 > >  Both remappings would mean complicating the
 > > XML serialization but the data would be so much
 > > more clear about their meaning.
 > Yes, it will **complicate** the status quo simplicity and **disturb** the
 > existing simple rules for serialization.

What you probably mean by the "status quo simplicity" is the
simple syntax that is the same for B3 and B4, right? As far as I
know, we don't have rules that explicitly handle multistructs, so
there is nothing to be disturbed. The serialization would become
more complicated, just as it does when you do external references
or sparse arrays, but for the sake of less ambiguity in handling.
And if we took the "removal" path, the actual SOAP Encoding would
be simplified, because representing a B3 multistruct as a struct
of arrays would lie in the application domain, not in the Encoding
(again, likewise for sparse arrays and external references).
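
As a rough illustration of the "removal" path, the regrouping could
live entirely in application code. The Python sketch below assumes a
multistruct arrives as an ordered list of (name, value) pairs; the
function name and the sample data are hypothetical and not part of
SOAP Encoding.

```python
def to_struct_of_arrays(members):
    """Group an ordered list of (name, value) pairs into a struct of arrays.

    Order is preserved within each name; order across different names
    is discarded, which is the B3 reading assumed in this sketch.
    """
    struct = {}
    for name, value in members:
        struct.setdefault(name, []).append(value)
    return struct


# Hypothetical multistruct with a repeated accessor name.
order = [("item", "pen"), ("customer", "ACME"), ("item", "ink")]
grouped = to_struct_of_arrays(order)
```

Here `grouped` holds one array per accessor name, so plain struct
and array rules are enough to serialize it.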

 > > Asir's given me the argument that with the
 > > current syntax (same for B3 and B4) the receiver
 > > can choose to approach the data
 > > differently from how the sender approaches them
 > Slightly different. It is not the receiver nor the sender. It is the
 > application that determines the type of indexing.

If it's the application, assuming both sender and receiver are 
part of the application, they will always handle the data in the 
same way, right? Then the first proposal (of making B3 and B4 
explicitly different just like arrays and structs are explicitly 
different even in the serialization) would be OK for you?

 > > how's that for interoperability?
 > What is the interoperability issue? And, how is it related to SOAP 1.2?

The interop issue was based on the assumption that the receiver
handles the data as B4 whereas the sender handles it as B3, in
which case the sender may get different results for seemingly
(from the sender's POV) equal requests.
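
To make the scenario concrete, here is a minimal Python sketch. It
assumes B3 means "access by name, then by position among same-named
members" and B4 means "access by overall position"; that is an
interpretation of the labels used in this thread, not a definition
taken from the spec. The function names and sample requests are
hypothetical.

```python
def b3_view(members):
    """Sender's view (assumed B3): name first, then order within a name."""
    grouped = {}
    for name, value in members:
        grouped.setdefault(name, []).append(value)
    return grouped


def b4_view(members):
    """Receiver's view (assumed B4): overall position of each named member."""
    return list(members)


# Two serializations that differ only in how differently named
# members are interleaved.
request_1 = [("a", 1), ("b", 2), ("a", 3)]
request_2 = [("a", 1), ("a", 3), ("b", 2)]

same_for_sender = b3_view(request_1) == b3_view(request_2)
same_for_receiver = b4_view(request_1) == b4_view(request_2)
```

Under these assumptions the sender sees the two requests as equal
while the receiver does not, which is the mismatch described above.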

 > Generic compound types are the only way to map RDF bags and sequences.
 > Removing generic compound type severely constrains the SOAP data model.

Why can't RDF bags and sequences be mapped to arrays of structs
or structs of arrays? I'm trying to show here that (and again,
just as with sparse arrays and external references) we're not
losing anything, we're just moving stuff up a layer to where it
belongs.

 > We are not convinced that generic compound type is an issue. webMethods
 > strongly favors the status quo. We have an implementation and have not seen
 > any interoperability issues.

What I'm trying to do by removing generic compound types from the
Data Model and from the Encoding is removing redundancy. So far I
am convinced that multistructs can be naturally represented as
arrays and structs.

 The basic building blocks that we need are
 1) simple types (no problem with this one), 
 2) arrays - because we want to be able to naturally represent a
sequence of values whose size may not be known at design time.
This can be done with a linked list if we only have structs, but
the notion of an array is basic to almost every data model and
the linked list is unnatural for the task (I may elaborate if
asked to).
 3) structs - because we want to be able to represent naturally a
set of different values. This could be done using arrays, too,
for example by naming the struct's members with numbers from 1 to
N, or by having an array of arrays of size two - the first item
is a string with the member's name and the second item is the
value - but again, this is unnatural in XML.
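
The struct-as-arrays emulation from point 3 can be sketched in
Python as follows. This is only an illustration of why the emulation
feels unnatural; the member names and the helper function are
hypothetical.

```python
# A struct emulated as an array of two-item arrays: [member name, value].
emulated = [["street", "Main St"], ["zip", "12345"]]


def get_member(pairs, name):
    """Look up a member by scanning the array; raises KeyError if absent."""
    for key, value in pairs:
        if key == name:
            return value
    raise KeyError(name)


# The same data as a native struct (a dict in this sketch).
native = {"street": "Main St", "zip": "12345"}

zip_from_emulated = get_member(emulated, "zip")
zip_from_native = native["zip"]
```

Both forms carry the same information, but the emulated one needs a
helper and a scan for what the native struct gives directly.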

 It can be argued that the status quo gives a multistruct its most
natural mapping to XML. But this mapping is ambiguous in its
meaning (B3 or B4?), and IMO the uses for multistructs are few
and relatively small, so we can afford the price of forcing the
users to rewrite their applications for arrays of structs or
structs of arrays.

I can't help but mention again the case of external
references and sparse and partial arrays, which were all removed
from SOAP Encoding even though their former form was the more
natural one. They were removed to *increase* simplicity of
implementation and clarity of the handling rules. Although the
resulting XML is a bit uglier (which can be helped by custom
encoding styles, e.g. RDF using their already existing XML
representation), the system is much more maintainable in the long
run.

Oh, and we decided not to add maps and other containers for these 
reasons, too.

Jacek
Received on Wednesday, 20 February 2002 18:38:56 UTC