Re: SOAP Encoding multistructs from Asir S Vedamuthu on 2002-02-20 (xml-dist-app@w3.org from February 2002)

From: Asir S Vedamuthu <asirv@webmethods.com>
Date: Wed, 20 Feb 2002 12:15:54 -0500
To: "Jacek Kopecky" <jacek@systinet.com>, <xml-dist-app@w3.org>
Message-ID: <011501c1ba32$42b16e20$4813030a@webmethods.com>
The name multi-struct is a misnomer. Per the current specs, it is the
generic compound type.

> I claim that B3 together with B4 cover all
> the useful cases of A3 and that there

I don't believe they cover all the useful cases, because a case that looks
'useless' to one application might be useful to another.

>  Both remappings would mean complicating the
> XML serialization but the data would be so much
> more clear about their meaning.

Yes, it will **complicate** the simplicity of the status quo and **disturb**
the existing simple rules for serialization.

> Asir's given me the argument that with the
> current syntax (same for B3 and B4) the receiver
> can choose to approach the data
> differently from how the sender approaches them

Slightly different. It is neither the receiver nor the sender. It is the
application that determines the type of indexing.

> how's that for interoperability?

What is the interoperability issue? And, how is it related to SOAP 1.2?

Generic compound types are the only way to map RDF bags and sequences.
Removing the generic compound type severely constrains the SOAP data model.

We are not convinced that the generic compound type is an issue. webMethods
strongly favors the status quo. We have an implementation and have not seen
any interoperability issues.

Regards,

Asir S Vedamuthu

webMethods, Inc.
703-460-2513 or asirv@webmethods.com
http://www.webmethods.com/

----- Original Message -----
From: "Jacek Kopecky" <jacek@systinet.com>
To: <xml-dist-app@w3.org>
Sent: Wednesday, February 20, 2002 11:38 AM
Subject: SOAP Encoding multistructs


Hi all. 8-)
 This is an issue that was brought up a few times but never
really discussed.
 In SOAP Encoding (and the Data Model) we have the notion of a
compound type which contains some members. The members may be
accessed via their names or ordinal position or both.
 We have three or four ways of accessing the members:
 A) 1) by name, 2) by position, 3) by both;
 B) 1) by name, 2) by position, 3) by name and then by position,
4) by position and then by name.

 I assume B is the case because IMO there are enough differences
between B3 and B4 to make them distinct and I claim that B3
together with B4 cover all the useful cases of A3 and that there
is no overlap between them.  The "useless" cases of A3 not
covered by B3 and B4 are the cases where the application changes
its approach significantly and arbitrarily while processing a
single multistruct.

 In B1, the order of members in the XML serialization is
completely insignificant; only the names carry information.
 In B2, the names of members in the XML serialization are
completely insignificant; only the order carries information.
 In B3, the relative order of two elements with the same name is
significant, while the relative order of two elements with
different names is disregarded; the names are significant.
 In B4, the order of all the elements is significant, and so are
all the names.

 Let's see an example of a serialized multistruct:

   <multistruct>
     <a>1</a>
     <b>2</b>
     <a>3</a>
     <c>4</c>
     <a>5</a>
   </multistruct>

 The difference between B3 and B4 is in the order of choosing by
the position and by the name if a service wants a member with the
name 'a' and position 2.
 B3: choose all 'a's, then choose the second one. Result:
<a>3</a>.
 B4: choose the second member (<b>2</b>) and ensure it's an 'a'.
Here it is not, so the lookup fails. (Actually, a more realistic
scenario would be to get the second member, choose the action
based on its name, then process the value.)
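
 The two lookups above can be sketched in Python. This is only an
illustration under assumptions: the function names lookup_b3 and
lookup_b4 are hypothetical, positions are 1-based as in the text, and
the standard library's xml.etree.ElementTree stands in for whatever a
real SOAP processor would use.

```python
import xml.etree.ElementTree as ET

# The serialized multistruct from the example above.
DOC = """
<multistruct>
  <a>1</a>
  <b>2</b>
  <a>3</a>
  <c>4</c>
  <a>5</a>
</multistruct>
"""

def lookup_b3(root, name, position):
    """B3 (hypothetical helper): select by name first, then take the
    1-based position among the members that carry that name."""
    matches = root.findall(name)
    if 1 <= position <= len(matches):
        return matches[position - 1]
    return None

def lookup_b4(root, name, position):
    """B4 (hypothetical helper): take the 1-based position among all
    members, then require that the member found has the given name."""
    members = list(root)
    if 1 <= position <= len(members) and members[position - 1].tag == name:
        return members[position - 1]
    return None

root = ET.fromstring(DOC)
b3 = lookup_b3(root, "a", 2)
b4 = lookup_b4(root, "a", 2)
print(b3.text if b3 is not None else None)  # prints 3
print(b4)                                   # prints None: member 2 is a 'b'
```

 The same request ("name 'a', position 2") yields a value under one
reading and nothing under the other, although the XML is identical.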

 My opinion is that in B3, the multistruct would be better
described as a structure containing an array of 'a's, an array of
'b's, and an array of 'c's.
 In B4, the natural mapping of the data (on the data-model
level) would be to an array of tuples {name, value}.
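
 As a rough sketch of these two remappings at the data-model level,
the following Python turns the example multistruct into each shape.
The function names are hypothetical, and member values are kept as
plain strings for simplicity.

```python
import xml.etree.ElementTree as ET

def as_struct_of_arrays(root):
    """B3 remapping (hypothetical helper): one array per member name."""
    groups = {}
    for member in root:
        groups.setdefault(member.tag, []).append(member.text)
    return groups

def as_array_of_tuples(root):
    """B4 remapping (hypothetical helper): ordered (name, value) pairs."""
    return [(member.tag, member.text) for member in root]

root = ET.fromstring(
    "<multistruct><a>1</a><b>2</b><a>3</a><c>4</c><a>5</a></multistruct>"
)
print(as_struct_of_arrays(root))
# {'a': ['1', '3', '5'], 'b': ['2'], 'c': ['4']}
print(as_array_of_tuples(root))
# [('a', '1'), ('b', '2'), ('a', '3'), ('c', '4'), ('a', '5')]
```

 Each shape makes explicit which ordering information is meant to
survive: per-name order in the first, full document order in the
second.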

 Both remappings would mean complicating the XML serialization
but the data would be so much more clear about their meaning.
(Ultimately this is the same for representation of sparse arrays
and partially transmitted arrays and references to attachments or
other stuff, since we removed the incomplete arrays and hrefs.)

 Asir's given me the argument that with the current syntax (same
for B3 and B4) the receiver can choose to approach the data
differently from how the sender approaches them. But this is
hairy because if the sender has data modeled as B3, it will view
my first example and the following one as equal and may send
either one arbitrarily, whereas a B4 receiver will see them as
different and will possibly treat them differently, with
different results - how's that for interoperability? It is as if
an application were free to treat a struct as an array, with the
same ordering issues.
 The second example:
   <multistruct>
     <a>1</a>
     <a>3</a>
     <a>5</a>
     <b>2</b>
     <c>4</c>
   </multistruct>
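
 The mismatch can be checked with a small Python sketch that compares
the two examples under each reading. The names b3_view and b4_view
are hypothetical, and equality of the resulting Python values is used
here as a stand-in for "the application sees the same data".

```python
import xml.etree.ElementTree as ET

FIRST = "<multistruct><a>1</a><b>2</b><a>3</a><c>4</c><a>5</a></multistruct>"
SECOND = "<multistruct><a>1</a><a>3</a><a>5</a><b>2</b><c>4</c></multistruct>"

def b3_view(xml_text):
    """B3 reading: order matters only among members sharing a name."""
    view = {}
    for member in ET.fromstring(xml_text):
        view.setdefault(member.tag, []).append(member.text)
    return view

def b4_view(xml_text):
    """B4 reading: the order of all members matters, as do the names."""
    return [(member.tag, member.text) for member in ET.fromstring(xml_text)]

print(b3_view(FIRST) == b3_view(SECOND))  # prints True
print(b4_view(FIRST) == b4_view(SECOND))  # prints False
```

 A B3 sender may therefore emit either serialization, while a B4
receiver distinguishes between them.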

 Here follow my proposals:

 I) remove multistructs completely (keeping structs and arrays
whose combinations can be used to model any B3 and B4
multistructs and arguably any A3 multistructs) and only allow
accessing members either by name or by position, not both,

 II) distinguish between B3 and B4 multistructs just like we
distinguish between structs and arrays, for example by mandating
that B3 multistructs have type descended from enc:Multistruct,
and B4 multistructs have type descended from
enc:DocumentOrderStruct.

 I strongly favor the first proposal; I cannot live with keeping
the status quo (the "none of the above" option).

 There is a precedent: we faced the same two choices in the
sparse arrays debate (distinguish the different treatments or
remove them entirely) and we chose the path of simplification:
removal.

 Best regards,

                   Jacek Kopecky

                   Senior Architect, Systinet (formerly Idoox)
                   http://www.systinet.com/
Received on Wednesday, 20 February 2002 12:18:26 UTC