Re: latest proposal on issues #144 and #161 - array encoding from Alan Kent on 2001-11-20 (xml-dist-app@w3.org from November 2001)

From: Alan Kent <ajk@mds.rmit.edu.au>
Date: Tue, 20 Nov 2001 15:02:02 +1100
To: xml-dist-app@w3.org
Message-ID: <20011120150202.J10991@io.mds.rmit.edu.au>
On Tue, Nov 20, 2001 at 04:03:51AM +0100, Jacek Kopecky wrote:
>  4) added a paragraph to 4.4.2.1 stating that the meaning of
> untransmitted members in partially transmitted arrays is
> application and implementation specific.

I probably sound like a broken record, but I personally feel it's bad
for interoperability for a spec to define a concept and then say
'but it's application specific.' The spec should completely
define what is within the scope of the spec and not specify
anything that is not within the scope of the spec.

The reality is SOAP toolkits are built to handle the SOAP spec.
If the toolkit implements the semantics, then the application
does not get a choice. If the application wants to implement the
semantics, then the toolkit must support 3 states for each slot
in all arrays: null, omitted, or a real value.
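
To make the three states concrete, here is a minimal sketch in Python.
It is only an illustration, not any real toolkit's API: the OMITTED
sentinel and the slot_state() helper are hypothetical names, and None
stands in for an xsi:nil value.

```python
class _Omitted:
    """Sentinel type for an array slot that was never transmitted."""

    def __repr__(self):
        return "OMITTED"


# Hypothetical marker, distinct from None (which stands in for xsi:nil).
OMITTED = _Omitted()


def slot_state(slot):
    """Classify one array slot as 'omitted', 'nil', or 'value'."""
    if slot is OMITTED:
        return "omitted"
    if slot is None:
        return "nil"
    return "value"


# An array as the application would see it if all three states are exposed.
received = [7, None, OMITTED, 0]
states = [slot_state(s) for s in received]
```

Every consumer of such an array has to test for two different "no value"
markers, which is the API messiness complained about here.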

This can be done, but is not efficient and makes the toolkit APIs
very messy for what I claim will give benefit to very few real
life applications. Worse, the spec says a protocol implementation
is allowed to translate an omitted value into null, meaning
any application using that toolkit may fail to interoperate
with another application using a different toolkit.


I personally would like to see p-t-a/sparseness disappear completely.
But that is not going to happen I suspect.


How about at least improving the world and saying something like
'an omitted element in an array is the same as that element having
a null value' and 'it is up to an application to decide how to
treat null values.'

That is, using xsi:nil="1" is identical to omitting
the array value using position/offset. This is much easier to
implement in toolkits (many languages have a null pointer concept).
Toolkits are responsible for handing over a consistent data type
to applications. The interpretation of that data type is application
specific.
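
As a rough sketch of that rule (Python, with a hypothetical
decode_array() helper and (position, value) pairs standing in for the
parsed array members), a toolkit could fill every slot with None first
and then overwrite whatever was actually sent. An omitted slot and an
explicit nil then come out identical:

```python
def decode_array(size, members):
    """Build a dense list from (position, value) pairs.

    A value of None stands in for an element sent with xsi:nil.
    Slots that were never transmitted are left as None as well.
    """
    slots = [None] * size
    for position, value in members:
        if not 0 <= position < size:
            raise IndexError("position %d outside array of size %d"
                             % (position, size))
        slots[position] = value
    return slots


# Slots 1 and 2 omitted on the wire...
sparse = decode_array(4, [(0, "a"), (3, "d")])
# ...versus the same slots sent explicitly as nil.
explicit = decode_array(4, [(0, "a"), (1, None), (2, None), (3, "d")])
```

Both calls hand the application the same list, so what a null means
stays an application decision.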

I do not believe a SOAP toolkit implementation should
be permitted to apply interpretation to a SOAP message.
I think interpretation should be completely an application concept.
I think my above proposed change in direction means there is
a clear definition of what a toolkit implementation should do,
while leaving a degree of interpretation up to the application.

Eg: for the SOAP interop tests, we let all the arrays contain
nulls. The "application" logic is to echo what is received straight
back. Without this, array interoperability testing is (well, if
I send a p-t-a with an omitted value in it, don't check what
comes back, as anything whatsoever is acceptable). Yuck.

Further, XML Schema can define whether it is legal for an element to
be nil or not (the nillable attribute), so I assume it can be defined
using XML Schema whether values in an array can be nil or not. This
gives, for example, a toolkit processing a WSDL file generating
C/C++ the option of using an array of integers or an array of
pointers to integers. If the array type does not allow null
values in the array, then if a value is omitted, it's a fault.
Offset/position values can still be used, but only as long as all
slots in the array are populated.
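
A sketch of that check, again in Python with hypothetical names
(ArrayFault, decode_checked, and a boolean nillable flag that is
assumed to come from the schema):

```python
class ArrayFault(Exception):
    """Hypothetical stand-in for the fault a toolkit would raise."""


def decode_checked(size, members, nillable):
    """Decode (position, value) pairs, enforcing a non-nillable type.

    None stands in for both an explicit nil and an omitted slot.
    """
    slots = [None] * size
    for position, value in members:
        if not 0 <= position < size:
            raise ArrayFault("position %d out of range" % position)
        slots[position] = value
    if not nillable:
        missing = [i for i, v in enumerate(slots) if v is None]
        if missing:
            raise ArrayFault("no value for slots %r" % missing)
    return slots


# Positions may arrive in any order as long as every slot gets a value.
full = decode_checked(3, [(2, 30), (0, 10), (1, 20)], nillable=False)
```

With nillable set to False, leaving out slot 1 raises ArrayFault instead
of quietly handing the application a null.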


If the above is not done, then if you are saying a major reason
for having p-t-a arrays is that the full array may be very large
and you want to reduce the data on the wire, then implementations
should really also worry about the size of the in-memory data
structure representing the array. To provide a consistent API to
applications, this means that all arrays must not be bound
directly to the programming language's native array type - the SOAP
toolkit must implement its own array type allowing omitted values
to be efficiently represented in memory. This is certainly possible,
but very ugly (in my opinion).
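
For illustration only, such a toolkit-specific array type might look
like the following Python sketch. SparseArray and OMITTED are
hypothetical names; storage is a dict keyed by position, so memory
grows with the number of transmitted slots rather than the declared
size.

```python
class _Omitted:
    """Sentinel type for a slot that was never transmitted."""

    def __repr__(self):
        return "OMITTED"


OMITTED = _Omitted()


class SparseArray:
    """Fixed-size array that stores only the slots actually set."""

    def __init__(self, size):
        self._size = size
        self._slots = {}  # position -> value (None stands in for nil)

    def __len__(self):
        return self._size

    def _check(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index %d out of range" % index)

    def __setitem__(self, index, value):
        self._check(index)
        self._slots[index] = value

    def __getitem__(self, index):
        self._check(index)
        return self._slots.get(index, OMITTED)

    def stored_count(self):
        """Number of slots that occupy memory."""
        return len(self._slots)


big = SparseArray(1000000)
big[5] = 42
big[9] = None  # explicit nil, still distinct from OMITTED
```

The application can no longer be handed a plain native list; it has to
go through this wrapper for every array.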


So I strongly put forward that omitted values in arrays (and omitted
parts of responses) be explicitly stated as being the same as nil
in the SOAP standard. It is *purely* an application concept what
the interpretation of a nil is (just as it's an application concept
what any other valid value means). But as soon as you say
"here is how you encode something which we explicitly do not define
what it means" then that concept is by definition not interoperable
within the spec. You must come to agreements outside of the spec.
So a secondary specification or agreement is needed in order
to define interoperability. Yuck.

With the current proposal I think the only satisfactory implementation
of arrays is to support three states per array element (omitted,
nil, or contains a valid value). If most protocol implementations
treat omitted values as null, then applications get no benefit
from the 'omitted' concept so carefully introduced. Since this
is already the norm, I vote 'omitted values are identical to
null values'.

Ok, I am starting to rave. I will stop.

Alan
Received on Monday, 19 November 2001 23:02:35 UTC