RE: updated proposal on issue #144 - array metadata in SOAP Encoding

>>From: Jacek Kopecky [mailto:jacek@systinet.com]
>> I think the issue here is the true meaning of the term "sparse
>>array".
>> In math we have the term "sparse matrix" meaning "a matrix
>>containing mostly zeros". Mostly is a very vague word, I am not
>>aware of any stricter quantification of how many zeros must a
>>matrix contain to be considered sparse.
>>The same with SOAP arrays - how many members can be present for
>>the array to be still considered a sparse array?

>>> [Murali] 

  I don't think we need to worry about the degree of sparseness. A
specification is only expected to provide a means to do something, if that
something is identified to be valid.

  As a tools provider, I don't really make use of sparse arrays. However, I
have made an attempt here to try and provide a couple of use cases for
sparse arrays.

Use case 1:

  I think sparse arrays, as a separate type or as a data structure, are used
mostly in areas such as spatial data, sonar, radar and ultrasound imaging.
They use sparse arrays because, as you stated, most of the values are known
or constant (0), or fall below or above some threshold.

  In these cases, the user knows for sure it is a sparse array and they
benefit from using a specific implementation. They typically use separate
data structures and there are many algorithms and papers on how they can
optimize here. It seems to me that these use cases would benefit from
encoding an array as a sparse array, since a sender of such an array would
expect the receiver to take advantage of the optimization that comes with
it.

  It also seems to make sense in this case to make sparse array a separate
type.
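
  To make the optimization concrete, here is a minimal Python sketch of the
kind of separate data structure I have in mind: only the elements that differ
from a default value are stored, keyed by position. The class name
SparseArray and its methods are made up for illustration; they are not part
of SOAP or of any toolkit.

```python
class SparseArray:
    """Hypothetical sparse array: stores only elements that differ from a default."""

    def __init__(self, size, default=0):
        self.size = size
        self.default = default
        self._items = {}  # position -> value, only for non-default elements

    def _check(self, position):
        if not 0 <= position < self.size:
            raise IndexError(position)

    def __setitem__(self, position, value):
        self._check(position)
        if value == self.default:
            # Writing the default value frees the slot instead of storing it.
            self._items.pop(position, None)
        else:
            self._items[position] = value

    def __getitem__(self, position):
        self._check(position)
        return self._items.get(position, self.default)

    def stored(self):
        """Return the (position, value) pairs actually held, in position order."""
        return sorted(self._items.items())
```

  A receiver that is told the array is sparse could choose a structure like
this up front instead of allocating the full array.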

Use case 2:

  It seems that there are cases where a sender holds a regular array but
encodes only a few elements of it (not contiguous elements, otherwise it
becomes partial), for whatever reason; the reason doesn't matter here.
  
  It also seems from some of the discussions on this list that there are
cases where a sender doesn't know beforehand that the resulting
representation of the array is going to be sparse (Rich - is this your use
case?)

  A possible example could be a monitoring system that is monitoring an
array of sensors and sending out different sensor values only if they are
above some threshold.

  In a situation like this, where the input feed is dynamic and the output
is streamed, it is not possible to stamp the array as a sparse array before
starting to encode. Note again that for this to be true the output must be
streamed, meaning it has left the sender before the full array is encoded.

  In this case a separate type for sparse arrays complicates matters, and so
does an attribute that labels the array as a sparse array.
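
  A rough Python sketch of this situation follows. The function name
stream_array, the element names and the threshold are all made up for
illustration, and the enc:position attribute simply imitates the SOAP 1.1
position style (namespace declarations are omitted). The point is only that
the opening tag leaves the sender before any reading has been seen.

```python
def stream_array(readings, threshold):
    """Yield XML fragments one at a time, as a streaming sender would.

    `readings` is any iterable of sensor values. Element and attribute
    names here are illustrative, not taken from the SOAP specification.
    """
    # A sparseness stamp would have to be written here, on the opening
    # tag, before the sender knows how many readings will be emitted.
    yield "<sensors>"
    for position, value in enumerate(readings):
        if value > threshold:
            # Each emitted element carries its own position.
            yield '<item enc:position="[%d]">%s</item>' % (position, value)
    yield "</sensors>"
```

  Whether the result turns out to be sparse depends entirely on how many
readings cross the threshold, which is known only after the closing tag.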

Discussion:

*  It seems to me that use case 1 is a valid case and users will benefit
from saying "enc:isSparse=Yes".

*  Use case 2 seems to indicate a new state, "enc:isSparse=maybe", for an
array that may or may not turn out to be sparse. Some of the users falling
under this use case will probably still be able to say a clean
"enc:isSparse=Yes".

*  I don't think there will be any arguments for a case where the answer is
"enc:isSparse=No" which clearly indicates a regular array representation.

   I draw the following conclusions from the above.

* There is a valid need for stamping an array as a sparse array.

* Sparse array as a separate type for encoding has its own issues.

Proposal:

* SOAP 1.1 encoding rules for sparse arrays (where position should be
present for all elements) seem to work fine for both cases.

* The recent proposal by Jacek has advantages for some use cases. If we are
going to adopt it, then it seems some of the user community would be better
served by

     1, either adding a new attribute, enc:isSparse or enc:encodedSparse (I
prefer the latter name, though I am not particular about it), with 3 states
(yes, no, maybe),

     2, or making the enc:encodedSparse attribute optional with a boolean
state (true/false). The absence of the attribute would indicate a 'maybe'
state.

I prefer the second as it leaves the burden to the user who really cares
about sparse arrays or regular arrays (penalize only those that require it).
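
  As a sketch of how a receiver might read the second option, assuming the
attribute is called enc:encodedSparse and takes xsd:boolean values, something
like the following Python would do. The namespace URI below is a placeholder,
not the real SOAP encoding namespace, and the function name sparse_hint is
made up.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace for illustration only; not the real SOAP encoding URI.
ENC_NS = "http://example.org/enc"


def sparse_hint(array_element):
    """Return True, False, or None ('maybe') from the optional attribute."""
    raw = array_element.get("{%s}encodedSparse" % ENC_NS)
    if raw is None:
        return None  # attribute absent: the sender makes no claim
    if raw in ("true", "1"):  # lexical forms of xsd:boolean
        return True
    if raw in ("false", "0"):
        return False
    raise ValueError("not a boolean: %r" % raw)
```

  A receiver that does not care about sparseness never calls this, and a
sender that does not care never writes the attribute.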

<<< [Murali]

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

-----Original Message-----
From: Jacek Kopecky [mailto:jacek@systinet.com]
Sent: Tuesday, November 13, 2001 3:44 AM
To: Murali Janakiraman
Cc: xml-dist-app@w3.org
Subject: RE: updated proposal on issue #144 - array metadata in SOAP
Encoding


 Murali, but also Alan,
 I think the issue here is the true meaning of the term "sparse
array".
 In math we have the term "sparse matrix" meaning "a matrix
containing mostly zeros". Mostly is a very vague word, I am not
aware of any stricter quantification of how many zeros must a
matrix contain to be considered sparse.
 The same with SOAP arrays - how many members can be present for
the array to be still considered a sparse array?
 In my proposal the term "sparse array" is only used once and it
points to the section about "partially transmitted arrays".
 The problem is that one application can consider an array sparse and
another application can treat the same array as just an array. If
we add the attribute isSparse (or any other suitable name), what
do we communicate on the wire? It is a hint to some
implementations. It's as if we wanted to add an attribute on
transmitted integers saying that they are in fact in the range -128 to
127 and therefore fit into a common byte:
 <a xsi:type="xsd:integer" enc:isByte="true">42</a>
 This comparison leads me to think that we might add a new
type, SparseArray, being a restriction of the type Array with
just the annotation "these arrays are sparse", just like we have
xsd:byte type.
 But how do we define a sparse array? If anybody comes up with a
definition suitable for the spec, I think we could add the type.
 What d'ya think? 8-)

                   Jacek Kopecky

                   Senior Architect, Systinet (formerly Idoox)
                   http://www.systinet.com/



On Mon, 12 Nov 2001, Murali Janakiraman wrote:

 > Hi Jacek,
 >
 >   My comments are in >>>Murali<<<
 >
 > >>Jacek
 > >>in my opinion the entity that knows best whether arrays are or are
 > >>not sparse is the application. The serializer cannot reliably
 > >>guess that an array it's serializing is sparse unless it scans through
 > >>a significant portion of the array, which may be very inefficient.
 >
 > >>> Murali
 >
 >   I am sorry, I disagree. I don't see any reason why any guessing (as to
 > what is being sent on the wire) has to be done on the sender side unless
 > the sender side consists of improperly designed disjoint pieces.
 >
 >   First, let me state here that, IMO, your distinction of serializer and
 > application is very implementation centric. From the point of view of
 > SOAP there is no notion of serializer, there is just the SOAP defined wire
 > format for SOAP defined data model and there is a sender who is sending an
 > array (or whatever) under SOAP defined encoding rules.
 >
 >   The sender (irrespective of the number of layers it is built up of -
 > serializer, application or whatever) knows what it is sending and that is
 > true for sparse arrays as well (in your layered approach, if an
 > application can't communicate this to its serializer it indicates an issue
 > with the design of the serializer and it has got nothing to do with SOAP
 > and its encoding).
 >
 >   On the other hand, if the wire format of sparse arrays does not
 > unambiguously identify it as a sparse array, then the receiver has to do
 > the guess work (of course, only if the receiver wants to treat the sparse
 > array differently)
 >
 >   So, the question seems to me: why doesn't SOAP, which seems to support
 > sparse arrays as a first class citizen, allow the sender to say so in a
 > SOAP defined way, so that the receiver can appropriately deal with it?
 >
 > <<<Murali

Received on Tuesday, 13 November 2001 15:08:07 UTC