LC-150: XML Schema Language Last Call Issue Response: Array Sizes

                                        October 13, 2000

Dear Ms. Hunter,


The XML Schema Working Group has spent the last several months
working through the comments received from the public on the last-call
draft of the XML Schema specification.  We thank you for the comments
you made on our specification during our last-call comment period, and
want to make sure you know that all comments received during the
last-call comment period have been recorded in our last-call issues
list (http://www.w3.org/2000/05/12-xmlschema-lcissues).

        I am writing on behalf of the XML Schema WG
regarding your last-call comments on the issue of array
specifications in XML Schema.  This issue is known to the XML Schema
WG as LC-150. array-size: Allow specification of size constraints
in instance?

        See also the discussion of LC-84: Arrays
(http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0011.html),
part of which is recapitulated and elaborated here.
That message also begins a thread which includes Henry Thompson's
detailed exposition of the Schema WG's position on why
markup rather than whitespace delimiters is most appropriate for
array encoding.
See:
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0015.html


        Please read the following discussion of the issue and
the Schema WG response. At the close of this email you will find
instructions on how to respond, indicating whether the Schema WG
response is satisfactory to you.  The response will be forwarded
to the W3C executive director as part of the process for taking
the Schema spec to Candidate Recommendation.



LC-150. array-size: Allow specification of size constraints in instance?
-------------------------------------------------------------------------

Issue Class: D Locus: both Cluster: 15 arrays Status: resolved
Assigned to: Frank Olken Originator: Jane Hunter (MPEG-7)

Description
-----------

Should XML Schema be modified to allow the specification of size
constraints (e.g. for a series of elements representing an array, or for
a list of tokens representing an array)? If so,
should the ability to specify size constraints in the instance always be
available, or be available only if the schema author calls for it?

Interactions and Input
----------------------

Cf. X3D-related comments on Schema datatypes

Input from Jane Hunter:

3. Parameterization of Array and Matrix Sizes
We would like to define the size of lists (arrays and matrices) 
at the time of instantiation. In the example below we suggest 
a valuePar construct which gives the name of the
attribute whose value will be used for the facet. 
The attribute data type must match the facet data type. 
This example is currently problematic because facets only apply to
simple types and attributes can only be added to complex types. 

The other issue is whether VectorI is being restricted or extended?

<simpleType name="listOfInteger" base="integer" derivedBy="list">

<complexType name="VectorI" base="listOfInteger" derivedBy="extension">
<length valuePar="Size"/>
<attribute name="Size" type="nonNegativeInteger" use="required" />
</complexType>


Input from Don Brutzman <brutzman@nps.navy.mil>
(to the XML Schema Comments list, Fri, 12 May 2000 20:35:59 -0700):

2. X3D lists of floats/doubles/integers are often lists of 
2-tuples, 3-tuples or 4-tuples. Such typing is commonplace for 
3D graphics (e.g. translations are 3-tuples, orientations
are axis-angle 4-tuples). Regular-expression patterns will let us 
express these relationships (hopefully without redefining the 
numeric base types, not yet sure). No draft
schema appears in the current SVG draft - pertinent examples are
welcome. 
A helpful facet might be to specify the tuple-ordinality of a list type, 
so that only appropriate multiples of the typed data are allowed. 
Please be aware that wrapping such X3D
tuples in their own type tags has been considered, but is impractical
due to unnecessary redundancy and the extremely large volumes of numeric
data involved in many scenes.


Resolution
-----------

XML Schema Language V 1.0 will provide:

        1) no array or vector data types.
        2) no support for indicating the size of "lists".
        3) no support for "lists" (simple types) of k-tuples.

Item 2 was discussed in a conference call of 2000-06-29.
It was viewed as a corollary of the basic decision to reject
special constructions for arrays. 

Several versions of the array proposals were considered and rejected
(for quite different reasons).  Specifically, we separately considered
extending "lists" (a simple datatype), and "array type constructors" 
(a complex datatype).

Concerning "lists of lists" or "lists of non-atomic datatypes",
the Schema WG firmly decided that it did not want to go in that
direction at all.  "Lists" were included in XML Schema as a minimal
generalization of legacy constructions for NMTOKENS and IDREFS, etc.
The general view of the WG was that simple datatypes (suitable for
describing attributes) should ideally be restricted to atomic values.
More complex constructions (lists of lists, lists of tuples or vectors)
should be constructed as "complex datatypes", i.e., using nested
element markup constructions in XML.  This topic was discussed
in the Schema WG under the guise of Last Call Issue 102 (LC-102)
Microparsing Support, which was discussed (and rejected) at the
Edinburgh Face-to-Face meeting.   Note that the Schema WG is well
aware (as some commentators suggest) that "complex datatypes"
for arrays are likely to be somewhat more lengthy encodings, but
this was not viewed as a compelling argument against them.

        In this context, "maximum size" specifications for lists were 
seen as another step down the slippery slope of elaboration of lists 
and (as I understand it) rejected (for lists) largely on these grounds.  
Elaboration of lists has been repeatedly discussed in the 
Schema WG under various guises and it appears that the Schema WG is 
quite firm on this decision.

The Schema WG position on array type constructors as complex
datatypes was more moderate.  The Schema WG was not convinced
that such a constructor should be added to Version 1.0 of
XML Schema.  The rationale was that the WG was not convinced
that the additional complexity was necessary, since conventional XML
markup facilities could specify the dimensions, and the
array content could be a sequence of <arrayElement>'s (possibly
containing nested <array>'s).  This decision should be seen in
light of the numerous comments which have been made to the
Schema WG that the XML Schema Language is already too baroque.
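
To make the "conventional XML markup" alternative concrete, here is a
minimal sketch in Python.  The element and attribute names (array, dim,
arrayElement) are hypothetical, chosen only to echo the wording above;
XML Schema defines none of them, and the size check shown is something
an application would do itself, not something a schema validator does.

```python
import xml.etree.ElementTree as ET
from math import prod

# Hypothetical instance: a 2 x 3 array whose dimensions are carried in an
# ordinary attribute and whose cells are ordinary child elements.
DOC = """<array dim="2 3">
  <arrayElement>1</arrayElement>
  <arrayElement>2</arrayElement>
  <arrayElement>3</arrayElement>
  <arrayElement>4</arrayElement>
  <arrayElement>5</arrayElement>
  <arrayElement>6</arrayElement>
</array>"""

def read_array(xml_text):
    """Return (dims, values); raise ValueError if the cell count is wrong."""
    root = ET.fromstring(xml_text)
    dims = [int(d) for d in root.get("dim").split()]
    values = [int(e.text) for e in root.findall("arrayElement")]
    if len(values) != prod(dims):
        raise ValueError(f"expected {prod(dims)} cells, found {len(values)}")
    return dims, values

dims, values = read_array(DOC)
```

The cost of this encoding is verbosity, which is the objection raised
in the X3D comments quoted earlier.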

Once again the issue of maximum size of arrays was rejected 
as a corollary of declining to support arrays.

Some of the Schema WG members argued unsuccessfully that such an 
approach failed to adequately convey the array semantics in a 
standardized fashion.  Also, standardized array syntax would 
facilitate query language operators specific to arrays, e.g., 
operators to extract rows, columns or other subarrays.  
However, the XML query language WG has not expressed such concerns.
It is possible that the Schema WG might be persuaded
to revisit this aspect of the issue in later versions of 
XML Schema (see discussion below concerning XML Protocol Work Group).

Finally, there was agreement that the Schema WG (and other interested
parties) should convene a task force to consider the development
of common complex datatypes - e.g., perhaps complex numbers, arrays,
etc.

Additional Considerations:

        Concerning the array size proposal, two additional reasons for
rejecting the proposal were given by some Schema WG members.

        1)  A reluctance to permit instance-level declaration of
            "new" types  (i.e., specific array sizes).

        2)  A reluctance to support co-occurrence constraints,
            i.e., if there is a size constraint, then the array
            length must match.

Note that most of the Schema WG members who rejected the array
size proposal for these reasons did so to limit the complexity
of Version 1.0 of the Schema Language; they were not necessarily
permanently opposed to such developments.
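
As an illustration of the co-occurrence constraint in item 2, the
following Python sketch performs the check at the application level.
It assumes an instance shaped like Jane Hunter's VectorI example, with
a Size attribute and whitespace-separated content; the function name
size_matches is hypothetical.

```python
import xml.etree.ElementTree as ET

def size_matches(xml_text):
    """True if the declared Size equals the number of list items."""
    elem = ET.fromstring(xml_text)
    declared = int(elem.get("Size"))
    # The list content is a whitespace-separated run of tokens.
    items = (elem.text or "").split()
    return declared == len(items)

ok = size_matches('<VectorI Size="3">1 2 3</VectorI>')
bad = size_matches('<VectorI Size="4">1 2 3</VectorI>')
```

The point of the sketch is that the validity of the content depends on
the value of another item in the same instance, which is exactly the
kind of dependency the WG was reluctant to support.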



To summarize, the position of the Schema WG is:

        1) arrays as simple datatypes - not now, not ever
        2) arrays as complex datatype constructors - not now, maybe later
        3) array sizes for simple types - not now, possibly not ever
        4) array sizes for complex array types - not now, maybe later
        5) simple list datatypes of tuples - not now, not ever


Subsequent to the decisions of the Schema WG, a new XML Protocol
WG (URL: http://www.w3.org/2000/xp/) has been chartered by the W3C.  
It will meet later this month.  David Fallside (IBM) (email:
fallside@us.ibm.com) is the chair of the new WG. He has stated
that this WG will likely take up the issue of specifying arrays, because
this is needed by RPC protocols (e.g., SOAP) which permit the 
transmission of arrays.  See the W3C Note on SOAP
(URL: http://www.w3.org/TR/SOAP/), Section 5.4.2 on Arrays.
Arrays would thus initially emerge in the
XML Protocol Requirements Document.  Hopefully, the Protocol WG array
efforts will be coordinated with the Schema WG, e.g., perhaps as
part of Schema Version 1.1.

Additional Points
-----------------

Jane Hunter asks whether adding an additional element for 
array size should be seen as a restriction or extension.
I believe that in the current design, it would be viewed
as an extension, as the size constraint is additional information.
The size constraint would not be enforced by a validator.

Apropos Don Brutzman's remarks - see Henry Thompson's response
cited above on the merits of markup vs. whitespace separation
(http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0015.html).

Brutzman's proposal to use regular expressions to express
structured array datatypes encoded as attribute values
can probably be made to work.  Most of the members of the
Schema WG would view such a development with unalloyed horror,
as an inappropriate use of regular expressions.
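
For illustration only, a pattern of the kind Brutzman describes might
look like the following sketch.  It uses Python regular-expression
syntax rather than the XML Schema pattern facet, and the names NUM,
tuple_list_pattern and is_tuple_list are hypothetical.

```python
import re

# One decimal number, optionally signed, with an optional exponent.
NUM = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"

def tuple_list_pattern(k):
    """Compile a pattern for zero or more whitespace-separated k-tuples."""
    one_tuple = NUM + (r"\s+" + NUM) * (k - 1)
    return re.compile(rf"\s*(?:{one_tuple}(?:\s+{one_tuple})*)?\s*")

def is_tuple_list(text, k):
    """True if text holds a whole number of k-tuples of numbers."""
    return tuple_list_pattern(k).fullmatch(text) is not None

two_triples = is_tuple_list("0 0 1  1.5 -2 3e2", 3)
ragged = is_tuple_list("0 0 1 1.5", 3)
```

Even in this small case the pattern restates the numeric lexical form
and repeats it k times, which gives some sense of why the approach was
regarded as a misuse of regular expressions.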

I reiterate Henry's remarks that failure to mark up individual
array elements makes it very difficult to process individual
array elements with XSLT or to reference them via XPath.
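
A small sketch of the difference, using the limited XPath subset in
Python's standard xml.etree.ElementTree module.  Both instance shapes
(the arrayElement children and the values attribute) are hypothetical
examples, not constructs defined by XML Schema.

```python
import xml.etree.ElementTree as ET

# Marked-up encoding: each array element is its own XML element.
marked_up = ET.fromstring(
    "<array>"
    "<arrayElement>10</arrayElement>"
    "<arrayElement>20</arrayElement>"
    "<arrayElement>30</arrayElement>"
    "</array>"
)
# Whitespace-delimited encoding: the whole array is one attribute value.
flat = ET.fromstring('<array values="10 20 30"/>')

# The second element is directly addressable with a path expression.
second = marked_up.find("arrayElement[2]").text

# With the flat encoding a path can only reach the attribute as a whole;
# picking out one item requires application code to split the string.
second_flat = flat.get("values").split()[1]
```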


Is this response adequate?
--------------------------

The XML Schema Working Group wants to know your opinion
of our response to your last call comments.  This information
will be included with the package submitted to the W3C
Executive Director as part of the recommendation to take
the XML Schema Language to Candidate Recommendation.
We would appreciate your response as soon as possible.

Please choose one of the following responses, adding
whatever details or explanation you wish:

1)  "GOOD ENOUGH"  - You are satisfied with the Schema WG response
to your comments on XML Schema Language.  The response meets 
your requirements.  The matter may be considered resolved.

2) "STOP THE PRESSES"  - You are not happy with the response
to your comments on XML Schema Language.  The response
is either unclear or inadequate.  The issue is of sufficient importance
and urgency that you want it called to the attention of the
W3C Executive Director and you ask that the XML Schema Language
be delayed in advancing to Candidate Recommendation until the
issue is resolved.

3)  "LATER - VERSION 1.1"  - You are not happy with the response,
but are prepared to defer reconsideration until XML Schema Language
Version 1.1 is drafted.  It is anticipated (hoped) that Version 1.1
will be completed by mid-2001.  Version 1.1 is intended primarily
to fix small issues needed by other W3C Working Groups to proceed 
with their work (especially XML Query Language).  You request that
your comments be reconsidered when drafting the Version 1.1 
requirements document.

4) "LATER - VERSION 2.0"  - You are not happy with the response,
but are prepared to defer consideration until XML Schema Language
Version 2.0 is drafted.  It is anticipated that Version 2.0 would
not be completed until late 2001 or early 2002.  Version 2.0 may
include major revisions, e.g., multiple inheritance, etc.
You request that your comments be reconsidered when drafting the 
Version 2.0 requirements document.

5) "NO LONGER CARE"  - You are not happy with the response, but
no longer care to pursue the matter, because ....



                  Frank Olken
                  XML Schema Language Working Group

  Lawrence Berkeley National Laboratory   (510) 486-5891 (voice)
  Mailstop 50B-3238                       (510) 486-4004 (fax)
  1 Cyclotron Road                        (510) 843-5145 (home)
  Berkeley, CA 94720, USA                 (510) 442-7361 (pager)

  E-mail:  olken@lbl.gov
  WWW:     http://www.lbl.gov/~olken/

Received on Friday, 13 October 2000 14:47:08 UTC