Re: [x3d-contributors] Arrays in XML Schema - Last Call Issue LC-84 - Schema WG response

Summary:  lack of schema support for complex-datatype arrays is a problem 
for VRML.  The X3D working group will attempt a regular-expressions 
workaround and report back.

Frank, thanks for your group's response.  Joe has correctly
flagged a key concern for our group:

Joe D Williams wrote:
> 
> > To summarize the position of the Schema WG is:
> >
> >         1) arrays as simple datatypes - not now, not ever.
> >         2) arrays as complex datatype constructors - not now
> 
> Does this message mean that some of these
> following data types from the spec cannot be
> supported in a 'legal' XML encoding supported
> by 'legal' schema?
> 
> Encoding of fields
> 
> PART 3               PART4
> SF/MFBool          SFBool

etc.

For anyone unfamiliar with these VRML 97 datatypes, please refer to
http://www.web3D.org/technicalinfo/specifications/vrml97/part1/fieldsRef.html

which defines 2-tuples and 3-tuples etc. of integers and floats, as well
as arrays of these 2-tuples and 3-tuples, all as native datatypes.
 
[The interspersed comments that follow are my opinions. 
 Next-step plans follow at the end.]

> ----- Original Message -----
> From: "Frank Olken" <olken@lbl.gov>
> To: "Jane Hunter" <jane@dstc.edu.au>; "Robert Miller"
> <Robert.Miller@gxs.ge.com>; "Don Brutzman" <brutzman@nps.navy.mil>
> Cc: "mpeg7-ddl" <" mpeg7-ddl"@darmstadt.gmd.de>;
> <www-xml-schema-comments@w3.org>; <w3c-xml-schema-ig@w3.org>; "X3D
> Contributors" <x3d-contributors@web3d.org>
> Sent: October 04, 2000 2:04 PM
> Subject: [x3d-contributors] Arrays in XML Schema - Last Call Issue LC-84 -
> Schema WG response
[...]
> > The Schema WG position on array type constructors as complex
> > datatypes was more moderate.  The Schema WG was not convinced
> > that such a constructor should be added to Version 1.0 of the
> > XML Schema.  The rationale was that the WG was not convinced
> > that the additional complexity was necessary, since conventional XML
> > markup facilities could specify the dimensions, and the
> > array content could be a sequence of <arrayElement>'s (possibly
> > containing nested <array>'s. This decision should be seen in
> > light of a variety of numerous comments which have been made to the
> > Schema
> > WG that the XML Schema Language  is already too baroque.

The rationale is clear, but arrays are such a common construct throughout
all computer languages and data representations that some solution ought 
to be devised.

Perhaps the concern about baroqueness was a reaction to considering the
nth case (or n to the nth case) of embedded arrays.  In practice, most
arrays are not arbitrarily complex, and plenty of compilers and
interpreters parse even complicated arrays of complex types just fine.

> > Some of the Schema WG members argued unsuccessfully that such an
> > approach failed to adequately convey the array semantics in a
> > standardized fashion.  

Such an argument nevertheless appears sound.

> > Also, standardized array syntax would
> > facilitate query language operators specific to arrays, e.g.,
> > operators to extract rows, columns or other subarrays.
> > However, the XML query language WG has not expressed such concerns.

Well, perhaps this isn't clear, but a lack of expressed concern by the
query-language folks regarding an unspecified (non)capability for arrays
would seem to be the expected (non)response.

> > It is conceivable that the Schema WG might be persuaded
> > to revisit this aspect of the issue in later versions of
> > XML Schema (see discussion below concerning XML Protocol Work Group).

Revisit/revise sounds like a smart course of action.

> > Minor points:
> >
> >         As noted the WG generally frowns on compound simple datatypes,
> > hence:
> >
> >         <dimensions>
> >                 <dim> 2 </dim>
> >                 <dim> 4 </dim>
> >         </dimensions>
> >
> > would be preferred to the syntax you suggested in your comment:
> >
> >         <dim> 2 4 </dim>
> >
> >         Similarly, proposals to flatten arrays would be discouraged
> > because they implicitly specify markup (structure).

This type of stylistic emphasis presumes a great deal about the type of
data being represented, and perhaps is one reason why a fundamental
structure like arrays is not yet fully supported in schema attributes.

Deciding when to use tags and when to use attributes in tagset design
involves multiple simultaneous tradeoffs.  Letting this single case (i.e.
specifying arrays for base datatypes) always drive every schema/DTD 
towards tags alone preempts all the other design priorities, and thus
is not a balanced solution.

The basic disconnect for us is that flattened arrays are essential when
using huge numbers of floats.  This often occurs in VRML and in some
other formats.  Recently I generated some terrain whose triangle
vertices took up several megabytes in a single attribute.  There is 
significant authoring benefit to flagging an error when 300,001 values 
are contained in the attribute, rather than 300,000 or 300,003
(each of which is a whole number of 3-tuples).
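
A minimal sketch of the kind of count check meant here, written in Python
rather than in schema.  The function names are hypothetical, and it assumes
values are separated by whitespace and/or commas, as in VRML:

```python
import re

def split_values(attribute_text):
    """Split a flattened numeric attribute on whitespace and/or commas."""
    return [tok for tok in re.split(r"[\s,]+", attribute_text.strip()) if tok]

def check_tuple_count(attribute_text, tuple_size=3):
    """Return (ok, count); ok is True when count is a whole number of tuples."""
    values = split_values(attribute_text)
    for value in values:
        float(value)  # raises ValueError if a token is not a number
    count = len(values)
    return count % tuple_size == 0, count

# Hypothetical usage: nine values form three complete 3-tuples.
ok, count = check_tuple_count("0 0 0, 1 0 0, 1 1 0")
```

A validating tool could report the count alongside the failure, so an
author can see immediately that a value was dropped or duplicated.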

I wonder whether it would be good to capture this sentiment as a design
principle:  schema constraints on attribute definitions need to have 
power equivalent to schema constraints on tag definitions.

I also think that the schema recommendation ought to explicitly
list all cases where attribute power doesn't equal tag power.
Again, such limitations are fundamental to schema design, so
designers should not have to painfully deduce or discover such
mismatches.  Does such a list of mismatches exist?

> > Thus detailed mark up syntax:
> >
> >         <array>
> >                 <arrayElement> 1.0 </arrayElement>
> >                 <arrayElement> 2.0 </arrayElement>
> >                 <arrayElement> 3.0 </arrayElement>
> >                 <arrayElement> 4.0 </arrayElement>
> >         </array>
> >
> > would be preferred to the flattened syntax you suggested
> > in your comment:
> >
> >         <array> 1.0 2.0 3.0 4.0 </array>

But for large volumes of numeric data, the detailed approach isn't really 
feasible.  The role of compression is understood, but compression isn't a
complete solution since text editing and text searching must also remain
feasible.
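
As a rough illustration of the size difference, here is a small Python
sketch that builds both forms for the four-value example quoted above.
The helper names are hypothetical, and the comparison simply counts
characters of uncompressed text:

```python
def detailed_markup(values):
    """One <arrayElement> per value, as in the detailed syntax."""
    items = "".join(f"<arrayElement>{v}</arrayElement>" for v in values)
    return f"<array>{items}</array>"

def flattened_markup(values):
    """All values in one space-separated text node."""
    return "<array>" + " ".join(str(v) for v in values) + "</array>"

values = [1.0, 2.0, 3.0, 4.0]
detailed_size = len(detailed_markup(values))
flattened_size = len(flattened_markup(values))
```

Each value in the detailed form carries 29 characters of tag overhead
(the open and close arrayElement tags), against a single separator
character in the flattened form.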

> > The flattened syntax is similar to the array syntax of XSIL.
> > Observe that the detailed mark up syntax is easier to extend
> > to nested arrays.  It is also easier to process in XSLT.
> >
> > [Note that these points represent F. Olken's
> > interpretation of the sentiment of the Schema WG. ]
> >
> >
> > To summarize the position of the Schema WG is:
> >
> >         1) arrays as simple datatypes - not now, not ever.
> >         2) arrays as complex datatype constructors - not now

Support for choice 2) is definitely desirable for VRML.

> > Subsequent to the decisions of the Schema WG, a new XML Protocol
> > WG (URL: http://www.w3.org/2000/xp/) has been chartered by the W3C.
> > It will meet later this month.  David Fallside (IBM) (email:
> > fallside@us.ibm.com) is the chair of the new WG. He has stated
> > that this WG will likely take up the issue of specifying arrays, because
> > this is needed by RPC protocols (e.g., SOAP) which permit the
> > transmission of arrays.  See the W3C note on SOAP
> > (URL: http://www.w3.org/TR/SOAP/ ) Section 5.4.2. on Arrays.
> > Arrays would thus initially emerge in the
> > XML Protocols Requirements Document.  Hopefully, the Protocol WG array
> > efforts will be coordinated with the Schema WG, e.g, perhaps as
> > part of Schema Version 1.1.

That is encouraging and important, thanks.  I've taken the liberty of
adding him as a cc: on this reply.

> > Is this response adequate ?
> > ------------------------------

The response per se is very helpful, thank you.
The proposed result is not very helpful, unfortunately.
[After all this fantastic work on schemas, it seems pretty amazing 
 (to me, anyway) that we're still trying to figure out arrays!  Oh well.]

A big possibility has not been mentioned at all here, though.  Since there
is schema support for regular expressions, I expect that regexes might solve
the issue of providing datatype checking for the 2-tuples and 3-tuples
(and actually even 4-tuples for orientation) that we need.  Regular
expressions could also preserve some of the syntactic sugar peculiar
to VRML, such as treating commas as whitespace inside big arrays as a 
readability assist.
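
A minimal sketch of the idea, using Python's `re` dialect rather than the
schema pattern dialect, so it would need adapting before use in a schema.
The names and the exact float grammar are assumptions for illustration,
not taken from the VRML or schema specifications:

```python
import re

# One float: optional sign, digits with optional fraction (or a
# leading-dot fraction), optional exponent.  Assumed grammar.
FLOAT = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"
# VRML treats commas as whitespace, so either may separate values.
SEP = r"[\s,]+"

def tuple_array_pattern(n):
    """Compile a pattern matching zero or more n-tuples of floats."""
    one_tuple = FLOAT + (SEP + FLOAT) * (n - 1)
    body = one_tuple + "(?:" + SEP + one_tuple + ")*"
    return re.compile(r"[\s,]*(?:" + body + r")?[\s,]*")

def is_tuple_array(text, n):
    """True when text holds a whole number of n-tuples and nothing else."""
    return tuple_array_pattern(n).fullmatch(text) is not None

# Hypothetical usage: an array of 3-tuples and a single 4-tuple orientation.
vertices_ok = is_tuple_array("0 0 0, 1 0 0, 1 1 0", 3)
orientation_ok = is_tuple_array("0 1 0 1.57", 4)
```

Because the tuple size is built into the pattern, a stray extra value
fails the match instead of silently shifting every later tuple.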

I expect we'll have a draft X3D schema sometime this month to test this.
Whether or not emerging software tools for schema are robust enough
to handle regular expressions satisfactorily remains an unknown quantity.

If there is any other effort trying to implement support for
arrays of simple types (e.g. an array of 3-tuple floats) via regexes
or other mechanisms, please let us know.  

If such an approach does prove to be feasible, I recommend that you
add examples demonstrating such regular expressions to the final draft 
of the 1.0 recommendation.

As mentioned in the summary, I expect that we will implement and 
evaluate before responding as a group.  Of the possible W3C responses,
regardless of whether regex tuple-checking works or not, I expect that
[...]
> > 3)  "LATER - VERSION 1.1"  - You are not happy with the response,
> > but are prepared to defer reconsideration until XML Schema Lang.
> > Version 1.1 is drafted.  It is anticipated (hoped) that Version 1.1
> > will be completed by mid-2001.  Version 1.1 is intended primarily
> > to fix small issues needed by other W3C Working Groups to proceed
> > with their work (especially XML Query Language).  You request that
> > your comments be reconsidered when drafting the Version 1.1
> > requirements document.

will be our response.  There are major benefits to using Schema that we
need in the short term.  Lack of this particular array capability doesn't 
break any of our content, it just makes the constraint checking on attribute
data less rigorous and less consistent.

Again, thanks for the response.  I hope I've understood your responses
satisfactorily.  Further questions/comments welcome.

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code UW/Br Root 200  work 831.656.2149
              Monterey California 93943-5000 USA              fax  831.656.3679
Virtual worlds/underwater robots/Internet     http://web.nps.navy.mil/~brutzman

Received on Thursday, 5 October 2000 03:46:19 UTC