W3C home > Mailing lists > Public > xmlschema-dev@w3.org > June 2001

Re: ASN.1 => XML Schema questions

From: Martin Duerst <duerst@w3.org>
Date: Wed, 27 Jun 2001 12:01:37 +0900
Message-Id: <4.2.0.58.J.20010627115246.03d7fc50@sh.w3.mag.keio.ac.jp>
To: Noah_Mendelsohn@lotus.com, Dan Connolly <connolly@w3.org>
Cc: elgey@dstc.qut.edu.au, kohsukekawaguchi@yahoo.com, xmlschema-dev@w3.org
Hello Noah,

At 20:31 01/06/26 -0400, Noah_Mendelsohn@lotus.com wrote:
>As to the reason for limitations on <all>.  My recollection is that
>certain members of the workgroup claimed that experiences with SGML had
>shown that support for generalized <all>  would be complex and expensive.
>Perhaps this was due to the general wish to use DFA technology, perhaps
>for more subtle reasons.

I have heard DFA (deterministic finite automata) claimed as a reason
for this, too, but it is definitely unrelated. I haven't seen any
'more subtle reason' clearly spelled out.

DFAs are a very bad match to any kind of <all> construct, starting
with the very simple one where every element has to be present once,
continuing with the design currently in XML Schema, and up to the
general design that allows arbitrary occurrence constraints for
each element. For the two first cases, a DFA for an <all> construct
with n elements grows 2**n (2 to the power of n). Thus using a DFA
to implement an <all> group with 20 children needs a few megabytes;
an <all> group with 40 children needs a few terrabytes.
On the other hand, it is very easy to implement <all> groups if
one does not insist on DFA; a simple bit vector will deal with the
design currently in XML Schema; for a more general design, you need
a vector of integers. With the same amout of memory that a DFA needs
to handle 20 children, a vector of integers can handle a million or
more children.

Regards,   Martin.


>The initial proposal was therefore to stick
>with the XML 1.0 decision and provide no such support for the so-called
>"AND connector".  Some of us pointed out that a limited form of <all>,
>allowing only elements, is a particularly good match to database columns,
>programming language structures, and other systems in which fields must be
>unique and named, but not ordered.  The limited version of <all> was
>introduced primarily as an 80/20 trade-off to meet an interesting fraction
>of this need.  Implementation of the limited <all> in schema is very
>simple.  Later, the argument was made that allowing minOccurs=0 (I.e.
>Optional elements in an <all> group) added only modest complexity and was
>a significant value.  It also has what I believe to be the appealing
>property that it is symmetric with the rules for attributes:  at most once
>each of the named elements, with some being required if you want.
>
>So, that is my unofficial recollection of how the design came to be as it
>is.
>
>------------------------------------------------------------------------
>Noah Mendelsohn                                    Voice: 1-617-693-4036
>Lotus Development Corp.                            Fax: 1-617-693-8676
>One Rogers Street
>Cambridge, MA 02142
>------------------------------------------------------------------------
>
>
Received on Wednesday, 27 June 2001 01:25:41 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 11 January 2011 00:14:21 GMT