XML Schema syntax

From: Philip Wadler <wadler@research.bell-labs.com>
Date: Tue, 01 Feb 2000 18:05:43 -0500
Message-Id: <200002012305.SAA22281@nslocum.cs.bell-labs.com>
To: www-xml-schema-comments@w3.org, xml-dev@ic.ac.uk

Am I alone in finding the syntax of XML Schema hard to use?
I believe the syntax could be considerably improved, simply by
making the structure of the XML tree correspond to the structure
of the corresponding regular expression.

Consider the following DTD content model, written as a regular expression:

	ab?(c|d)*

In the current XML Schema syntax, this is rendered as follows:

	<group order="sequence">
	  <element name="a"/>
	  <element name="b" minOccurrence="0" maxOccurrence="1"/>
	  <group order="choice" minOccurrence="0" maxOccurrence="*">
	    <element name="c"/>
	    <element name="d"/>
	  </group>
	</group>

I would argue that it would be better to use a syntax along
the following lines:

	<sequence>
	  <element name="a"/>
	  <repeat min="0" max="1">
	    <element name="b"/>
	  </repeat>
	  <repeat min="0" max="*">
	    <choice>
	      <element name="c"/>
	      <element name="d"/>
	    </choice>
	  </repeat>
	</sequence>

This is better for the following reasons.

*  The structure of the XML corresponds closely to the structure
   of the parse tree for the corresponding regular expression.
   This makes it easier to read, easier to learn, and easier to
   build processors.

*  The close tags are more informative ("repeat", "choice", and
   "sequence", rather than just "group").

*  One can use XML Schema to specify subsets of XML Schema.  For example,
   for ease of processing one might want to specify a subset of XML
   Schema in which each `choice' element contains only `element'
   elements, and each `repeat' element contains either a single `element'
   element or a `choice' element.  This is easy to do with the new
   syntax, but I think it is impossible with the current syntax.
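
As a rough illustration of the first point, the following Python sketch
walks a tree in the proposed syntax and prints the corresponding regular
expression, one case per element kind.  The function name `to_regex` and
the `{min,max}` fallback notation are assumptions made for this example;
they are not part of the proposal or of any XML Schema draft.  Names are
simply concatenated, as in the single-letter example above.

```python
import xml.etree.ElementTree as ET

# Conventional suffixes for the common (min, max) pairs.
SUFFIXES = {("0", "1"): "?", ("0", "*"): "*", ("1", "*"): "+"}


def to_regex(node):
    """Render one node of the proposed syntax as a regular expression.

    Hypothetical helper: one branch per element kind, mirroring the
    parse tree of the regular expression.
    """
    if node.tag == "element":
        return node.get("name")
    if node.tag == "sequence":
        return "".join(to_regex(child) for child in node)
    if node.tag == "choice":
        return "(" + "|".join(to_regex(child) for child in node) + ")"
    if node.tag == "repeat":
        if len(node) != 1:
            raise ValueError("repeat must contain exactly one child")
        child = node[0]
        body = to_regex(child)
        # Parenthesise bodies that would otherwise bind incorrectly.
        if child.tag in ("sequence", "repeat"):
            body = "(" + body + ")"
        bounds = (node.get("min"), node.get("max"))
        # Fall back to an explicit {min,max} form (an assumed notation).
        return body + SUFFIXES.get(bounds, "{%s,%s}" % bounds)
    raise ValueError("unknown element: " + node.tag)


example = """
<sequence>
  <element name="a"/>
  <repeat min="0" max="1">
    <element name="b"/>
  </repeat>
  <repeat min="0" max="*">
    <choice>
      <element name="c"/>
      <element name="d"/>
    </choice>
  </repeat>
</sequence>
"""

print(to_regex(ET.fromstring(example)))  # prints ab?(c|d)*
```

Each branch handles exactly one element kind, which is the sense in which
the XML tree and the parse tree line up.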

One could also define

   <star>...</star>	to abbreviate	<repeat min="0" max="*">...</repeat>
   <plus>...</plus>	to abbreviate	<repeat min="1" max="*">...</repeat>
   <query>...</query>	to abbreviate	<repeat min="0" max="1">...</repeat>

This would further increase readability and ease of learning.
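
A processor could treat these abbreviations as pure sugar and expand
them before doing anything else.  The sketch below assumes the three
element names given above; the `expand` function and the `ABBREVIATIONS`
table are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Abbreviation name -> (min, max) of the <repeat> it stands for.
ABBREVIATIONS = {
    "star": ("0", "*"),
    "plus": ("1", "*"),
    "query": ("0", "1"),
}


def expand(node):
    """Rewrite <star>, <plus> and <query> into <repeat>, in place.

    Returns the same node so that calls can be chained.
    """
    for child in node:
        expand(child)
    if node.tag in ABBREVIATIONS:
        lo, hi = ABBREVIATIONS[node.tag]
        node.tag = "repeat"
        node.set("min", lo)
        node.set("max", hi)
    return node


short = ET.fromstring(
    '<sequence>'
    '<element name="a"/>'
    '<query><element name="b"/></query>'
    '<star><choice><element name="c"/><element name="d"/></choice></star>'
    '</sequence>'
)
print(ET.tostring(expand(short), encoding="unicode"))
```

After expansion the tree uses only `sequence', `choice', `repeat' and
`element', so later stages need not know about the abbreviations.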

The XML Schema group is unlikely to make a change of this magnitude
unless it is clear that it enjoys widespread support.  So if you read
this message and agree with it, please make your views known!

Cheers,  -- P

-----------------------------------------------------------------------
Philip Wadler                             wadler@research.bell-labs.com
Bell Labs, Lucent Technologies      http://www.cs.bell-labs.com/~wadler
600 Mountain Ave, room 2T-402                   office: +1 908 582 4004
Murray Hill, NJ 07974-0636                         fax: +1 908 582 5857
USA                                               home: +1 908 626 9252
-----------------------------------------------------------------------
Received on Tuesday, 1 February 2000 18:06:03 GMT