[Bug 5732] Provide a simplified syntax for XSD 1.1

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5732


Paolo Marinelli <pmarinel@cs.unibo.it> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pmarinel@cs.unibo.it




--- Comment #1 from Paolo Marinelli <pmarinel@cs.unibo.it>  2008-09-19 23:58:58 ---
[I originally sent this message to the XML Schema Interest Group
(http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2008Sep/0022.html). As
suggested during the last telecon, I am copying it here in order to make it
public. Michael Kay's reply will be copied as an additional comment to this
bug
(http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2008Sep/0023.html).]

One of the points of criticism about XSD concerns its syntax. In particular,
the normative XML-based syntax of schema documents is often considered too
verbose, and it is argued that as such it has a low degree of
human-readability. As pointed out by Erik Wilde, development tools provide
mechanisms to ease the creation of XSD schema documents. Such mechanisms
typically consist of graphical interfaces that hide the underlying syntax and
let authors work with visual objects representing XSD components. The problem
is that each tool provides its own graphical interface, so schema authors are
required to learn a new interface each time they switch to a different tool
[1].

A completely different approach is the development of alternative syntaxes
for XSD. In the literature there are some proposals going in that direction:
XSCS (XML Schema Compact Syntax) [2], DTD++ 2.0 [3] and Extended DTD [4].
While XSCS and DTD++ have been proposed as alternative syntaxes for XML
Schema, the aim of Extended DTD is to improve the expressivity of DTD, not to
be an alternative notation for XML Schema. For this reason we will not
investigate it further here.

Both XSCS and DTD++ are non-XML syntaxes, as the verbosity of the normative XSD
syntax is at least in part due to its being XML-based. However, neither XSCS
nor DTD++ 2.0 is up to date with version 1.1 of XSD. Indeed, XSCS was designed
as a compact syntax for XSD 1.0, while DTD++ 2.0 was designed for SchemaPath
(an extension of XSD 1.0 adding constructs for conditional type assignments in
element declarations). Consequently, neither XSCS nor DTD++ supports the new
features introduced by XSD 1.1, e.g., assertions, type alternatives, open
content, versioning, and so on. Thus, as they stand, they cannot be directly
used as alternative syntaxes for XSD 1.1.

The aim of this mail is not to propose a new alternative syntax for XML Schema
1.1 but rather to promote a discussion about the requirements that an
alternative notation for XSD 1.1 should meet.

So here is a proposed list of requirements, which I wrote after some
discussions within our research team here in Bologna, led by Fabio Vitali.

1. The new syntax should be at least partially non-XML. The normative XSD
syntax has been designed to capture every feature of the XSD model. So
attempts to find alternative full-XML syntaxes would likely end up with
notations very similar to the normative one (unless the new syntax is aimed at
covering only a limited subset of the XSD features).

2. The new syntax should be defined in terms of the XSD model. The compact
syntax for XSD 1.1 should not be designed merely as a non-XML reformulation of
the normative notation, but as a way to reformulate the conceptual model.
Nonetheless, the compact syntax should express all the XSD semantics. In order
to be considered a reasonable alternative to the normative syntax, a compact
syntax for XSD should be able to represent all the features of XSD, either in a
completely compact form or (as happens, for example, with XPath) with a
combination of compact and non-compact forms.

3. The new syntax should be as close as possible to DTD. DTDs were the
official syntax for XML schemas for many years (first in SGML, and then in
XML) and they are still widely used. Also, when someone wants to write by hand
the content model of an XSD complex type on a sheet of paper, he/she will
probably resort to DTD syntax. So we believe that the DTD syntax is widely
known in the community of schema authors, and consequently we believe it is
worth designing a notation whose constructs are defined, where possible, using
the conventions adopted by DTDs for similar purposes.

4. The new syntax should be DTD-compatible. Every DTD should be a legal schema
in the new syntax. On the one hand, this requirement can be seen as a
strengthening of the previous point. But on the other hand, jointly with the
second requirement it also has a practical advantage: every DTD can be parsed
into an XSD model without the need of intermediate conversion tools.

5. The new syntax should follow a flat structure. XML makes it straightforward
to represent nested structures. The XSD normative syntax makes use of this
capability in a number of situations, the most obvious of which are anonymous
type definitions and local declarations. We say that a notation has a deep
structure if it allows nested structures to be represented "natively". For
instance, the RELAX NG compact syntax has a deep structure: each component is
delimited by a pair of opening and closing curly brackets. On the other hand,
DTDs do not have a deep structure. Indeed, roughly speaking, a DTD is mainly a
sequence of element declarations, and it is not possible to nest element
declarations within content model definitions. Thus we say that DTD has a flat
structure. From requirements 3 and 4 (similarity to and compatibility with
DTDs), it follows that the new syntax for XSD should adopt a flat structure.
But this is not the only reason why we are in favour of a flat structure.
Indeed, we believe that a notation designed along the same lines as the RELAX
NG compact syntax would result in just a re-encoding of XML without tags.

6. The new syntax may provide an escaping mechanism. Similarity to DTDs is a
requirement we impose. At the same time, we recognize that representing all
the XSD features in a DTD-like notation might end up with a convoluted syntax.
For this reason we believe that it could be useful to allow some XSD features
to be expressed in the normative XML representation. Clearly, the majority of
the XSD features should be expressed in the compact syntax, and only subtle
aspects should require the use of the normative notation. In any case, we
believe that the alternative syntax should at least support the following XSD
features without resorting to the XML representation:
  a. Element and attribute declarations
  b. Simple and complex type definitions
  c. Attribute and model group definitions
  d. Derivations (in all forms)
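
To make the notion of flat structure in point 5 more concrete, here is a
minimal Python sketch. It is not part of DTD++ or of any existing tool: the
helper name, the regular expression and the sample DTD are assumptions made
only for illustration, and only plain element declarations are handled. It
reads a DTD into a flat table with one entry per element declaration.

```python
import re

# Hypothetical helper, for illustration only: it recognizes plain
# <!ELEMENT name content-model> declarations and ignores everything else
# (attribute lists, entities, comments, parameter entity references).
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)\s*>", re.DOTALL)


def parse_element_declarations(dtd_text):
    """Return a flat mapping from element name to its content model string."""
    return {name: model for name, model in ELEMENT_DECL.findall(dtd_text)}


# Sample DTD invented for this example.
dtd = """
<!ELEMENT book (title, chapter+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT chapter (title, para*)>
<!ELEMENT para (#PCDATA)>
"""

declarations = parse_element_declarations(dtd)
for name, model in declarations.items():
    print(name, "->", model)
```

Every declaration sits at the top level: the content model of "chapter" refers
to "title" and "para" by name instead of containing their declarations, which
is what we mean by a flat structure.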

With those design issues in mind, we are developing a compact syntax for XSD
1.1, based on the work we already did in the context of DTD++ 2.0. Although we
don't have an official name yet, here we call this new syntax DTD++ 3.0. The
work is still in progress, but we believe DTD++ 3.0 has reached a state in
which it can be assessed.

However, we are not interested here in presenting DTD++ 3.0. We think that for
the moment the discussion should focus on the requirements listed above.

Regards,
Paolo Marinelli


REFERENCES

[1] Wilde, Erik. A Compact Syntax for W3C XML Schema. XML.com: XML From the
Inside Out. August 27, 2003. http://www.xml.com/pub/a/2003/08/27/xscs.html.

[2] Wilde, Erik and Stillhard, Kilian. A Compact XML Schema Syntax. XML Europe
2003. London, UK. 2003.

[3] Fiorello, Davide, et al. DTD++ 2.0: adding support for co-constraints.
Extreme Markup Languages 2004. Montreal, Quebec. 2004.

[4] Wei, Shan and Liu, Mengchi. Making DTD a Truly Powerful Schema Language.
APWeb. April 1, 2005. pp. 333-338.



Received on Friday, 19 September 2008 23:59:34 UTC