
LC-220 single lexical representations

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Fri, 06 Oct 2000 01:08:35 -0600
To: "Martin J. Duerst" <duerst@w3.org>, Misha Wolf <misha.wolf@reuters.com>
Cc: W3C XML Schema Comments list <www-xml-schema-comments@w3.org>
Dear Misha and Martin:

The W3C XML Schema Working Group has spent the last several months
working through the comments received from the public on the last-call
draft of the XML Schema specification.  We thank you for the comments
you made on our specification during our last-call comment period, and
want to make sure you know that all comments received during the
last-call comment period have been recorded in our last-call issues
list (http://www.w3.org/2000/05/12-xmlschema-lcissues).

Among other issues, you raised the point registered as issue LC-220,
which suggests that XML Schema be revised to prescribe, for each value
of each simple type, at most one lexical form.

The WG discussed this point in a variety of contexts over the course
of the summer, and was not persuaded that enforcing a restriction to
single lexical representations was desirable or would have any effect
on the long-term prospects for internationalization.  Most simple
types in the spec do have single lexical representations; some do not.
In those cases where multiple lexical representations are allowed, the
legal variation (e.g. 10, 00010, and 10.0) does not seem to us either
to favor a particular culture or geographic area above any other, or
to have any particular effect on the ability of software systems to
handle the data.  Enforcing a rule against decimal numbers which omit
the decimal point and trailing zeroes, or against numbers with
non-significant leading zeroes, entails extra work for most
implementors, since these variations are already handled silently by
existing libraries for dealing with numeric data.
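
As a rough illustration of the point about existing libraries, the
following Python sketch parses the three lexical forms mentioned above
with the standard decimal module and checks that they denote one
value.  The variable names are illustrative only, and nothing here is
taken from the XML Schema specification itself.

```python
from decimal import Decimal

# The three lexical forms cited in the message; each denotes the value ten.
lexical_forms = ["10", "00010", "10.0"]

# A stock numeric library accepts leading zeros and an optional
# fractional part without any extra handling on our side.
values = [Decimal(text) for text in lexical_forms]

# Decimal compares by numeric value, so the variants are indistinguishable
# once parsed.
all_equal = all(value == values[0] for value in values)
print(all_equal)
```

Rejecting "00010" or "10" would require an additional check on the
input string before it ever reaches the library.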

So, with respect, the XML Schema WG declines to make the suggested
change.

For each simple type which allows multiple lexical representations for
the same value, the specification does define a 'canonical form' for
each value.  Applications which need, for whatever reason, to rely on
there being only one lexical form for any value of a given type may
use that canonical form.  In the long run, it is plausible to suppose
that canonical forms may play a role in providing support for multiple
lexical spaces covering the same value space (the meaning of a variant
lexical form may be defined as its mapping onto a canonical form), but
in XML Schema 1.0 the canonical forms have no special role to play;
they are merely defined for the use of applications which need them.
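
An application that wants exactly one lexical form per value could
normalize its input along the following lines.  This is a minimal
sketch in Python: the function name canonical_decimal is hypothetical,
and the rule it applies (no leading or trailing zeros, a mandatory
decimal point, at least one digit on each side of it) is an assumption
made for illustration, not a quotation of the canonical form defined
in the specification.

```python
import re
from decimal import Decimal

# Assumed lexical space: optional sign, digits, optional fractional part.
_DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def canonical_decimal(text: str) -> str:
    """Map any accepted lexical form to a single (assumed) canonical form."""
    if not _DECIMAL_LEXICAL.fullmatch(text):
        raise ValueError(f"not a decimal lexical form: {text!r}")

    # Fixed-point formatting avoids exponent notation.
    plain = format(Decimal(text), "f")
    sign = "-" if plain.startswith("-") else ""
    whole, _, fraction = plain.lstrip("+-").partition(".")

    # Drop non-significant zeros but keep one digit on each side.
    whole = whole.lstrip("0") or "0"
    fraction = fraction.rstrip("0") or "0"

    # Treat negative zero as plain zero.
    if whole == "0" and fraction == "0":
        sign = ""
    return f"{sign}{whole}.{fraction}"


for form in ["10", "00010", "10.0"]:
    print(form, "->", canonical_decimal(form))
```

Under the assumed rule, all three variants map to the same string, so
two values can be compared by comparing their canonical forms.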

It would be helpful to us to know whether you are satisfied with the
decision taken by the WG on this issue, or wish your dissent from the
WG's decision to be recorded for consideration by the Director of the
W3C.

with best regards,

-C. M. Sperberg-McQueen
  World Wide Web Consortium
  Co-chair, W3C XML Schema WG
Received on Thursday, 5 October 2000 21:50:14 UTC
