LC-219 laying the foundations for better i18n and l10n

Dear Martin and Misha:

The W3C XML Schema Working Group has spent the last several months
working through the comments received from the public on the last-call
draft of the XML Schema specification.  We thank you for the comments
you made on our specification during our last-call comment period, and
want to make sure you know that all comments received during the
last-call comment period have been recorded in our last-call issues
list (http://www.w3.org/2000/05/12-xmlschema-lcissues).

Among other issues, you raised the point registered as issue LC-219,
which suggests that the XML Schema specification be modified in
various ways to lay the groundwork for supporting (in some future
version) the definition of locale-dependent datatypes.

You will be disappointed to learn that, in this area, there is no
significant change to report between the last-call draft upon which
you commented and the draft we propose to request be published as a
Candidate Recommendation.

In particular, the XML Schema WG was not persuaded to share your
position that if possible the transfer syntax for all simple types
should be made unlike the format used in any locale; if there is a
reason you are persuaded that rebarbative lexical forms are helpful to
the short- or long-term interests of human beings using computers, you
did not manage to make us perceive it or to lead us to reach that
conclusion with you.

Some members of the WG feel that the analysis of the situation
reflected in your suggestions relies on the false assumption that XML
Schema datatypes are useful and will be used primarily in the
presentation and processing of data in databases or forms, and not in
the definition of schemas for the encoding of textual documents.  In
document-based applications, it is essential to be able to define
types with locale-specific lexical forms, but it would be wholly wrong
to translate these into some locale-independent lexical form for
transmission.

The XML Schema WG did consider, at some length, a proposal intended to
make it possible, in the long run, to provide the kinds of
functionality you mention in the portions of your comments included
under this issue: definition of simple types with the same value
spaces but different (e.g. locale-dependent) lexical spaces,
specification of the mapping between types with variant lexical spaces
and the canonical lexical form for a value of a given simple type,
specification of the relationship between items with locale-dependent
or non-standard lexical forms and items with a standard lexical form
denoting an equivalent value, etc.  This was a proposal to define
abstract simple types corresponding to the built-in simple types of
the XML Schema spec, and to allow schema authors to derive concrete
types from those abstract types.  This or some similar mechanism would
make it possible to begin to describe the relationship among lexical
forms like "1000", "1,000", "1.000", and "1 000", or among "1 Dec
1900", "1900-12-01", "1.12.1900", etc.  (It is plausible to suppose
that some very different mechanism might provide a different but
equally effective handle on this problem, but the proposal for
abstract types is the only proposal anyone has put on the table for
inclusion in XML Schema.  If you have other ideas, please share them.)
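
As a purely illustrative sketch of the idea of several lexical spaces
sharing one value space (this is not the abstract-types proposal as it
was specified, and the Python names GroupedInteger, PLAIN, COMMA, POINT
and SPACE are hypothetical, introduced only for this example), the
relationship among "1000", "1,000", "1.000", and "1 000" might be
modelled like this:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupedInteger:
    """Hypothetical concrete type: non-negative integers as the value
    space, with one digit-grouping convention as the lexical space."""
    name: str
    group_sep: str  # "" means digits are not grouped

    def parse(self, lexical: str) -> int:
        """Map a lexical form in this type's lexical space to its value."""
        if self.group_sep == "":
            pattern = r"[0-9]+"
        else:
            sep = re.escape(self.group_sep)
            pattern = rf"[0-9]{{1,3}}(?:{sep}[0-9]{{3}})*"
        if re.fullmatch(pattern, lexical) is None:
            raise ValueError(
                f"{lexical!r} is not in the lexical space of {self.name}")
        digits = lexical.replace(self.group_sep, "") if self.group_sep else lexical
        return int(digits)

    def format(self, value: int) -> str:
        """Map a value back to this type's lexical form."""
        if value < 0:
            raise ValueError("this sketch covers non-negative integers only")
        # Python's "," format option groups digits in threes; swap in our separator.
        return f"{value:,}".replace(",", self.group_sep)


# Four assumed concrete types sharing the same value space.
PLAIN = GroupedInteger("plain", "")
COMMA = GroupedInteger("comma-grouped", ",")
POINT = GroupedInteger("point-grouped", ".")
SPACE = GroupedInteger("space-grouped", " ")
```

Under these assumptions, PLAIN.parse("1000"), COMMA.parse("1,000"),
POINT.parse("1.000") and SPACE.parse("1 000") all denote the same
value, while COMMA.parse("1.000") is rejected because that form lies
outside the comma-grouped lexical space.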

You may remember this proposal.  You argued strongly against it; in
the long run your position prevailed and the proposal, having been
accepted, was later backed out of the spec.  Given the failure of that
proposal, it is hard to say what the XML Schema specification could do
to lay the groundwork for better support, in the future, for
internationalization and localization.

In conclusion, I believe that you are essentially correct that
progress in this area requires better communication and coordination
among WGs.  The positions taken by the I18n WG in its last-call
comments on XML Schema, and in your comments on issues raised by
others, have thoroughly perplexed and dismayed many WG members
interested in better support for diverse languages and cultures.  It
is clear that we need to try to communicate and to understand each
other better.

It would seem almost sarcastic to ask, after the report above, in the
usual way "whether you are satisfied with the decision taken by the WG
on this issue"; I will instead ask merely if you wish your dissent
from the WG's decisions in this area to be recorded for consideration
by the Director of the W3C.

with best regards,

-C. M. Sperberg-McQueen
  World Wide Web Consortium
  Co-chair, W3C XML Schema WG

Received on Thursday, 5 October 2000 21:50:14 UTC