- From: Martin J. Duerst <duerst@w3.org>
- Date: Thu, 13 Jul 2000 14:36:56 -0600
- To: www-xml-schema-comments@w3.org
Dear Schema WG,
[This mail is crossposted to the I18N IG to allow for further discussion.
Please feel free to forward these comments to another list, including
a public list, but please make sure that you don't reveal the mail
addresses of the various groups.]
These are the last call comments on XML Schema Part 0: Primer
from the I18N WG/IG.
These comments do not cover changes to the Primer that may be
needed to follow changes in Parts 1/2 made in response to the
i18n comments on those two specs.
[1] The I18N WG/IG is very pleased that internationalization (i18n)
is used as an example to show some core concepts. The comments
we make below should not lead to changing to another example
domain.
[2] However, the examples chosen give a very inappropriate impression
of i18n. I18n is not the extension of a US solution to the UK.
This can easily be corrected, and should be corrected. More details
are mainly in comments [5] and [10], but also in [3], [4], and [8].
[3] All examples have to use xml:lang for every piece of
readable text, at least on things such as product names and
comments, and on all elements that contain elements of that
kind. xml:lang can only be omitted on things like
date, price, quantity. This applies both to schemas and instances.
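
The rule in [3] can be checked mechanically. The following is a
minimal sketch in Python, not part of the Primer: it walks an instance
and lists elements that carry readable text but have no xml:lang in
scope, treating xml:lang as inherited from ancestors. The element
names (productName, comment, quantity, USPrice, shipDate) and the
NON_TEXT exemption list are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical list of elements whose content is not natural-language text.
NON_TEXT = {"shipDate", "USPrice", "quantity"}


def missing_lang(xml_text):
    """Return tags of text-bearing elements with no xml:lang in scope."""
    root = ET.fromstring(xml_text)
    missing = []

    def walk(elem, inherited):
        # xml:lang on an element applies to it and to its descendants.
        lang = elem.get(XML_LANG, inherited)
        has_text = bool(elem.text and elem.text.strip())
        if has_text and elem.tag not in NON_TEXT and lang is None:
            missing.append(elem.tag)
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return missing


SAMPLE = """<item>
  <productName xml:lang="en">Lawnmower</productName>
  <comment>Confirm this is electric</comment>
  <quantity>1</quantity>
  <USPrice>148.95</USPrice>
</item>"""

print(missing_lang(SAMPLE))  # ['comment']
```

Putting xml:lang on the containing element instead (here, item)
covers all of its children, which is why [3] also asks for it on
elements that contain readable text elements.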
[4] Addresses are flagged from the start with "country='US'".
Prices also have to be flagged from the start with a currency
to show good practice. Also, 'weight' should either have
a comment indicating which metric unit this is, or have an
attribute that gives or fixes the (hopefully metric) unit.
[5] Names of elements should be chosen carefully from the beginning.
If an element has an attribute 'country' fixed to 'US', then it
should from the start be called 'USAddress'. There are a lot of
reasons for this, from making sure people know, in both the short
and the long term, what kind of type the name refers to, to making
sure the naming is 'politically correct'.
[6] In the examples in 2.3, show that and where datatypes can be
international. In particular, include e.g. an accented word in
the 'string' example. Where such an example is used (e.g. Table
C1), it should use the mechanisms of HTML/XML and display the
actual character (in the right column at least).
[7] Notations such as [A-Z]{2} in regexp may work in that specific
case, but 'two upper-case letters' is not correct (there are
many more upper-case letters, including accented Latin ones,
Greek and Cyrillic ones,...). The description should be 'two
ASCII-only upper-case letters' or something similar, and the text
should say that in the general case, letter categories or lists of
letters should be used.
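
The difference described in [7] can be seen in a short sketch, written
here in Python rather than in schema syntax. Python's re module has no
category escapes, so the general-category test via unicodedata stands
in for a category escape such as \p{Lu}; the function names and sample
strings are made up for illustration.

```python
import re
import unicodedata

# Matches exactly two ASCII upper-case letters, nothing else.
ASCII_UPPER_2 = re.compile(r"[A-Z]{2}")


def two_ascii_upper(s):
    """True if s is exactly two letters from A-Z."""
    return ASCII_UPPER_2.fullmatch(s) is not None


def two_unicode_upper(s):
    """True if s is exactly two characters of Unicode category Lu."""
    return len(s) == 2 and all(unicodedata.category(c) == "Lu" for c in s)


SAMPLES = [
    "NY",            # ASCII
    "\u00c9\u00c0",  # accented Latin: E acute, A grave
    "\u0391\u0392",  # Greek: Alpha, Beta
    "\u0414\u0416",  # Cyrillic: De, Zhe
    "ny",            # lower case, rejected by both
]

for sample in SAMPLES:
    print(sample, two_ascii_upper(sample), two_unicode_upper(sample))
```

Only "NY" passes the [A-Z]{2} test, while the category-based test
also accepts the accented Latin, Greek, and Cyrillic pairs.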
[8] Currency codes should use standards, i.e. EUR and not just
EU. (Section 2.5). [sorry, don't remember the ISO standard number]
Of course each schema can do what it wants, but W3C examples
should conform to good practice unless there is a specific
point in diverging.
[9] At the end of sect 2.5, there is an alternative given between
'any' and 'string'. The choice and the i18n consequences have to
be explained clearly. [see our Schema comments on this issue]
[10] There is more to addresses than US addresses and UK addresses.
As examples, Japanese addresses are completely block-based;
there are not many street names, and street names don't turn
up in addresses. In Singapore (and Hong Kong), the city field
is superfluous, because it's the same as the country field.
So the 'IPO' schema has to be adapted.
[11] The IPO schema suggests including all the derivations for
the various countries in one schema. This will lead to a very
long file. Some kind of division into several files should be
used, or at least should be suggested in the text.
[12] The first paragraph of sect. 2.4 should say that it may be
important to use explicit types to help later modification.
Regards, Martin.
Received on Thursday, 13 July 2000 17:01:10 UTC