- From: Martin J. Duerst <duerst@w3.org>
- Date: Thu, 13 Jul 2000 14:36:56 -0600
- To: www-xml-schema-comments@w3.org
Dear Schema WG,

[This mail is crossposted to the I18N IG to allow for further discussion. Please feel free to forward these comments to another list, including a public list, but please make sure that you don't reveal the mail addresses of the various groups.]

These are the last call comments on XML Schema Part 0: Primer from the I18N WG/IG. These comments don't discuss changes that may have to be made to the Primer as a consequence of i18n comments on Parts 1/2.

[1] The I18N WG/IG is very pleased that internationalization (i18n) is used as an example to show some core concepts. The comments we make below should not lead to changing to another example domain.

[2] However, the examples chosen give a very inappropriate impression of i18n. I18n is not the extension of a US solution to the UK. This can easily be corrected, and should be corrected. More details are mainly in comments [5] and [10], but also in [3], [4], and [8].

[3] All examples have to use xml:lang for every piece of readable text, i.e. at least all things such as product names and comments, and all elements that contain the aforementioned kind of elements. xml:lang can only be avoided on things like date, price, quantity. This applies both to schemas and instances.

[4] Addresses are flagged from the start with "country='US'". Prices also have to be flagged from the start with a currency to show good practice. Also, 'weight' should either have a comment indicating which metric unit this is, or have an attribute that gives or fixes the (hopefully metric) unit.

[5] Names of elements should be chosen carefully from the beginning. If an element has an attribute 'country' fixed to 'US', then it should from the start be called 'USAddress'. There are a lot of reasons for this, from making sure people know what kind of type the name refers to, both in the short term and in the long term, to making sure the naming is 'politically correct'.

[6] In the examples in 2.3, show that and where datatypes can be international. In particular, include e.g. an accented word in the 'string' example. Where such an example is used (e.g. Table C1), it should use the mechanisms of HTML/XML and display the actual character (in the right column at least).

[7] Notations such as [A-Z]{2} in regexp may work in that specific case, but describing this as 'two upper-case letters' is not correct: there are many more upper-case letters, including accented Latin ones, Greek and Cyrillic ones, and so on. It should be 'two ASCII-only upper-case letters' or something similar, and the text should say that in the general case, letter categories or lists of letters should be used.

[8] Currency codes should use standards, i.e. EUR and not just EU (Section 2.5). [sorry, don't remember the ISO standard number] Of course each schema can do what it wants, but W3C examples should conform to good practice unless there is a specific point in diverging.

[9] At the end of sect. 2.5, there is an alternative given between 'any' and 'string'. The choice and the i18n consequences have to be explained clearly. [see our Schema comments on this issue]

[10] There is more to addresses than US addresses and UK addresses. As examples, Japanese addresses are completely block-based; there are not many street names, and street names don't turn up in addresses. In Singapore (and Hong Kong), the city field is superfluous, because it's the same as the country field. So the 'IPO' schema has to be adapted.

[11] The IPO schema suggests including all the derivations for the various countries. This will lead to a very long file. Some kind of division into files should be used, or at least suggested in the text.

[12] The first paragraph of sect. 2.4 should say that it may be important to use explicit types to help later modification.

Regards, Martin.
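
A minimal sketch illustrating the point of comment [7], written in Python rather than in the schema regexp syntax. The function names and sample strings are hypothetical and are not taken from the Primer. It contrasts the ASCII-only range [A-Z]{2} with a check based on the Unicode general category Lu (uppercase letter), which is the kind of "letter category" the comment refers to:

```python
import re
import unicodedata

# ASCII-only pattern, as in the regexp quoted in comment [7].
ASCII_UPPER_PAIR = re.compile(r"[A-Z]{2}")


def is_ascii_upper_pair(s: str) -> bool:
    """True if s is exactly two ASCII upper-case letters."""
    return ASCII_UPPER_PAIR.fullmatch(s) is not None


def is_unicode_upper_pair(s: str) -> bool:
    """True if s is exactly two characters of Unicode category Lu."""
    return len(s) == 2 and all(unicodedata.category(ch) == "Lu" for ch in s)


if __name__ == "__main__":
    samples = [
        "US",            # ASCII
        "\u00c9\u00c0",  # accented Latin: E acute, A grave
        "\u0391\u0392",  # Greek: Alpha, Beta
        "\u0414\u0416",  # Cyrillic: De, Zhe
        "us",            # lower case, rejected by both checks
    ]
    for s in samples:
        print(s, is_ascii_upper_pair(s), is_unicode_upper_pair(s))
```

Only "US" passes the ASCII-only check, while the accented Latin, Greek and Cyrillic pairs pass the category-based check as well. This is why 'two ASCII-only upper-case letters' is the more accurate description of [A-Z]{2}.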
Received on Thursday, 13 July 2000 17:01:10 UTC