- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 07 Mar 2002 16:48:55 +0900
- To: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "Pat Hayes" <phayes@ai.uwf.edu>, "Patrick Stickler" <patrick.stickler@nokia.com>
- Cc: <w3c-rdfcore-wg@w3.org>, <w3c-i18n-ig@w3.org>
Hello Jeremy, The situation is a little bit different. See below. At 18:41 02/03/01 +0000, Jeremy Carroll wrote: >I am copying this to i18n-ig. > >Pat Hayes: > > > > But are they? We have considered German versus English decimals and > > US versus UK date formats within the datatyping discussion. Seems to > > me that datatyping and language tagging are going to be seen as > > closely related in such cases. In the context of xsd:decimal, does > > "10,03"-fi equal "10.03"-en ? >I discussed this briefly with Misha and Martin (of the I18N-WG) at the >plenary. > >I asked why they didn't require XML Schema to support language tagging of >numbers (as in Pat's example). In some sense, we would have liked to do that, but that would have led to a large number of problems that would have complicated and delayed the XML Schema spec much more than worth while. In particular: - Going only as far as 'this is a French date' would have been easy, but not very useful (no validation, no conversion). - For more useful operations (lexical validation, conversion), one of the following solutions would have had to be chosen (maybe differently per datatype): - Predefine the types for a number of languages (excluding all the other languages, and choosing a format for a specific language where maybe more than one exists) - Leaving definitions implementation-defined (leading to various interoperability problems, such as different lanuages supported by different implementations, and differences in definitions for the same language, or use of more than one format for the same datatype in the same language) - Defining new mechanisms to allow users to specify their idea e.g. of a French date (for lexical validation, patterns can go a very long way here, but for conversion, more is needed, and often the needs go up to full programming languages). >Their take was that this is an input software localization problem, rather >than a document format internationalization problem. > >i.e. software running in a german locale should display and accept 10,03 as >a number a little more than ten. However, when communicating that number >with other software (even in the same locale, and even in a markup document >that may have human readers) it should use the US form of the number. > >It surprised me. I don't think we said 'markup document', e.g. in the sense of HTML. This would surprise me, too. For documents that already contain actual text that corresponds to some data value, the usual solution is to provide both a formatted, processable (XML-Schema) value as well as the actual text (which depending on the document class might be checked with a pattern). An example would be: <date value='2002-03-11'>next Monday<date> This is the form that would be used in the scenario of 'marking up a pre-existing document'. For RDF, I can think about several different ways to express this as a graph, but I'm sure there are other people who are better at this. >But in my book the I18N people get to decide this. >Looks like this example is out of scope. Yes indeed, we would have to agree that "10,03"-fi equal? "10.03"-en is out of scope, but not because the I18N WG/IG requires all XML documents to use XML Schema formatted datatypes only, but because any spec that would want to define the above equivalences for a significant number of languages and datatypes in an interoperable and user-acceptable way might take years or more. Regards, Martin.
Received on Thursday, 7 March 2002 04:53:39 UTC