- From: Charles McCathieNevile <chaals@opera.com>
- Date: Wed, 13 Dec 2006 20:28:09 +0530
On Wed, 13 Dec 2006 19:43:10 +0530, Mikko Rantalainen <mikko.rantalainen at peda.net> wrote:

> Charles McCathieNevile wrote:
>> On Wed, 13 Dec 2006 13:17:14 +0530, Henri Sivonen <hsivonen at iki.fi>
>> wrote:
>>> On Dec 13, 2006, at 08:32, Charles McCathieNevile wrote:
>>>> possible *and no simpler* - this is too simple. Maybe assuming you
>>>> can parse numbers out of text is just a dumb idea as a normative
>>>> part of a spec.
>>> The attributes always work for any language. For English, the
>>> textContent works as a *bonus*. It isn't that the spec fails to work
>>> for non-English. It is just that a particular *redundant* bonus
>>> feature doesn't work for non-English.
>> The problem with this is that it means copying code the natural way
>> doesn't work for some non-English speakers, and they have to read the
>> spec or guess why. [...]
>
> I think that "they have to read the spec" is a bonus, too.

Yeah, except it turns out to be wishful thinking of the kind WHATWG tries strenuously to avoid :( And where the problem is people who habitually use other conventions for numbers, it turns out that many of them don't really read English documents or mailing lists either...

> Perhaps the parser could be specified as follows:
>
> regexp for "numeric value" is [0-9 ,.]
> scan the numeric value backwards from end
> first character matching regexp [,.] is the decimal separator
>
> This would correctly interpret numbers such as
>
> 1,251,152.124
> 634.46
> 453.436.346,235

This last is the important use case that the existing method fails.

> 23 236 435 123,121
>
> It would fail for numbers such as
>
> 1,234,456.789,012
> 1.234.456,789.012
>
> but are such formats used in any locale?

Not that I know of. Formats I know of use ".", "," or " " as separators for integer amounts, and "," or "." for decimal separators. The only separators I know of inside the decimal part are "-", "e" and "E".
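
For illustration only, the backwards-scan rule quoted above could be sketched in Python roughly as follows. This is not taken from any spec text. The names `NUMERIC_RUN` and `parse_number` are hypothetical, and the sketch assumes that a numeric run must start and end with a digit and that only the first run in the text is considered.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical names. The pattern follows the quoted character class [0-9 ,.]
# but additionally assumes the run starts and ends with a digit.
NUMERIC_RUN = re.compile(r"[0-9](?:[0-9 ,.]*[0-9])?")


def parse_number(text: str) -> Optional[Decimal]:
    """Return the first number found in text, or None if there is none."""
    match = NUMERIC_RUN.search(text)
    if match is None:
        return None
    token = match.group()

    # Scanning backwards from the end, the first "," or "." found is
    # treated as the decimal separator.
    pos = max(token.rfind(","), token.rfind("."))
    if pos == -1:
        integer_part, fraction = token, ""
    else:
        integer_part, fraction = token[:pos], token[pos + 1:]

    # Anything else that is not a digit is treated as grouping and dropped.
    digits = re.sub(r"[^0-9]", "", integer_part)
    fraction = re.sub(r"[^0-9]", "", fraction)
    return Decimal(f"{digits}.{fraction}") if fraction else Decimal(digits)


if __name__ == "__main__":
    samples = [
        "1,251,152.124",
        "634.46",
        "453.436.346,235",
        "23 236 435 123,121",
        "1,234,456.789,012",  # one of the formats the rule is said to fail on
    ]
    for sample in samples:
        print(sample, "->", parse_number(sample))
```

Under these assumptions the first four samples come out as intended, while "1,234,456.789,012" is read as 1234456789.012. The rule is also ambiguous for a grouped integer with no decimal part: "1,251" would be read as 1.251.
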
I can imagine someone using the "e"/"E" exponent notation for web content in a meter, but I am not sure that it is likely. Of course there are a handful of other types of numbers.

One thing that is helpful is that in Hebrew and Arabic, numbers are written LTR even though the rest of the text isn't. I am not sure about other RTL languages - apparently there are a couple of Indic ones. On the other hand, since I am going to meet a handful of people this weekend who specialise in publishing for the Indian government, in at least their 22 constitutionally official languages, I will try to remember to ask.

One thing that is unhelpful is that in some languages numbers are written using ordinary letters, although I suspect this use is very rare on the web, as I believe it is pretty much archaic in the relevant languages.

This is, of course, going down the path of specifying internationalised number picking - something that some people are just dead against.

cheers

Chaals

-- 
Charles McCathieNevile, Opera Software: Standards Group
hablo español - je parle français - jeg lærer norsk
chaals at opera.com Try Opera 9 now! http://opera.com
Received on Wednesday, 13 December 2006 06:58:09 UTC