- From: Leif Halvard Silli <lhs@malform.no>
- Date: Fri, 13 Feb 2009 23:18:06 +0100
- To: fantasai <fantasai.lists@inkedblade.net>
- CC: www-style@w3.org, www-international@w3.org, Håkon Wium Lie <howcome@opera.com>
fantasai 2009-02-13 20.32:
> Aryeh Gregor wrote:
>>
>> Also, pragmatically, it would be very cumbersome to add enumeration of
>> all an alphabet's letters for every language people can think up.
>> You'd have to have a different list-style-type for most languages --
>> even Latin-based alphabets differ on what they think the exact set of
>> letters is, and what their order is. It seems like this would greatly
>> bloat the spec.
>
> Yeah, I think if we're going down that route we should define keywords
> for the most commonly-used alphabetic orders, and introduce a functional
> notation for everything else. How often do we need, e.g. upper-norwegian,
> given that lists are usually less than 26 letters?
>
>   alpha("a-z")
>   alpha("a-f,q-z")
>   alpha("do,re,mi,fa,so,la,ti")

Do you use 'alpha' for "Latin alphabet"? Or could alpha be used for
Cyrillic as well? If you are taking your pattern from the way regex/grep
works, then remember that e.g. \p{Armenian} matches any character in the
Armenian block.[1] Hence e.g. alpha(armenian) could also be useful.

> Unicode can fill in ranges, so unless there are a lot of scripts like
> Ethiopic, where every language seems to have picked its own order for
> the letters, this doesn't have to be that painful.

Let's take one example, the Slovak alphabet, about which Wikipedia says:
"The lexicographic ordering of the Slovak alphabet is very similar to
that of English": [2]

  alpha("a-d,dz,e-h,ch,i-z")

And there are several such alphabets.[3] It can be complicated.

But on the whole, what you propose here would be very good to have. I
would much rather see this implemented across UAs than e.g.
"upper-norwegian". (Although I also hope that we can get more good
keywords.)

Btw, why did you pick "alpha"? Why not "numb"? Or do you think that e.g.
pure symbols should be excluded or have another name?

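To make the range idea concrete, here is a minimal sketch in Python of how a string like the ones above might be expanded into an ordered symbol list and then used to number list items. Nothing here comes from a spec: the function names `expand_alpha` and `marker` are hypothetical, the rule that `x-y` means "every code point from x to y" is only my reading of "Unicode can fill in ranges", and wrapping to "aa" after the last symbol is an assumption modelled on how lower-alpha lists usually continue.

```python
def expand_alpha(spec: str) -> list[str]:
    """Expand a hypothetical alpha() argument, e.g. "a-d,dz,e-h,ch,i-z"."""
    symbols: list[str] = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: a range is exactly one character, a hyphen, one character.
        if len(item) == 3 and item[1] == "-":
            start, end = ord(item[0]), ord(item[2])
            if start > end:
                raise ValueError(f"descending range: {item!r}")
            # Fill in the range by Unicode code point order.
            symbols.extend(chr(cp) for cp in range(start, end + 1))
        else:
            # Anything else is one literal symbol, possibly a digraph like "ch".
            symbols.append(item)
    return symbols


def marker(symbols: list[str], n: int) -> str:
    """Return the marker for the 1-based list item n.

    Assumption: after the last symbol the sequence continues with
    two-symbol markers ("aa", "ab", ...), i.e. bijective base-k numbering.
    """
    if n < 1:
        raise ValueError("list items are numbered from 1")
    k = len(symbols)
    parts: list[str] = []
    while n > 0:
        n, r = divmod(n - 1, k)
        parts.append(symbols[r])
    return "".join(reversed(parts))


slovak = expand_alpha("a-d,dz,e-h,ch,i-z")
solfege = expand_alpha("do,re,mi,fa,so,la,ti")
```

With this reading, the Slovak string gives 28 symbols, and the fifth and tenth list items get the digraphs "dz" and "ch" as markers. Code point ranges only help where the alphabet's order happens to match Unicode order; letters with diacritics would still have to be listed one by one.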
[1] http://en.wikipedia.org/wiki/Regex#Regular_expressions_and_Unicode
[2] http://en.wikipedia.org/wiki/Slovak_alphabet
[3] http://en.wikipedia.org/wiki/Latin-derived_alphabet#Extended_Latin_alphabet
--
leif halvard silli
Received on Friday, 13 February 2009 22:18:50 UTC