Jeremy Carroll scripsit:

> On A, could someone articulate that in an automatable fashion please.
> e.g. use such and such a table from unicode.org, and for each IRI
> component map each character to its script code, and then the component
> is OK if the set of script codes used is either a singleton set, or the
> set { hiragana, kanji, katakana } or { ... }.

See http://www.unicode.org/Public/UNIDATA/Scripts.txt for the mapping,
and note the special status of Common and Inherited, which may be mixed
with any script.

There are a few other languages besides Japanese which require
script-mixing, notably Kurdish (in Cyrillic/Latin script) and Wakhi.

-- 
As we all know, civil libertarians are not      John Cowan
the friskiest group around -- comes from        cowan@ccil.org
forever being on the qui vive for the sound     http://www.ccil.org/~cowan
of jack-booted fascism coming down the pike.  --Molly Ivins

Received on Wednesday, 4 January 2006 17:43:29 GMT
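
The check discussed in this message (map each character of an IRI component to its script via Scripts.txt, ignore Common and Inherited, then accept a singleton set or a permitted mix) might be sketched roughly as below. This is only an illustration under several assumptions, none of which come from the thread: the embedded table is a tiny simplified excerpt in the Scripts.txt line format rather than the real file; the names `parse_scripts`, `script_of`, `scripts_used`, `component_ok` and `ALLOWED_MIXES` are hypothetical; subsets of a permitted mix are accepted; and characters missing from the table are rejected. Note that Scripts.txt calls the kanji script "Han".

```python
import bisect
import re

# Simplified excerpt in the Scripts.txt line format (illustrative only;
# a real check would load the full file from unicode.org).
SAMPLE_SCRIPTS_TXT = """\
002D          ; Common # HYPHEN-MINUS
0030..0039    ; Common # DIGIT ZERO..DIGIT NINE
0041..005A    ; Latin # LATIN CAPITAL LETTER A..Z
0061..007A    ; Latin # LATIN SMALL LETTER A..Z
0300..036F    ; Inherited # combining diacritical marks
0400..0481    ; Cyrillic
3041..3096    ; Hiragana
30A1..30FA    ; Katakana
4E00..9FFF    ; Han
"""

# Matches "XXXX ; Script" or "XXXX..YYYY ; Script"; anything after is ignored.
LINE_RE = re.compile(
    r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)"
)

# Scripts that may be mixed with any other script.
NEUTRAL = frozenset({"Common", "Inherited"})

# Permitted script mixes. Only the Japanese set is spelled out in the
# message; sets for Kurdish, Wakhi, etc. would have to be added here.
ALLOWED_MIXES = (frozenset({"Hiragana", "Katakana", "Han"}),)


def parse_scripts(text):
    """Return a sorted list of (start, end, script) code point ranges."""
    ranges = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16) if match.group(2) else start
            ranges.append((start, end, match.group(3)))
    ranges.sort()
    return ranges


TABLE = parse_scripts(SAMPLE_SCRIPTS_TXT)
STARTS = [start for start, _end, _script in TABLE]


def script_of(char):
    """Look up the script of one character by binary search over TABLE."""
    cp = ord(char)
    i = bisect.bisect_right(STARTS, cp) - 1
    if i >= 0 and TABLE[i][0] <= cp <= TABLE[i][1]:
        return TABLE[i][2]
    return "Unknown"  # not covered by the (excerpted) table


def scripts_used(component):
    """Set of scripts in the component, leaving out Common and Inherited."""
    return {script_of(ch) for ch in component} - NEUTRAL


def component_ok(component):
    """True if the component uses one script or a permitted mix."""
    used = scripts_used(component)
    if "Unknown" in used:  # assumption: reject characters we cannot classify
        return False
    return len(used) <= 1 or any(used <= mix for mix in ALLOWED_MIXES)
```

With this excerpt, `component_ok("example-123")` accepts a Latin label containing Common characters, while `component_ok("p\u0430ypal")` rejects a Latin label with a Cyrillic letter mixed in.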