Re: question about IRI spec

Jeremy Carroll scripsit:

> On A, could someone articulate that in an automatable fashion please.
> e.g. use such and such a table from unicode.org, and for each IRI 
> component map each character to its script code, and then the component 
> is OK if the set of script codes used is either a singleton set, or the 
> set { hiragana, kanji, katakana } or { ... }.

See http://www.unicode.org/Public/UNIDATA/Scripts.txt for the mapping from
characters to script codes, and note the special status of Common and
Inherited: characters with either value may be mixed with any script.

There are a few other languages besides Japanese that require script-mixing,
notably Kurdish (written in mixed Cyrillic/Latin script) and Wakhi.
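
For what it's worth, here is a rough sketch in Python of the check described
above. It is only an illustration under stated assumptions, not a vetted
implementation: the SAMPLE table is a simplified hand-made excerpt in the
Scripts.txt line format (a real check would load the full file), and the
names parse_scripts, script_of, component_ok, NEUTRAL and ALLOWED_MIXES are
all made up for this example. The list of permitted mixes holds only the
Japanese set; whether to add entries for other languages is a policy
decision the sketch does not settle.

```python
import re

# Simplified excerpt in the Scripts.txt line format (illustrative only;
# the real file has many more ranges and trailing comments on each line).
SAMPLE = """\
0030..0039    ; Common
0041..005A    ; Latin
0061..007A    ; Latin
0300..0301    ; Inherited
0410..044F    ; Cyrillic
3041..3096    ; Hiragana
30A1..30FA    ; Katakana
4E00..9FA5    ; Han
"""

# Matches "XXXX ; Script" or "XXXX..YYYY ; Script"; comment lines do not match.
LINE_RE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")


def parse_scripts(text):
    """Return a list of (low, high, script_name) tuples from Scripts.txt-style text."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            low = int(m.group(1), 16)
            high = int(m.group(2) or m.group(1), 16)
            ranges.append((low, high, m.group(3)))
    return ranges


def script_of(ch, ranges):
    """Look up the script of one character; 'Unknown' if no range covers it."""
    cp = ord(ch)
    for low, high, name in ranges:
        if low <= cp <= high:
            return name
    return "Unknown"


# Common and Inherited may be mixed with any script, so they are ignored.
NEUTRAL = {"Common", "Inherited"}

# Hypothetical whitelist of script sets that may legitimately co-occur.
ALLOWED_MIXES = [frozenset({"Hiragana", "Katakana", "Han"})]


def component_ok(component, ranges):
    """True if the component uses at most one script, or a whitelisted mix."""
    scripts = {script_of(c, ranges) for c in component} - NEUTRAL
    if len(scripts) <= 1:
        return True
    return any(scripts <= mix for mix in ALLOWED_MIXES)


RANGES = parse_scripts(SAMPLE)
```

With this table, component_ok("abc123", RANGES) passes because the digits are
Common, while a Latin string containing a single Cyrillic letter is rejected.
Note that characters outside the table are lumped together as "Unknown",
which is an artifact of the tiny sample rather than a recommendation.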

-- 
As we all know, civil libertarians are not      John Cowan
the friskiest group around -- comes from        cowan@ccil.org
forever being on the qui vive for the sound     http://www.ccil.org/~cowan
of jack-booted fascism coming down the pike.           --Molly Ivins

Received on Wednesday, 4 January 2006 17:43:29 UTC