- From: John Cowan <cowan@ccil.org>
- Date: Wed, 4 Jan 2006 12:43:11 -0500
- To: Jeremy Carroll <jjc@hpl.hp.com>
- Cc: "www-international@w3.org" <www-international@w3.org>
Jeremy Carroll scripsit:
> On A, could someone articulate that in an automatable fashion please.
> e.g. use such and such a table from unicode.org, and for each IRI
> component map each character to its script code, and then the component
> is OK if the set of script codes used is either a singleton set, or the
> set { hiragana, kanji, katakana } or { ... }.
See http://www.unicode.org/Public/UNIDATA/Scripts.txt for the mapping, and
note the special status of Common and Inherited, which may be mixed with
any script.
There are a few other languages besides Japanese which require script-mixing,
notably Kurdish (in Cyrillic/Latin script) and Wakhi.
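
The check Jeremy describes could be sketched roughly as follows, assuming Python. This is only an illustration, not a vetted implementation: `SAMPLE_SCRIPTS_TXT` is a simplified hand-written sample in the same `range ; Script # comment` layout as Scripts.txt (not the real file contents), and the names `parse_scripts`, `script_of`, `component_ok` and `ALLOWED_MIXES` are hypothetical. Note that Scripts.txt calls kanji "Han". Only the Japanese combination is listed as an allowed mix here; any further sets (for example for Kurdish or Wakhi) would be a policy decision and are left to the caller.

```python
import re

# Simplified sample in Scripts.txt layout (an assumption for illustration,
# NOT the real file; load the full Scripts.txt for actual use).
SAMPLE_SCRIPTS_TXT = """\
# code point or range ; script name # comment
002D          ; Common # HYPHEN-MINUS
0030..0039    ; Common # DIGIT ZERO..DIGIT NINE
0041..005A    ; Latin # LATIN CAPITAL LETTER A..Z
0061..007A    ; Latin # LATIN SMALL LETTER A..Z
0300..036F    ; Inherited # combining diacritical marks
0400..0481    ; Cyrillic
3041..3096    ; Hiragana
30A1..30FA    ; Katakana
4E00..9FA5    ; Han
"""

# Matches "XXXX ; Name" and "XXXX..YYYY ; Name"; comment lines do not match.
LINE_RE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_scripts(text):
    """Return a list of (low, high, script_name) tuples from Scripts.txt text."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            low = int(m.group(1), 16)
            high = int(m.group(2) or m.group(1), 16)
            ranges.append((low, high, m.group(3)))
    return ranges


def script_of(ch, ranges):
    """Look up the script of one character; unlisted code points are Unknown."""
    cp = ord(ch)
    for low, high, name in ranges:
        if low <= cp <= high:
            return name
    return "Unknown"


# Common and Inherited may be mixed with any script, so they are ignored.
NEUTRAL = {"Common", "Inherited"}

# Script combinations accepted in a single component (policy assumption).
ALLOWED_MIXES = [frozenset({"Hiragana", "Katakana", "Han"})]


def component_ok(component, ranges, allowed_mixes=ALLOWED_MIXES):
    """True if the component uses one script or one of the allowed mixes."""
    scripts = {script_of(ch, ranges) for ch in component} - NEUTRAL
    if "Unknown" in scripts:
        return False  # conservative choice: reject unclassified characters
    if len(scripts) <= 1:
        return True
    return any(scripts <= mix for mix in allowed_mixes)
```

A linear scan over the ranges is enough for a sketch; with the full table, a sorted list searched with `bisect` would be the more usual choice. Usage would look like `component_ok(label, parse_scripts(SAMPLE_SCRIPTS_TXT))`, applied to each IRI component in turn.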
--
John Cowan  cowan@ccil.org  http://www.ccil.org/~cowan
As we all know, civil libertarians are not the friskiest group around --
comes from forever being on the qui vive for the sound of jack-booted
fascism coming down the pike.  --Molly Ivins
Received on Wednesday, 4 January 2006 17:43:29 UTC