
Re: question about IRI spec

From: John Cowan <cowan@ccil.org>
Date: Wed, 4 Jan 2006 12:43:11 -0500
To: Jeremy Carroll <jjc@hpl.hp.com>
Cc: "www-international@w3.org" <www-international@w3.org>
Message-ID: <20060104174311.GK26883@ccil.org>

Jeremy Carroll scripsit:

> On A, could someone articulate that in an automatable fashion please.
> e.g. use such and such a table from unicode.org, and for each IRI 
> component map each character to its script code, and then the component 
> is OK if the set of script codes used is either a singleton set, or the 
> set { hiragana, kanji, katakana } or { ... }.

See http://www.unicode.org/Public/UNIDATA/Scripts.txt for the mapping, and
note the special status of Common and Inherited, which may be mixed with
any script.

There are a few other languages besides Japanese which require script-mixing,
notably Kurdish (in Cyrillic/Latin script) and Wakhi.
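
A minimal sketch of the check described above, in Python. It assumes the
"range ; Script # comment" line format used by Scripts.txt. The names
parse_scripts, component_ok, NEUTRAL and ALLOWED_MIXES are made up for
illustration, and the allowed-mix table holds only the Japanese set; any
others would have to be added by hand. Rejecting characters that have no
entry in the table is also my own choice, not something the data file
prescribes. The SAMPLE lines are hand-written in the file's format, not a
verbatim excerpt.

```python
import re
from bisect import bisect_right

# Data lines look like "0041..005A    ; Latin # comment" or "002D ; Common # ...".
_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")

def parse_scripts(lines):
    """Return a sorted list of (first, last, script) code point ranges."""
    ranges = []
    for line in lines:
        m = _LINE.match(line)
        if m:  # comment and blank lines do not match and are skipped
            first = int(m.group(1), 16)
            last = int(m.group(2) or m.group(1), 16)
            ranges.append((first, last, m.group(3)))
    ranges.sort()
    return ranges

def script_of(ch, ranges, starts):
    """Look up the script of one character by binary search over the ranges."""
    cp = ord(ch)
    i = bisect_right(starts, cp) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return "Unknown"

# Common and Inherited may be mixed with any script.
NEUTRAL = {"Common", "Inherited"}

# Assumption: only the Japanese combination is listed here.
ALLOWED_MIXES = [frozenset({"Han", "Hiragana", "Katakana"})]

def component_ok(component, ranges):
    """True if the component uses one script, or one permitted mix of scripts."""
    starts = [r[0] for r in ranges]
    scripts = {script_of(c, ranges, starts) for c in component} - NEUTRAL
    if "Unknown" in scripts:
        return False  # assumption: characters missing from the table are rejected
    if len(scripts) <= 1:
        return True
    return any(scripts <= mix for mix in ALLOWED_MIXES)

# Hand-written sample lines in the Scripts.txt format (illustrative only).
SAMPLE = [
    "# comment line",
    "002D          ; Common # Pd HYPHEN-MINUS",
    "0030..0039    ; Common # Nd",
    "0041..005A    ; Latin # L&",
    "0061..007A    ; Latin # L&",
    "0300..036F    ; Inherited # Mn",
    "0410..044F    ; Cyrillic # L&",
    "3041..3096    ; Hiragana # Lo",
    "30A1..30FA    ; Katakana # Lo",
    "4E00..9FFF    ; Han # Lo",
]

ranges = parse_scripts(SAMPLE)
print(component_ok("abc-123", ranges))          # Latin plus Common
print(component_ok("a\u0431\u0432", ranges))    # Latin mixed with Cyrillic
```

With the real file, the list passed to parse_scripts would be the lines of
a local copy of Scripts.txt instead of SAMPLE.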

-- 
As we all know, civil libertarians are not      John Cowan
the friskiest group around -- comes from        cowan@ccil.org
forever being on the qui vive for the sound     http://www.ccil.org/~cowan
of jack-booted fascism coming down the pike.           --Molly Ivins
Received on Wednesday, 4 January 2006 17:43:29 GMT
