Re: name of the japanese script from Felix Sasaki on 2007-05-16 (public-i18n-core@w3.org from April to June 2007)

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 16 May 2007 19:20:07 +0900
To: Eric Prud'hommeaux <eric@w3.org>
CC: public-i18n-core@w3.org
Message-ID: <464ADAD7.4010205@w3.org>

Hi Eric,

Eric Prud'hommeaux wrote:
> There's a sorting example in the editor's draft of SPARQL Query
>   http://www.w3.org/2001/sw/DataAccess/rq23/rq25#modOrderBy
> RDF Term      Reason
>        Unbound results sort earliest.
> _:z       Blank nodes follow unbound.
> _:a       There is no relative ordering of blank
>          nodes.
> <http://script.example/Latin>    IRIs follow blank nodes.
> <http://script.example/Кириллица>   The character in the 23rd position,
>          "К", has a unicode codepoint 0x41A,
>          which is higher than 0x4C ("L").
> <http://script.example/日本語>    The character in the 23rd position,
>          "日", has a unicode codepoint 0x65E5,
>          which is higher than 0x41A ("К").
> "http://script.example/Latin"    Simple literals follow IRIs.
> "http://script.example/Latin"^^xsd:string xsd:strings follow simple literals.
>
> which is meant to illustrate codepoint ordering of IRIs. I don't
> believe 日本語 identifies a script. Should I use 漢字 (0x6F22) instead?
> Is Кириллица correct? Would something from Byzantine Musical
> Symbols speak to a wider audience?
>   
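
The code point comparison in the quoted table can be checked with a few
lines of Python. This is only a sketch: it takes the three IRIs from the
table as given and relies on Python comparing strings code point by code
point; the variable names are made up for illustration.

```python
# IRIs from the quoted table, deliberately out of order.
iris = [
    "http://script.example/日本語",
    "http://script.example/Latin",
    "http://script.example/Кириллица",
]

# Python compares str values code point by code point, which is the
# ordering the example is meant to illustrate.
ordered = sorted(iris)

# The shared prefix is 22 characters long, so index 22 is the 23rd character.
prefix_len = len("http://script.example/")
for iri in ordered:
    ch = iri[prefix_len]
    print(f"{ch} U+{ord(ch):04X} {iri}")

# The candidate replacement from the question, for comparison.
kanji_first = f"U+{ord('漢'):04X}"
```

Substituting 漢字 would not change the ordering, since U+6F22 also sorts
after U+041A.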

If you need script identifiers, look first at 
http://www.iana.org/assignments/language-subtag-registry . You will find:
%%
Type: script
Subtag: Jpan
Description: Japanese (alias for Han + Hiragana + Katakana)
Added: 2006-07-21
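
If you want to pull script subtags out of the registry programmatically,
something like the following might do. It is a rough sketch, not a full
parser: it assumes records are separated by "%%" lines, that fields look
like "Name: value", and that an indented line continues the previous
field. SAMPLE and parse_records are hypothetical names, and the sample
holds only the record shown above.

```python
SAMPLE = """\
%%
Type: script
Subtag: Jpan
Description: Japanese (alias for Han + Hiragana + Katakana)
Added: 2006-07-21
"""

def parse_records(text):
    """Split registry text on '%%' lines and return one dict per record."""
    records, current, last_key = [], {}, None
    for line in text.splitlines():
        if line.strip() == "%%":
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[:1].isspace() and last_key:
            # Assumption: an indented line continues the previous field.
            current[last_key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            # Simplification: a repeated field name overwrites the earlier value.
            current[last_key] = value.strip()
    if current:
        records.append(current)
    return records

# Map each script subtag to its description.
scripts = {
    r["Subtag"]: r["Description"]
    for r in parse_records(SAMPLE)
    if r.get("Type") == "script"
}
```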

If you need a non-Latin (i.e. localized) identifier for a script, you
might look into CLDR, see http://unicode.org/cldr/repository_access.html
and http://unicode.org/Public/cldr/1.4.1/core.zip . The CLDR data has
localized versions of script identifiers, e.g.
<script type="Latn">ラテン文字</script> for the Latin script as named in
the Japanese locale. There is no localized identifier yet for "Jpan",
though.
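
Looking up such a localized name could be sketched as below. The inline
SAMPLE_JA fragment is an assumption that mirrors the usual layout of a
CLDR locale file (ldml / localeDisplayNames / scripts / script); it is
not copied from core.zip, and localized_script_name is a hypothetical
helper.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the relevant part of a Japanese locale file.
SAMPLE_JA = """\
<ldml>
  <localeDisplayNames>
    <scripts>
      <script type="Latn">ラテン文字</script>
    </scripts>
  </localeDisplayNames>
</ldml>
"""

def localized_script_name(ldml_text, script_code):
    """Return the localized name for script_code, or None if it is absent."""
    root = ET.fromstring(ldml_text)
    for el in root.iter("script"):
        if el.get("type") == script_code:
            return el.text
    return None

print(localized_script_name(SAMPLE_JA, "Latn"))
print(localized_script_name(SAMPLE_JA, "Jpan"))
```

With this sample the second lookup returns None, matching the missing
"Jpan" entry.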

Felix

Received on Wednesday, 16 May 2007 10:20:26 UTC