- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 16 May 2007 19:20:07 +0900
- To: Eric Prud'hommeaux <eric@w3.org>
- CC: public-i18n-core@w3.org
Hi Eric,
Eric Prud'hommeaux wrote:
> There's a sorting example in the editor's draft of SPARQL Query
> http://www.w3.org/2001/sw/DataAccess/rq23/rq25#modOrderBy
> RDF Term Reason
> Unbound results sort earliest.
> _:z Blank nodes follow unbound.
> _:a There is no relative ordering of blank
> nodes.
> <http://script.example/Latin> IRIs follow blank nodes.
> <http://script.example/Кириллица> The character in the 23rd position,
> "К", has a unicode codepoint 0x41A,
> which is higher than 0x4C ("L").
> <http://script.example/日本語> The character in the 23rd position,
> "日", has a unicode codepoint 0x65E5,
> which is higher than 0x41A ("К").
> "http://script.example/Latin" Simple literals follow IRIs.
> "http://script.example/Latin"^^xsd:string xsd:strings follow simple literals.
>
> which is meant to illustrate codepoint ordering of IRIs. I don't
> believe 日本語 identifies a script. Should I use 漢字 (0x6F22) instead?
> Is Кириллица correct? Would something from Byzantine Musical
> Symbols speak to a wider audience?
>
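The codepoint comparison in the quoted table can be checked with a short Python sketch. The three IRIs are the ones from the draft example. Using Python's built-in string comparison as a stand-in for the draft's IRI ordering is an assumption made here for illustration, not something the draft prescribes.

```python
# IRIs taken from the quoted SPARQL draft example, deliberately out of order.
iris = [
    "http://script.example/日本語",
    "http://script.example/Latin",
    "http://script.example/Кириллица",
]

# Python compares str values codepoint by codepoint, so sorted() yields
# a plain codepoint ordering (assumed to match the ordering in the table).
ordered = sorted(iris)

# "http://script.example/" is 22 characters long, so the first differing
# character is at 0-based index 22, i.e. the 23rd position.
position = 22
codepoints = [hex(ord(iri[position])) for iri in ordered]

for iri, cp in zip(ordered, codepoints):
    print(cp, iri)
```

The loop prints each IRI next to the codepoint of its 23rd character, which makes it easy to compare against the values quoted in the table.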
If you need script identifiers, look first at
http://www.iana.org/assignments/language-subtag-registry . You will find:
%%
Type: script
Subtag: Jpan
Description: Japanese (alias for Han + Hiragana + Katakana)
Added: 2006-07-21
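As a rough illustration of how such a record could be read programmatically, here is a minimal Python sketch. The helper name `parse_record` is hypothetical, the record text is the one shown above, and the sketch assumes one "Key: value" field per line (it does not handle fields folded across several lines).

```python
# The record exactly as quoted above, including the "%%" separator line.
record_text = """\
%%
Type: script
Subtag: Jpan
Description: Japanese (alias for Han + Hiragana + Katakana)
Added: 2006-07-21
"""


def parse_record(text):
    """Hypothetical helper: turn one registry record into a dict.

    Assumes every field fits on a single 'Key: value' line.
    """
    fields = {}
    for line in text.splitlines():
        # Skip the record separator and any blank lines.
        if line == "%%" or not line.strip():
            continue
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields


record = parse_record(record_text)
print(record["Subtag"], "-", record["Description"])
```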
If you need a non-Latin (i.e. localized) identifier for scripts, you
might look into CLDR, see http://unicode.org/cldr/repository_access.html
and http://unicode.org/Public/cldr/1.4.1/core.zip . The CLDR data has
localized versions of script identifiers, e.g. <script type="Latn">ラテン文字</script>
for the Latin script identified in the Japanese locale.
There is no localized identifier yet for "Jpan", though.
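A lookup of localized script names could be sketched in Python as follows. The XML fragment is a hand-written stand-in built around the single element quoted above, not an excerpt copied from core.zip, and the enclosing `<scripts>` wrapper is an assumption made so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the Japanese locale data (assumed structure).
fragment = """<scripts>
  <script type="Latn">ラテン文字</script>
</scripts>"""

root = ET.fromstring(fragment)

# Map each script subtag to its localized name.
names = {el.get("type"): el.text for el in root.iter("script")}

# "Jpan" has no localized name in this fragment, so the lookup yields None.
jpan = names.get("Jpan")

print(names["Latn"], jpan)
```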
Felix
Received on Wednesday, 16 May 2007 10:20:26 UTC