- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 16 May 2007 19:20:07 +0900
- To: Eric Prud'hommeaux <eric@w3.org>
- CC: public-i18n-core@w3.org
Hi Eric,

Eric Prud'hommeaux wrote:
> There's a sorting example in the editor's draft of SPARQL Query
> http://www.w3.org/2001/sw/DataAccess/rq23/rq25#modOrderBy
>
> RDF Term                                  Reason
>                                           Unbound results sort earliest.
> _:z                                       Blank nodes follow unbound.
> _:a                                       There is no relative ordering of blank
>                                           nodes.
> <http://script.example/Latin>             IRIs follow blank nodes.
> <http://script.example/Кириллица>         The character in the 23rd position,
>                                           "К", has a unicode codepoint 0x41A,
>                                           which is higher than 0x4C ("L").
> <http://script.example/日本語>            The character in the 23rd position,
>                                           "日", has a unicode codepoint 0x65E5,
>                                           which is higher than 0x41A ("К").
> "http://script.example/Latin"             Simple literals follow IRIs.
> "http://script.example/Latin"^^xsd:string xsd:strings follow simple literals.
>
> which is meant to illustrate codepoint ordering of IRIs. I don't
> believe 日本語 identifies a script. Should I use 漢字 (0x6F22) instead?
> Is Кириллица correct? Would something from Byzantine Musical
> Symbols speak to a wider audience?
>

If you need script identifiers, look first at
http://www.iana.org/assignments/language-subtag-registry . You will find:

%%
Type: script
Subtag: Jpan
Description: Japanese (alias for Han + Hiragana + Katakana)
Added: 2006-07-21

If you need a non-Latin (i.e. localized) identifier of scripts, you might
look into CLDR, see http://unicode.org/cldr/repository_access.html and
http://unicode.org/Public/cldr/1.4.1/core.zip . The CLDR data has localized
versions of script identifiers, e.g.

<script type="Latn">ラテン文字</script>

for the Latin script identified in the Japanese locale. There is no
localized identifier yet for "Jpan", though.

Felix
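
The code point comparisons in the quoted table can be checked with a short sketch. It is written in Python rather than SPARQL, on the assumption that Python's default string comparison (which compares code point by code point) is an adequate stand-in for the ordering the draft describes; the variable names are illustrative only.

```python
# The three IRIs from the quoted example, deliberately listed out of order.
iris = [
    "http://script.example/日本語",
    "http://script.example/Кириллица",
    "http://script.example/Latin",
]

# Python compares str values code point by code point, so a plain sort
# yields a code point ordering (assumed here to mirror the draft's intent).
ordered = sorted(iris)

# The strings first differ at the 23rd character (zero-based index 22),
# directly after "http://script.example/".
position = 22
codepoints = [hex(ord(iri[position])) for iri in ordered]

for iri, cp in zip(ordered, codepoints):
    print(iri, cp)
```

The sort places the Latin IRI first, then the Cyrillic one, then the Japanese one, matching the order given in the table.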
Received on Wednesday, 16 May 2007 10:20:26 UTC