- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Wed, 13 Aug 2008 14:57:32 +0200
- To: "Forms WG" <public-forms@w3.org>
- Cc: "Martin Duerst" <duerst@it.aoyama.ac.jp>
John, Martin, Forms WG,

Here is my best try at a replacement text for the Script Tokens section.

I admit upfront: I don't understand this stuff completely, so it has involved some guesswork on my part.

I have doubts about the need for "hanja" and "kanji", but they are in the existing list, so I have left them.

I don't understand why some scripts have a property value alias and others do not, nor whether that field is the best one to choose, but it seems to be the generator for the initial list we have, so I have copied that.

I wonder if we should include the aliases "japanese" and "korean".

The URL for the ISO spec is http://unicode.org/iso15924/iso15924-codes.html

Comments?

Steven

================
E.3.1 Script Tokens

Script tokens provide a general indication of the set of characters that is covered by an input mode. In most cases, script tokens correspond directly to [Unicode Scripts]. However, this means neither that an input mode has to allow input for all the characters in the script, nor that an input mode is limited to characters from that specific script. As an example, a "latin" keyboard doesn't cover all the characters in the Latin script, and includes punctuation which is not assigned to the Latin script.

The script tokens that are allowed are listed in [ISO 15924], "Codes for the representation of names of scripts". The allowable values are those listed in the column "Property Value Alias" with the underscore character (_) removed, and excluding the two values "Common" and "Unknown".

At the time of writing, these values are:

Arabic, Armenian, Balinese, Bengali, Bopomofo, Braille, Buginese, Buhid,
CanadianAboriginal, Carian, Cherokee, Coptic, Cypriot, Cyrillic,
Devanagari, Deseret, Ethiopic, Georgian, Glagolitic, Gothic, Greek,
Gujarati, Gurmukhi, Hangul, Han, Hanunoo, Hebrew, Hiragana,
KatakanaOrHiragana, OldItalic, KayahLi, Katakana, Kharoshthi, Khmer,
Kannada, Lao, Latin, Lepcha, Limbu, LinearB, Lycian, Lydian, Malayalam,
Mongolian, Myanmar, Nko, Ogham, OlChiki, Oriya, Osmanya, PhagsPa,
Phoenician, Rejang, Runic, Saurashtra, Shavian, Sinhala, Sundanese,
SylotiNagri, Syriac, Tagbanwa, TaiLe, NewTaiLue, Tamil, Telugu, Tifinagh,
Tagalog, Thaana, Thai, Tibetan, Ugaritic, Vai, OldPersian, Cuneiform, Yi

Seven other values are allowed:

ipa - International Phonetic Alphabet
hanja - subset of 'han' used in writing Korean
kanji - subset of 'han' used in writing Japanese
math - mathematical symbols and related characters, representing the ISO 15924 code "Zmth"
simplifiedHanzi - representing the ISO 15924 code "Hans"
traditionalHanzi - representing the ISO 15924 code "Hant"
user - special value denoting the 'native' input of the user according to the system environment.

================
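
For illustration only, the derivation rule in the draft above (take the "Property Value Alias" column, remove underscores, exclude "Common" and "Unknown", then add the seven extra values) could be sketched in Python roughly as follows. The sample alias list is a small hand-picked subset, and the function and variable names are hypothetical, not part of any spec. The sketch leaves case untouched, since the draft does not say whether tokens are compared case-sensitively.

```python
# Hand-picked sample of ISO 15924 "Property Value Alias" entries.
# Assumption: the real column is much longer and grows as scripts are added.
SAMPLE_PROPERTY_VALUE_ALIASES = [
    "Arabic",
    "Canadian_Aboriginal",
    "Common",
    "Katakana_Or_Hiragana",
    "Latin",
    "New_Tai_Lue",
    "Unknown",
]

# The two values the draft text excludes.
EXCLUDED = {"Common", "Unknown"}

# The seven additional values the draft text allows.
EXTRA_TOKENS = [
    "ipa", "hanja", "kanji", "math",
    "simplifiedHanzi", "traditionalHanzi", "user",
]


def script_tokens(aliases):
    """Drop the excluded values, then remove underscores from the rest."""
    return [a.replace("_", "") for a in aliases if a not in EXCLUDED]


def allowed_tokens(aliases):
    """Script tokens derived from the aliases, followed by the extra values."""
    return script_tokens(aliases) + EXTRA_TOKENS


if __name__ == "__main__":
    for token in allowed_tokens(SAMPLE_PROPERTY_VALUE_ALIASES):
        print(token)
```

Generating the list this way, rather than hard-coding it, would keep the allowed values in step with the ISO table, which is presumably the point of defining them by reference to the column.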
Received on Wednesday, 13 August 2008 12:58:09 UTC