- From: spemberton via GitHub <sysbot+gh@w3.org>
- Date: Mon, 11 Jul 2016 13:06:32 +0000
- To: public-i18n-archive@w3.org
spemberton has just created a new issue for https://github.com/w3c/bp-i18n-specdev:

== Should include advice on specifying what a letter is. ==

Several specifications define "names". As one example, XML says (https://www.w3.org/TR/REC-xml/#NT-Nmtoken):

    NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6]
                      | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF]
                      | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF]
                      | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD]
                      | [#x10000-#xEFFFF]
    NameChar      ::= NameStartChar | "-" | "." | [0-9] | #xB7
                      | [#x0300-#x036F] | [#x203F-#x2040]

It is really not clear where these lists of characters come from, or why some characters are acceptable as name characters and others are not.

Unicode has the concept of General Category values (http://www.unicode.org/reports/tr44/#General_Category_Values), which classify characters as, for instance, "Uppercase_Letter", "Lowercase_Letter", etc.

It seems to me that it would be good advice for specification writers to use the Unicode General Category values as the basis for defining (amongst other things) names, rather than apparently randomly chosen lists of character numbers.

Please view or discuss this issue at https://github.com/w3c/bp-i18n-specdev/issues/16 using your GitHub account
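
To illustrate the contrast drawn in the issue above, here is a minimal Python sketch that compares the XML NameStartChar ranges with a test based on the Unicode General Category. The function names `is_xml_name_start` and `is_letter_by_category` are hypothetical and chosen only for this example, and treating "letter" as any category beginning with "L" is an assumption, not something the issue or any specification prescribes. The results also depend on the Unicode version shipped with the Python interpreter.

```python
import unicodedata

# NameStartChar ranges, copied from the XML production quoted in the issue.
XML_NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def is_xml_name_start(ch: str) -> bool:
    """Return True if ch falls in one of the hard-coded XML ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML_NAME_START_RANGES)


def is_letter_by_category(ch: str) -> bool:
    """Return True if the General Category of ch is a letter (Lu, Ll, Lt, Lm, Lo).

    Assumption for this sketch: "letter" means any category starting with "L".
    """
    return unicodedata.category(ch).startswith("L")


if __name__ == "__main__":
    # A few code points where the two approaches can be compared.
    samples = ["A", "_", "\u00b5", "\u00d7", "\u2070"]
    for ch in samples:
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "?"),
            unicodedata.category(ch),
            "xml_name_start=" + str(is_xml_name_start(ch)),
            "letter=" + str(is_letter_by_category(ch)),
        )
```

The two tests disagree in both directions: U+00B5 MICRO SIGN has a letter category but lies outside the XML ranges, while U+2070 SUPERSCRIPT ZERO lies inside them without being a letter. A category-based definition makes the intent explicit, at the cost of tying the set of accepted characters to a particular Unicode version.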
Received on Monday, 11 July 2016 13:06:41 UTC