- From: spemberton via GitHub <sysbot+gh@w3.org>
- Date: Mon, 11 Jul 2016 13:06:32 +0000
- To: www-international@w3.org
spemberton has just created a new issue for https://github.com/w3c/bp-i18n-specdev:

== Should include advice on specifying what a letter is. ==

Several specifications define "names". As one example, XML says (https://www.w3.org/TR/REC-xml/#NT-Nmtoken):

NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]

It is really not clear where these lists of characters come from, or why some characters are acceptable as name characters and others are not.

Unicode has the concept of 'category values' (http://www.unicode.org/reports/tr44/#General_Category_Values) that classify characters as, for instance, "Uppercase_Letter", "Lowercase_Letter", etc. It seems to me that it would be good advice for specification writers to use the Unicode Category Values as the basis for defining (amongst other things) names, rather than apparently randomly chosen lists of character numbers.

See https://github.com/w3c/bp-i18n-specdev/issues/16

Please do NOT reply to this email. If you'd like to contribute to the discussion, please do so at the above link. You will need to subscribe yourself to the issue (using the button provided by that page) to receive notifications of further comments.
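
As an illustration of the category-based approach suggested in the issue, here is a minimal Python sketch that defines a "name" in terms of Unicode General_Category values instead of explicit code point ranges. The two category sets and the is_name helper are assumptions made up for this example; they are not taken from XML or from any other specification, and a real specification would need to choose its categories deliberately.

```python
import unicodedata

# Hypothetical choice of categories, for illustration only:
# letters (Lu, Ll, Lt, Lm, Lo) and letter numbers (Nl) may start a name.
NAME_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}

# Later characters may also be combining marks (Mn, Mc), decimal
# digits (Nd), or connector punctuation such as "_" (Pc).
NAME_CONTINUE_CATEGORIES = NAME_START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_name(s: str) -> bool:
    """Return True if s is a name under the hypothetical rules above."""
    if not s:
        return False
    # The first character must belong to a "start" category.
    if unicodedata.category(s[0]) not in NAME_START_CATEGORIES:
        return False
    # Every remaining character must belong to a "continue" category.
    return all(
        unicodedata.category(ch) in NAME_CONTINUE_CATEGORIES for ch in s[1:]
    )
```

Under these assumed rules, is_name("a_b") and is_name("名前") hold, while is_name("1abc") and is_name("a-b") do not, because a digit cannot start a name and "-" is dash punctuation (Pd). The results depend on the version of the Unicode Character Database bundled with the Python interpreter, which is itself a point a specification would have to address.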
Received on Monday, 11 July 2016 13:06:41 UTC