[bp-i18n-specdev] Should include advice on specifying what a letter is. from spemberton via GitHub on 2016-07-11 (public-i18n-archive@w3.org from July to September 2016)

From: spemberton via GitHub <sysbot+gh@w3.org>
Date: Mon, 11 Jul 2016 13:06:32 +0000
To: public-i18n-archive@w3.org
Message-ID: <issues.opened-164832465-1468242390-sysbot+gh@w3.org>

spemberton has just created a new issue for 
https://github.com/w3c/bp-i18n-specdev:

== Should include advice on specifying what a letter is. ==
Several specifications define "names". As one example, XML says 
(https://www.w3.org/TR/REC-xml/#NT-Nmtoken)

NameStartChar ::= ":" | [A-Z] | "_" | [a-z]
                | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
                | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D]
                | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF]
                | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

NameChar      ::= NameStartChar | "-" | "." | [0-9] | #xB7
                | [#x0300-#x036F] | [#x203F-#x2040]
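
To make the shape of these productions concrete, here is a rough Python
transcription of the two rules quoted above. The helper names
(`is_name_start_char`, `is_name_char`) are made up for this illustration
and are not part of any XML library; the ranges are copied from the
quoted productions.

```python
# Code point ranges transcribed from the NameStartChar production above.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # [A-Z]
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # [a-z]
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Extra ranges that NameChar allows on top of NameStartChar.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2D),        # "-"
    (0x2E, 0x2E),        # "."
    (0x30, 0x39),        # [0-9]
    (0xB7, 0xB7),        # #xB7
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    """Return True if the single character ch falls in any (low, high) range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


def is_name_start_char(ch):
    """Hypothetical helper: does ch match the quoted NameStartChar rule?"""
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    """Hypothetical helper: does ch match the quoted NameChar rule?"""
    return is_name_start_char(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)
```

Written this way, the gaps are easy to see: U+00D7 and U+00F7, for
example, fall between the listed ranges and are rejected, but nothing in
the table itself says why.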

It is really not clear where these lists of characters come from, or 
why some characters are acceptable as name characters and others are 
not.

Unicode has the concept of 'General Category values' 
(http://www.unicode.org/reports/tr44/#General_Category_Values), which 
classify characters as, for instance, "Uppercase_Letter", 
"Lowercase_Letter", etc.

It seems to me that it would be good advice for specification writers 
to use the Unicode Category Values as a basis for defining (amongst 
other things) names, rather than apparently randomly chosen lists of 
character numbers.
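
As a minimal sketch of what a category-based definition could look like,
the following uses Python's standard `unicodedata` module, whose
`category()` function returns the two-letter General Category
abbreviation for a character. Which categories to allow is an assumption
made purely for illustration (letters and letter numbers to start a
name; marks, decimal digits and connector punctuation afterwards), and
`is_name` is a hypothetical helper, not something any specification
defines.

```python
import unicodedata

# Assumed for illustration: categories allowed at the start of a name
# (the letter categories plus Letter_Number).
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}

# Assumed for illustration: later characters may also be nonspacing or
# spacing marks, decimal digits, or connector punctuation such as "_".
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_name(text):
    """Hypothetical check: is text a name under the assumed category sets?"""
    if not text:
        return False
    if unicodedata.category(text[0]) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(ch) in CONTINUE_CATEGORIES
               for ch in text[1:])
```

With these sets, `is_name("x2")` is accepted and `is_name("2fast")` is 
rejected, and the reason can be stated in terms of categories rather 
than code point numbers.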

Please view or discuss this issue at 
https://github.com/w3c/bp-i18n-specdev/issues/16 using your GitHub 
account

Received on Monday, 11 July 2016 13:06:41 UTC