I18N-ISSUE-187: escape syntax [TURTLE]

From: Internationalization Core Working Group Issue Tracker <sysbot+tracker@w3.org>
Date: Fri, 07 Sep 2012 15:46:40 +0000
Message-Id: <E1TA0lU-0000wk-Hx@tibor.w3.org>
To: www-international@w3.org, public-rdf-comments@w3.org

http://www.w3.org/International/track/issues/187

Raised by: Addison Phillips
On product: TURTLE

Section 6.4. The \u (lowercase u) syntax allows:

<q>
A Unicode codepoint in the range U+0 to U+FFFF inclusive corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit.
</q>

This is probably wrong, because the surrogate code points (U+D800 to U+DFFF) fall within this range, and the text makes no mention of how surrogates or surrogate pairs are to be handled.
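
To illustrate the concern, here is a minimal sketch of a decoder for the four hex digits of a \u escape that refuses surrogate code points. The function name `decode_u_escape` and the choice to raise an error are assumptions made for illustration; the Turtle draft does not say what a processor should do in this case.

```python
import string

SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def decode_u_escape(hex_digits: str) -> str:
    # Hypothetical helper: takes the four hex digits that follow a
    # backslash and lowercase u, and returns the single character.
    if len(hex_digits) != 4 or any(c not in string.hexdigits for c in hex_digits):
        raise ValueError("expected exactly four hexadecimal digits")
    code_point = int(hex_digits, 16)
    # Surrogates are code points but not Unicode scalar values, so they
    # do not denote a character on their own. Rejecting them is an
    # assumption of this sketch, not a rule taken from the draft.
    if SURROGATE_FIRST <= code_point <= SURROGATE_LAST:
        raise ValueError(f"U+{code_point:04X} is a surrogate code point")
    return chr(code_point)
```

With this sketch, `decode_u_escape("00E9")` returns the character U+00E9, while `decode_u_escape("D800")` raises `ValueError`.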

It's not clear why the \U form should take eight hex digits when the first two are required to be 0.
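
The arithmetic behind this point can be checked quickly: the highest Unicode code point is U+10FFFF, which fits in six hex digits, so an eight-digit form always begins with "00". The variable names below are illustrative only.

```python
# The largest code point Unicode defines.
MAX_CODE_POINT = 0x10FFFF

# Rendered with eight hex digits, as the \U form requires.
eight_digit_form = f"{MAX_CODE_POINT:08X}"

# Rendered with the minimum number of hex digits.
minimal_form = f"{MAX_CODE_POINT:X}"

# Number of bits needed to represent the largest code point.
bits_needed = MAX_CODE_POINT.bit_length()
```

Here `eight_digit_form` is `"0010FFFF"`, `minimal_form` is `"10FFFF"`, and `bits_needed` is 21.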

Also, the trend seems to be toward the variable-width form "\u{xxxxx}". See, for example:

http://unicode.org/reports/tr18/#Hex_notation
http://norbertlindenberg.com/2012/05/ecmascript-supplementary-characters/index.html#Escapes 
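
For comparison, a variable-width escape of this kind could be expanded roughly as follows. This is a sketch only: the form is not part of Turtle, the function name `expand_braced_escapes` is hypothetical, and the limit of one to six hex digits and the rejection of surrogates are assumptions of the sketch.

```python
import re

# Matches a backslash, lowercase u, and one to six hex digits in braces.
# The one-to-six digit limit is an assumption of this sketch.
_BRACED_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def expand_braced_escapes(text: str) -> str:
    # Hypothetical helper: replaces each braced escape in text with the
    # character it names, and leaves everything else untouched.
    def replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            raise ValueError(f"{match.group(0)} is beyond U+10FFFF")
        if 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"{match.group(0)} is a surrogate code point")
        return chr(code_point)

    return _BRACED_ESCAPE.sub(replace, text)
```

A single form then covers both BMP and supplementary characters: `expand_braced_escapes(r"\u{E9}")` and `expand_braced_escapes(r"\u{1F600}")` each yield one character, with no fixed-width padding required.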
Received on Friday, 7 September 2012 15:46:47 GMT