- From: Addison Phillips via GitHub <sysbot+gh@w3.org>
- Date: Fri, 30 Sep 2016 19:20:57 +0000
- To: public-annotation@w3.org
@azaroth42: +1

While working in code points is *awesome*, the reality of the Web is often that of UTF-16 code units because of DOM String. The APIs and data structures based on UTF-16 code units do not directly insulate users from problems with surrogate pairs (and neither surrogate handling nor code point counting deals at all with grapheme clustering), but proper character handling can and should still be provided by higher-level implementations and protocols.

No process needs to deal with surrogate code *points* (that is, character values in the range U+D800 to U+DFFF). There is no reason to state, however, that a process cannot handle supplementary characters (i.e. characters represented by a surrogate pair of code *units*) just because offsets are defined in UTF-16 code units.

The I18N WG commented on an identical issue at TPAC, but I'm at a loss to put my finger on it just now.

--
GitHub Notification of comment by aphillips

Please view or discuss this issue at https://github.com/w3c/web-annotation/issues/350#issuecomment-250830391 using your GitHub account
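
To illustrate the point about offsets defined in UTF-16 code units, here is a minimal TypeScript sketch of how a higher-level layer might still treat supplementary characters correctly. The helper names (`isHighSurrogate`, `isLowSurrogate`, `splitsSurrogatePair`, `codeUnitToCodePointOffset`) are hypothetical and are not taken from the Web Annotation specifications or any library. The sketch assumes well-formed input is the common case and does nothing about grapheme clusters.

```typescript
// Hypothetical helpers for illustration only; not part of any specification.

/** True if the code unit is a high (lead) surrogate, 0xD800..0xDBFF. */
function isHighSurrogate(unit: number): boolean {
  return unit >= 0xd800 && unit <= 0xdbff;
}

/** True if the code unit is a low (trail) surrogate, 0xDC00..0xDFFF. */
function isLowSurrogate(unit: number): boolean {
  return unit >= 0xdc00 && unit <= 0xdfff;
}

/**
 * True if a UTF-16 code unit offset falls between the two halves of a
 * surrogate pair, i.e. it would cut a supplementary character in half.
 */
function splitsSurrogatePair(text: string, offset: number): boolean {
  if (offset <= 0 || offset >= text.length) {
    return false;
  }
  return (
    isHighSurrogate(text.charCodeAt(offset - 1)) &&
    isLowSurrogate(text.charCodeAt(offset))
  );
}

/**
 * Converts a UTF-16 code unit offset into a code point offset by counting
 * the code points that start before it. A well-formed surrogate pair counts
 * once; an unpaired surrogate counts as one code point of its own. An offset
 * that splits a pair counts that pair as already passed.
 */
function codeUnitToCodePointOffset(text: string, offset: number): number {
  const end = Math.min(Math.max(offset, 0), text.length);
  let count = 0;
  let i = 0;
  while (i < end) {
    const isPair =
      i + 1 < text.length &&
      isHighSurrogate(text.charCodeAt(i)) &&
      isLowSurrogate(text.charCodeAt(i + 1));
    i += isPair ? 2 : 1;
    count++;
  }
  return count;
}

// "a", U+1F600 (one supplementary character, two code units), "b".
const sample = "a\uD83D\uDE00b";
console.log(sample.length);                        // 4 code units
console.log(splitsSurrogatePair(sample, 2));       // true: offset 2 is mid-pair
console.log(codeUnitToCodePointOffset(sample, 3)); // 2 code points precede "b"
```

A consumer of code-unit offsets could use a check like `splitsSurrogatePair` to reject or adjust a selection boundary that lands inside a pair, which is the kind of handling the comment suggests belongs in higher-level implementations rather than in the offset definition itself.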
Received on Friday, 30 September 2016 19:21:12 UTC