Re: [i18n-activity] UTF-16 code points for addressable characters

@klensin I don't necessarily agree. The counting in SVG2 is supposed 
to be tied to DOMString, which is a sequence of UTF-16 code units and 
is counted in those code units. The problem that @r12a is highlighting is 
that SVG introduced (a long time ago, in a specification far, far 
away---aka SVG1) different terminology instead of using code unit as 
the term. The term "addressable character" appears to mean "(index of 
a) UTF-16 code unit". I agree that the current wording is excessively 
opaque. By not using standard Unicode terminology directly and by 
introducing confusing concepts such as "character addressing", the 
specification becomes much harder to understand (and Unicode newbies 
will miss the implications entirely). But I also note that this 
terminology ship has sailed :-(. 

I think that what the text above is trying to say is:

> The address of a given Unicode character (code point) is measured in 
UTF-16 code units, prior to applying any text-transform conversions, 
as described for the methods in the SVGTextContentElement interface; 
as a result, a single Unicode character may be represented by multiple 
UTF-16 code units.
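
To make the difference concrete, here is a rough sketch of how addresses counted in UTF-16 code units diverge from a count of code points. It is written in Python purely for illustration and is not taken from the SVG spec or any SVG DOM implementation; the helper names `utf16_length` and `utf16_addresses` are made up for this example.

```python
from typing import List, Tuple


def utf16_length(text: str) -> int:
    """Number of UTF-16 code units needed to encode `text`."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" omits the BOM.
    return len(text.encode("utf-16-le")) // 2


def utf16_addresses(text: str) -> List[Tuple[str, int, int]]:
    """For each code point: (character, start index in code units, code units used)."""
    addresses = []
    offset = 0
    for ch in text:
        # Code points above U+FFFF need a surrogate pair (two code units).
        units = 2 if ord(ch) > 0xFFFF else 1
        addresses.append((ch, offset, units))
        offset += units
    return addresses


# 'a', U+1F600 (outside the BMP), 'b'
sample = "a\U0001F600b"

print(len(sample))              # 3 code points
print(utf16_length(sample))     # 4 UTF-16 code units
print(utf16_addresses(sample))  # 'b' starts at code unit index 3, not 2
```

Under this reading, the "addressable character" index of `b` in the sample is 3, because the supplementary character before it occupies two code units.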

GitHub Notification of comment by aphillips

Received on Monday, 5 September 2016 23:30:11 UTC