[web-annotation] Issue: Reference to text encoding in spec perhaps not appropriate marked as i18n-review

r12a has just labeled an issue for 
https://github.com/w3c/web-annotation as "i18n-review":

== Reference to text encoding in spec perhaps not appropriate ==
#222 made me aware of the following text in the model spec:

4.2.4 Text Quote Selector
https://www.w3.org/TR/2016/WD-annotation-model-20160331/#text-quote-selector
> The text must be normalized before recording. Thus HTML/XML tags
> should be removed, character entities should be replaced with the
> character that they encode, unnecessary whitespace should be
> normalized, **character encoding should be turned into UTF-8**, and so
> forth. The normalization routine may be performed automatically by a
> browser, and other applications should implement the DOM String
> Comparisons method. This allows the Selector to be used with different
> encodings and user agents and still have the same semantics and
> utility.

If all selector references are to be with respect to code point
sequences (cf. #206), then I'm not sure the spec should be referring to
text encoding (because we're assuming that you're annotating Unicode
text, not some byte sequence).
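
To illustrate the distinction between code point sequences and byte sequences, here is a minimal Python sketch. It is not taken from the spec or from the issue: the sample string and the helper name `find_quote` are assumptions made up for this example. It shows that the same text serialized in two encodings decodes to the same code point sequence, so a quote matched on code points does not depend on how the document was encoded, while byte offsets do.

```python
# Hypothetical sample text with two non-ASCII characters (U+00EF and U+00E9).
text = "na\u00efve caf\u00e9"

# The same text serialized under two different character encodings.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# The byte sequences differ...
assert utf8_bytes != latin1_bytes

# ...but once decoded, both yield the same sequence of code points.
decoded_a = utf8_bytes.decode("utf-8")
decoded_b = latin1_bytes.decode("latin-1")
assert decoded_a == decoded_b


def find_quote(document_text, exact):
    """Return (start, end) code point offsets of `exact`, or None if absent.

    `find_quote` is a hypothetical helper for this sketch, not a Web
    Annotation API. It operates on a decoded str, never on bytes.
    """
    start = document_text.find(exact)
    if start == -1:
        return None
    return (start, start + len(exact))


quote = "caf\u00e9"

# Code point offsets are identical whichever encoding the text arrived in.
span_a = find_quote(decoded_a, quote)
span_b = find_quote(decoded_b, quote)

# Byte offsets of the same quote depend on the encoding.
utf8_offset = utf8_bytes.find(quote.encode("utf-8"))
latin1_offset = latin1_bytes.find(quote.encode("latin-1"))
```

Under these assumptions `span_a` and `span_b` are equal while `utf8_offset` and `latin1_offset` differ, which is the sense in which a selector defined over code points has no need to mention a particular encoding.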

See https://github.com/w3c/web-annotation/issues/227

Received on Friday, 20 May 2016 18:16:19 UTC