- From: r12a via GitHub <sysbot+gh@w3.org>
- Date: Fri, 20 May 2016 18:16:16 +0000
- To: www-international@w3.org
r12a has just labeled an issue for https://github.com/w3c/web-annotation as "i18n-review":

== Reference to text encoding in spec perhaps not appropriate ==

#222 made me aware of the following text in the model spec:

4.2.4 Text Quote Selector
https://www.w3.org/TR/2016/WD-annotation-model-20160331/#text-quote-selector

> The text must be normalized before recording. Thus HTML/XML tags should be removed, character entities should be replaced with the character that they encode, unnecessary whitespace should be normalized, **character encoding should be turned into UTF-8**, and so forth. The normalization routine may be performed automatically by a browser, and other applications should implement the DOM String Comparisons method. This allows the Selector to be used with different encodings and user agents and still have the same semantics and utility.

If all selector references are to be w.r.t. code point sequences (cf. #206) then I'm not sure the spec should be referring to text encoding. (Because we're assuming that you're annotating Unicode text, not some byte sequence.)

See https://github.com/w3c/web-annotation/issues/227
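To illustrate the point about code point sequences versus byte sequences, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample text and the helper name `find_exact` are hypothetical and do not come from the Web Annotation spec. It shows that the same text stored in two encodings has different bytes and different byte offsets, yet matches at the same code point offset once decoded.

```python
# Illustrative sketch only; the sample text and find_exact() are
# hypothetical and not part of the Web Annotation model.

text = "café annotation"

# The same text serialized in two different character encodings.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Decoding either byte sequence yields the same code point sequence.
from_utf8 = utf8_bytes.decode("utf-8")
from_latin1 = latin1_bytes.decode("latin-1")


def find_exact(document_text: str, exact: str) -> int:
    """Return the code point offset of `exact` in `document_text`, or -1.

    Python 3 `str` values are sequences of Unicode code points, so the
    result does not depend on how the document was encoded on the wire.
    """
    return document_text.find(exact)


# Offsets measured in code points agree across encodings.
codepoint_offset_utf8 = find_exact(from_utf8, "annotation")
codepoint_offset_latin1 = find_exact(from_latin1, "annotation")

# Offsets measured in bytes do not, because "é" is two bytes in UTF-8
# and one byte in Latin-1.
byte_offset_utf8 = utf8_bytes.find(b"annotation")
byte_offset_latin1 = latin1_bytes.find(b"annotation")
```

If selectors are defined over code points, as the discussion in #206 suggests, the encoding only matters to whatever decodes the document, not to the selector itself.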
Received on Friday, 20 May 2016 18:16:19 UTC