- From: Nick Stenning via GitHub <sysbot+gh@w3.org>
- Date: Wed, 18 May 2016 09:07:11 +0000
- To: public-annotation@w3.org
nickstenning has just created a new issue for https://github.com/w3c/web-annotation:

== Reference to text encoding in spec perhaps not appropriate ==

#222 made me aware of the following text in the model spec:

4.2.4 Text Quote Selector
https://www.w3.org/TR/2016/WD-annotation-model-20160331/#text-quote-selector

> The text must be normalized before recording. Thus HTML/XML tags should be removed, character entities should be replaced with the character that they encode, unnecessary whitespace should be normalized, **character encoding should be turned into UTF-8**, and so forth. The normalization routine may be performed automatically by a browser, and other applications should implement the DOM String Comparisons method. This allows the Selector to be used with different encodings and user agents and still have the same semantics and utility.

If all selector references are to be with respect to code point sequences (cf. #206), then I'm not sure the spec should be referring to text encoding at all, because we're assuming that you're annotating Unicode text, not some byte sequence.

Please view or discuss this issue at https://github.com/w3c/web-annotation/issues/227 using your GitHub account
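
To illustrate the point about code point sequences versus byte encodings, here is a minimal Python sketch. It is not taken from the spec or from any implementation: `normalize_quote` is a hypothetical helper that only roughly approximates the normalization steps quoted above (naive tag stripping, entity replacement, whitespace collapsing), and the sample markup is made up.

```python
import html
import re


def normalize_quote(markup: str) -> str:
    """Hypothetical helper: a rough approximation of the quoted normalization."""
    # Naive tag removal for illustration only; this is not a real HTML parser.
    without_tags = re.sub(r"<[^>]+>", "", markup)
    # Replace character entities with the characters they encode.
    decoded = html.unescape(without_tags)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return " ".join(decoded.split())


# Made-up sample document containing a literal non-ASCII character and entities.
source = "<p>caf\u00e9 &amp; na&iuml;ve\n   text</p>"

# The same document as it might arrive in two different byte encodings.
as_utf8 = source.encode("utf-8")
as_utf16 = source.encode("utf-16")

# After decoding, both yield the same sequence of code points, so the
# normalized quote is identical regardless of the original encoding.
quote_from_utf8 = normalize_quote(as_utf8.decode("utf-8"))
quote_from_utf16 = normalize_quote(as_utf16.decode("utf-16"))

print(as_utf8 == as_utf16)                  # the byte sequences differ
print(quote_from_utf8 == quote_from_utf16)  # the code point sequences match
```

Under these assumptions, the selector text is defined once the document has been decoded to Unicode text, and the byte encoding it arrived in no longer matters.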
Received on Wednesday, 18 May 2016 09:07:13 UTC