- From: r12a via GitHub <sysbot+gh@w3.org>
- Date: Wed, 01 Jun 2016 19:19:40 +0000
- To: public-annotation@w3.org
[1] I'm going to try to be careful about terminology here. When I refer to 'character normalization' I mean Unicode normalization forms (NFC, etc.), as well as all the stuff referred to in the DOM String Comparisons. I am NOT referring to whitespace normalization. I believe we already agreed to remove the whitespace stuff, and it is gone from the text in the ED version. We have also dealt with the UTF-8 conversion, and the 'so forth', as @tilgovi says.

[2] CHARACTER NORMALIZATION: If Ivan's requirement to be able to match text in the Text Quote Selector against text with different character normalizations in the target document holds (and I only hear him confirming that), then I don't see the value of normalizing the text on the way in to the model framework. You'll still have to do normalization at the *point of comparison* to achieve a match. I'm therefore inclined to drop the requirement for text to be character normalized before recording. That includes dropping the reference to DOM String Comparisons, which, as I said before, I don't think is really what you were looking for anyway; you were thinking of standard Unicode normalization forms.

[3] MARKUP/ESCAPE REMOVAL: I'm still waiting for an answer from the WA WG to the second question at https://github.com/w3c/web-annotation/issues/227#issuecomment-222973597 in order to form a view on whether or not tag and escape folding should be part of the 'normalization paragraph'. I suspect that it shouldn't, and that it is just part of the method described for DOM Level 3 APIs. You don't want to strip markup or escapes from plain text sources that contain them, because there they appear as examples.

[4] We're trying to give you what advice we can, but we're not hearing much back that's clear and definite, and based on this thread, I share Addison's concern that perhaps the WA WG doesn't really know why that normalization stuff is there. Personally, I'd be inclined to remove the whole paragraph.
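To make the point in [2] concrete, here is a minimal Python sketch of normalizing at the point of comparison rather than on the way in. It is only an illustration: the helper name `quote_matches`, the sample strings, and the choice of NFC as the comparison form are assumptions of mine, not anything taken from the spec text.

```python
import unicodedata


def quote_matches(quote: str, target_text: str, form: str = "NFC") -> bool:
    """Hypothetical helper: normalize both strings to the same Unicode
    normalization form at comparison time, then test for a substring match."""
    normalized_quote = unicodedata.normalize(form, quote)
    normalized_target = unicodedata.normalize(form, target_text)
    return normalized_quote in normalized_target


# The stored quote uses precomposed U+00E9; the target document uses
# "e" followed by U+0301 COMBINING ACUTE ACCENT (a decomposed sequence).
stored_quote = "caf\u00e9"
target = "Meet me at the cafe\u0301 tonight"

print(stored_quote in target)               # raw code point comparison: False
print(quote_matches(stored_quote, target))  # normalized at comparison: True
```

Note that neither string had to be normalized before it was recorded. The match depends only on both sides being brought to the same form when they are compared, which is why normalizing on the way in buys nothing by itself.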
-- 
GitHub Notification of comment by r12a

Please view or discuss this issue at https://github.com/w3c/web-annotation/issues/227#issuecomment-223096862 using your GitHub account
Received on Wednesday, 1 June 2016 19:19:42 UTC