- From: Felix Sasaki via GitHub <sysbot+gh@w3.org>
- Date: Mon, 30 May 2016 09:31:20 +0000
- To: public-annotation@w3.org
Hi all,

just to emphasize one point that Ivan made: selectors are not only for HTML/XML markup. Hence, in the algorithm proposed at https://github.com/w3c/web-annotation/issues/227#issuecomment-222330988, the step "Remove all markup, such as HTML or XML tags." is not applicable to other content formats on the Web. PDF is just one example of such a format.

On the step "Normalization of whitespace by collapsing all whitespace tokens to a single ASCII space character (U+0020).": in certain markup vocabularies (and in non-markup content types as well), some types of elements need their white space preserved. E.g. for the HTML pre element you would not want to collapse white space. Emphasizing again: web annotation is for any type of web content. E.g. if I put DocBook content on the web and want to annotate programlisting elements, their white space should be preserved.

IMO, for the above reasons, the qualifier 'if applicable' is very important. I assume that many implementers will leave white space handling to the underlying library that handles low-level content parsing. For example, during the ITS 2.0 development, I developed an implementation that parsed HTML content using validator.nu. The white space handling was left to that library. I assume others will do the same.

--
GitHub Notification of comment by fsasaki

Please view or discuss this issue at https://github.com/w3c/web-annotation/issues/227#issuecomment-222452770 using your GitHub account
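
To illustrate the 'if applicable' qualifier discussed in this message, here is a minimal sketch in Python. It is not the algorithm from the issue and not the validator.nu-based implementation mentioned above. It only assumes that markup removal and white space collapsing are applied when the media type is HTML, and that text inside pre is left untouched. The names PRESERVE_WS, TextExtractor, normalize_html and normalize are hypothetical, and the standard-library html.parser stands in for whatever parsing library an implementer would actually use.

```python
import re
from html.parser import HTMLParser

# Assumption for this sketch: elements whose white space is significant.
PRESERVE_WS = {"pre"}


class TextExtractor(HTMLParser):
    """Collects text nodes and records whether each one sits inside
    a white-space-preserving element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []  # list of (text, preserve) pairs
        self.depth = 0    # nesting depth of preserving elements

    def handle_starttag(self, tag, attrs):
        if tag in PRESERVE_WS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in PRESERVE_WS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        self.chunks.append((data, self.depth > 0))


def normalize_html(markup):
    """Drop tags; collapse white space runs to U+0020 except inside pre."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    parts = []
    for text, preserve in parser.chunks:
        parts.append(text if preserve else re.sub(r"\s+", " ", text))
    return "".join(parts)


def normalize(content, media_type):
    """Apply the markup and white space steps only if applicable."""
    if media_type == "text/html":
        return normalize_html(content)
    # Other formats: neither step is assumed to apply in this sketch.
    return content


if __name__ == "__main__":
    print(repr(normalize("<p>a   b</p><pre>x\n  y</pre>", "text/html")))
```

A DocBook-aware variant would need its own parser and its own list of preserving elements (programlisting, for instance); the sketch only shows where the 'if applicable' decision would sit.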
Received on Monday, 30 May 2016 09:31:22 UTC