Re: [web-annotation] Reference to text encoding in spec perhaps not appropriate from aphillips via GitHub on 2016-05-29 (public-annotation@w3.org from May 2016)

From: aphillips via GitHub <sysbot+gh@w3.org>
Date: Sun, 29 May 2016 16:26:11 +0000
To: public-annotation@w3.org
Message-ID: <issue_comment.created-222369302-1464539171-sysbot+gh@w3.org>

@iherman thanks. 

Regarding the `dir` attribute, one of the uses of markup is to help 
the Unicode Bidirectional Algorithm (UBA) lay out text for 
presentation. When the markup is removed, reducing the content to 
plain text, `dir` attributes can be replaced with the corresponding 
Unicode bidirectional control characters, which preserves proper 
presentation. Several of our articles discuss this, for example 
[here](https://www.w3.org/International/techniques/authoring-html#inline).
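
As a rough illustration of that substitution, here is a minimal Python sketch. It assumes the markup has already been parsed and that the caller knows the text of the element and the value of its `dir` attribute. The function name `isolate_for_dir` is hypothetical. It uses the isolate controls (LRI, RLI, FSI, closed by PDI) because those are the closest match to how HTML5 treats `dir`; an implementation might prefer the older embedding controls instead.

```python
# Unicode bidirectional isolate controls (defined in UAX #9).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Mapping from `dir` attribute values to the opening control character.
_OPENER_FOR_DIR = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate_for_dir(text: str, direction: str) -> str:
    """Wrap plain text in the isolate controls matching a `dir` value.

    `direction` is the attribute value taken from the removed markup.
    Unknown values raise ValueError rather than being guessed at.
    """
    try:
        opener = _OPENER_FOR_DIR[direction.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported dir value: {direction!r}") from None
    return f"{opener}{text}{PDI}"


# Hypothetical example: <span dir="rtl">שלום</span> reduced to plain text.
plain = "The title is " + isolate_for_dir("שלום", "rtl") + " in Hebrew."
```

Note that the added control characters change the character offsets of everything after them, which matters if positions in the plain text are recorded.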
 

Regarding difficulty of implementation, most of my suggested text is 
straightforward to implement, but the boundary adjustment idea is a 
little hand-wavy. As I mentioned before, if a human is performing the 
text selection, it's difficult to select text that doesn't fall on a 
grapheme boundary. But programmatic access has to be taken into 
account as well. It would be easier on developers to say nothing and 
permit the boundary to fall on any character boundary, since in 
practice a boundary will rarely land at an arbitrary position inside 
a grapheme. But from a Unicode point of view, it would be better to 
specify grapheme boundaries or at least base-character starts.
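
To make the "base-character start" option a little less hand-wavy, here is a minimal Python sketch of one way a boundary could be adjusted. It is an approximation, not a full grapheme cluster segmentation: it only moves an offset backward past combining marks (general categories Mn, Mc, Me) and does not handle Hangul jamo sequences, emoji ZWJ sequences, or the other rules in UAX #29. The function names are hypothetical, and offsets are assumed to count Unicode code points.

```python
import unicodedata

# General categories that mark a character as a combining mark.
_MARK_CATEGORIES = {"Mn", "Mc", "Me"}


def is_combining_mark(ch: str) -> bool:
    """Return True if `ch` is a nonspacing, spacing, or enclosing mark."""
    return unicodedata.category(ch) in _MARK_CATEGORIES


def snap_to_base_start(text: str, offset: int) -> int:
    """Move `offset` backward until it no longer splits a base character
    from the combining marks that follow it.

    `offset` is a code point index in the range 0..len(text).
    """
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of range")
    # If the character at the boundary is a combining mark, the boundary
    # sits inside a base-plus-marks sequence, so step back toward the base.
    while 0 < offset < len(text) and is_combining_mark(text[offset]):
        offset -= 1
    return offset


# "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x".
sample = "e\u0301x"
adjusted = snap_to_base_start(sample, 1)  # offset 1 falls between e and the accent
```

A selection end offset would be handled the same way or moved forward instead; which direction to adjust is a design choice this sketch does not settle.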

-- 
GitHub Notification of comment by aphillips
Please view or discuss this issue at 
https://github.com/w3c/web-annotation/issues/227#issuecomment-222369302
 using your GitHub account

Received on Sunday, 29 May 2016 16:26:14 UTC