Re: [web-annotation] Selecting more than text from Randall Leeds on 2015-11-10 (public-annotation@w3.org from November 2015)

From: Randall Leeds <randall@bleeds.info>
Date: Tue, 10 Nov 2015 21:46:01 +0000
To: BigBlueHat via GitHub <sysbot+gh@w3.org>, public-annotation@w3.org
Message-ID: <CAAL6JQgSDRiK2rmoUCATkzO+6rrCaG1FTht0K3cut-8BkioZ-Q@mail.gmail.com>

The XPathSelector that's in discussion would support this:
https://github.com/w3c/web-annotation/issues/95

Let's say you have /html/body/article with content like:

<p>Here's a paragraph</p>
<img src="..." />
<p>Here's another paragraph</p>

You could select it with an XPath range like so:

start: '//article/node()[2]'
end: '//article/node()[3]'
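
To make the indexing concrete, here is a rough sketch in Python using only the standard library's xml.dom.minidom. It is not the XPathSelector from issue 95: the helper names (child_node, nodes_in_range) are made up for illustration, node()[n] is emulated by indexing childNodes, and treating the end node as included in the range is my assumption, since the range semantics are still under discussion. The markup is also written without whitespace between the elements.

```python
from xml.dom import minidom

# The article content from above, written as well-formed XML with no
# whitespace between the child elements (an assumption of this sketch).
ARTICLE = (
    "<article>"
    "<p>Here's a paragraph</p>"
    '<img src="..." />'
    "<p>Here's another paragraph</p>"
    "</article>"
)


def child_node(parent, position):
    """Emulate node()[position] relative to parent (XPath positions are 1-based)."""
    return parent.childNodes[position - 1]


def nodes_in_range(parent, start_pos, end_pos):
    """Return the children from node()[start_pos] through node()[end_pos].

    Hypothetical helper: the end node is treated as part of the range.
    """
    return list(parent.childNodes[start_pos - 1:end_pos])


article = minidom.parseString(ARTICLE).documentElement
selected = nodes_in_range(article, 2, 3)
names = [node.nodeName for node in selected]
print(names)
```

Note that node() counts text nodes too, so if the markup has line breaks between the elements, the whitespace-only text nodes shift the positions and node()[2] would no longer be the image.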

On Tue, Nov 10, 2015 at 1:31 PM BigBlueHat via GitHub <sysbot+gh@w3.org>
wrote:

> BigBlueHat has just created a new issue for
> https://github.com/w3c/web-annotation:
>
> == Selecting more than text ==
> Right now, we specify 5
> [selectors](http://w3c.github.io/web-annotation/model/wd/#selectors):
>  - [FragmentSelector](http://w3c.github.io/web-annotation/model/wd/#fragment-selector)
>  - [TextQuoteSelector](http://w3c.github.io/web-annotation/model/wd/#text-quote-selector)
>  - [TextPositionSelector](http://w3c.github.io/web-annotation/model/wd/#text-position-selector)
>  - [DataPositionSelector](http://w3c.github.io/web-annotation/model/wd/#data-position-selector)
>  - [SVGSelector](http://w3c.github.io/web-annotation/model/wd/#svg-selector)
>
> @tkanai points out in
> https://github.com/w3c/web-annotation/issues/95#issuecomment-153966728
>  that there is not currently a way to include an `<img>` tag (and its
>  representative visual output) within the selection.
>
> From the comment mentioned above:
> > Could you tell me how to select "I (love)" words, or both "I" and
> > the heart mark Image, from the html text below with the XPathSelector?
> > I also would like to make sure how to select "I" only.
> > `<p>I <img src="love.png" /> New York</p>`
>
> > As I frequently encounter such paragraphs while I'm reading Japanese
> > eBooks, although images are "Kanji" characters, I am looking for an
> > appropriate selector which can be applicable for non-normalized HTML
> > documents.
>
> Here is an example of such text which uses images (and `<ruby>`
> markup, fwiw) for presenting Kanji characters to the reader:
> http://www.aozora.gr.jp/cards/000879/files/127_15260.html
>
> Here's an annotation made with Hypothes.is that shows the
> inefficiencies of the currently specified `Text*Selector` selectors:
> https://hypothes.is/a/kjbFiUKHSKaa1diyMQx1qQ
> and as JSON
> https://hypothes.is/api/annotations/kjbFiUKHSKaa1diyMQx1qQ
>
> Because the image content is not in the "normalized" markup, it can't
> be selected.
>
> It *may* be possible to use an XPointer FragmentSelector, but that
> seems dubious given the lack of implementations and dearth of
> knowledge around it (sadly :cry:).
>
> We need to consider this use case when specifying new selectors which
> are more "node focused" such as XPath and CSS based selectors--as they
>  present an opportunity to support this use case directly...or at
> least lay the foundation...one hopes.
>
> See https://github.com/w3c/web-annotation/issues/107
>
>

Received on Tuesday, 10 November 2015 21:46:40 UTC