Re: Selectors as URIs? from Liam R. E. Quin on 2015-04-18 (public-annotation@w3.org from April 2015)

From: Liam R. E. Quin <liam@w3.org>
Date: Sat, 18 Apr 2015 14:34:48 -0400
To: Ivan Herman <ivan@w3.org>
Cc: Randall Leeds <randall@bleeds.info>, Robert Sanderson <azaroth42@gmail.com>, Paolo Ciccarese <paolo.ciccarese@gmail.com>, W3C Public Annotation List <public-annotation@w3.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, Markus Gylling <markus.gylling@gmail.com>
Message-ID: <1429382088.18767.11.camel@w3.org>
On Tue, 2015-04-14 at 11:38 +0200, Ivan Herman wrote:
> > On 13 Apr 2015, at 23:05 , Randall Leeds <randall@bleeds.info> 
> > wrote:
> > 
> > Rob's answer is much better than mine, but points to the same 
> > solution, I think.
> > 
> > If you control the media type and the meaning of fragments in that 
> > context, then you don't need our permission to put an OA selector 
> > in the fragment.
> 
> I am not looking for permission, I am looking for coordination :-)


The meaning of a URI fragment is defined by the registration of the 
media type of the retrieved representation. For example, the fragment 
identifier syntax for XML documents is defined to be XPointer.

This is arguably bad architecture, because it breaks under content 
negotiation: the same URI can return different media types, each with 
its own fragment rules. So when someone asked on the dpub call I 
probably should have said "no, you can't use a URI for this".
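
To make the content-negotiation problem concrete, here is a minimal Python sketch. The table and the `interpret_fragment` helper are hypothetical names made up for illustration, not part of any spec; the entries only reflect the media types mentioned in this thread. The point is that one fragment string is read under different rules depending on which representation comes back.

```python
from urllib.parse import urldefrag

# Hypothetical lookup table (an assumption for illustration): which
# fragment syntax governs each media type discussed in this thread.
FRAGMENT_SYNTAX = {
    "application/xml": "XPointer",
    "text/plain": "RFC 5147 (char=/line=)",
    "application/epub+zip": "EPUB CFI",
}

def interpret_fragment(uri: str, negotiated_type: str) -> str:
    """Report which rules govern the fragment of `uri` for a media type."""
    base, fragment = urldefrag(uri)
    syntax = FRAGMENT_SYNTAX.get(negotiated_type, "undefined")
    return f"{fragment!r} on {base} read as {syntax}"

uri = "http://www.example.org/book#char=10,20"
print(interpret_fragment(uri, "text/plain"))
print(interpret_fragment(uri, "application/xml"))
```

The same `#char=10,20` is meaningful for the plain-text representation and has no agreed meaning for the XML one, which is the failure described above.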

You can, however, invert it:
annotations://annotation-server.example.org/?uri=yyy;xpath=/book/chapter[3]//figure[@src='me.jpg']/ancestor::para/wordspan(5,17);mode-highlight

The annotation server could issue a redirect -- but a client-side 
engine could simply rewrite this to an annotated URI (yyy).
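
A rough Python sketch of the client-side rewrite, under loud assumptions: the `annotations:` scheme and its `;`-separated parameters exist only in the example above, `rewrite_annotation_uri` is a hypothetical name, and a real target URI would need percent-encoding so that its own `?`, `;` or `=` characters do not collide with the separators.

```python
from typing import Dict, Tuple

def rewrite_annotation_uri(uri: str) -> Tuple[str, Dict[str, str]]:
    """Split an inverted annotation URI into (target URI, selector params)."""
    _, _, query = uri.partition("?")
    params: Dict[str, str] = {}
    for part in query.split(";"):
        # Split on the first "=" only: XPath predicates contain "=" too.
        key, sep, value = part.partition("=")
        params[key] = value if sep else ""
    # Assumption: the annotated document is carried in the "uri" parameter.
    target = params.pop("uri")
    return target, params

example = (
    "annotations://annotation-server.example.org/?uri=yyy;"
    "xpath=/book/chapter[3]//figure[@src='me.jpg']"
    "/ancestor::para/wordspan(5,17);mode-highlight"
)
target, selector = rewrite_annotation_uri(example)
print(target)
print(selector["xpath"])
```

The client would then fetch `target` and apply the selector parameters locally, so nothing about the selection has to live in a fragment.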

So I think there's scope for creativity.
 

> Sure, we can define those URI fragments. Is it o.k. if a group just 
> does that without coordinating with those who are behind the OA 
> Selector model? I do not think so…
> 
> Actually, I also wonder whether the serialization of the selectors 
> in terms of URI-s would not come in handy for our own deliverables, too 
> (again, with possible restrictions as for the media types). Eg, 
> handling URI-s that way with existing URI libraries in the various 
> programming languages around us might be handy… (But, I admit, I am 
> just handwaving here…)
> 
> Ivan
> 
> 
> > 
> > 
> > On Mon, Apr 13, 2015, 11:04 Robert Sanderson <azaroth42@gmail.com> 
> > wrote:
> > On Mon, Apr 13, 2015 at 10:07 AM, Ivan Herman <ivan@w3.org> wrote: 
> > Hm. The problem is that there is a use case here that we may have 
> > to accommodate somehow.
> > At the moment, if you take an Ebook, and you want to have a URI 
> > identifying a specific position within a specific chapter of a 
> > book, you can use something like:
> > 
> > http://www.example.org/book#epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)
> > 
> > 
> > Sure, because the Epub media type registration defines the meaning 
> > of fragment section of URIs where Epub is a representation that 
> > can be retrieved.
> > We can't legitimately just add #oa:Selector(...) to the end of the 
> > URI instead.
> > 
> > 
> > epubcfi works, and is used, but it has its drawbacks (let me not 
> > go into all the details). One drawback is that what it offers as 
> > anchoring possibility (though powerful) is way less flexible than 
> > the selector model, primarily the range selectors. The conceptual 
> > model behind those would become useful, as an alternative to 
> > something like epubcfi, if those structures could be used as 
> > fragments.
> > 
> > Agreed, and the same applies for every other media type as well.
> > 
> > 
> > Maybe we have to restrict its usage and define it only for 
> > specific media types (text, etc) to avoid the issues in your 
> > example on genetic sequences or full blown graphics. But I believe 
> > something like that would be very useful and, for some 
> > communities, necessary.
> > 
> > The point from RFC 3986 (and related) is that we _cannot_ define it 
> > for specific media types unless we control them.  It's summarized 
> > in the first bullet in the annotation spec I linked to.
> > For example, the meaning of a fragment on a plain text document is 
> > defined by RFC 5147: https://tools.ietf.org/html/rfc5147
> > 
> > So we can't just say that people should use #oa:Selector(...) with 
> > a plain text document (or any other format) :(
> > 
> > Rob
> > 
> > 
> > 
> > (b.t.w., I am not sure I understand your comment on RFC3986)
> > [1] http://www.idpf.org/epub/linking/cfi/epub-cfi.html
> > 
> > > > On 13 Apr 2015, at 18:28 , Robert Sanderson <azaroth42@gmail.com> wrote:
> > > 
> > > 
> > > We discussed fragments in the community group at length.
> > > 
> > > The concerns about the approach are documented here:
> > >     http://www.w3.org/TR/annotation-model/#fragment-uris
> > > 
> > > These boil down to the fact that as you get more sophisticated 
> > > selections the URI becomes unbearably long.
> > > Consider serializing an entire SVG document into the URI to 
> > > specify a non rectangular area. Or selecting the previous and 
> > > following 1024 Gs Cs As and Ts to select a range of text in a 
> > > genetic sequence.
> > > 
> > > My personal position is that selectors should not be turned into 
> > > fragments, because (especially) that would break the rules of 
> > > fragment identifiers as laid out in RFC 3986:
> > > 
> > > The semantics of a fragment identifier are defined by the set of 
> > > representations that might result from a retrieval action on the 
> > > primary resource.
> > > As further discussed by JeniT here:
> > >     http://www.w3.org/TR/fragid-best-practices/
> > > 
> > > Basically, unless there's a new text/HTML RFC that allows us to 
> > > do it, we can't arbitrarily shove the description of the segment 
> > > into its identity.
> > > 
> > > Rob
> > > 
> > > 
> > > On Mon, Apr 13, 2015 at 9:10 AM, Ivan Herman <ivan@w3.org> 
> > > wrote: (Although this may not be immediately relevant to the 
> > > Working Group right now, I think the question *may* become 
> > > relevant, hence my copy to it…)
> > > 
> > > Rob, Paolo,
> > > 
> > > a question came up at the Digital Publishing IG today. The IG is 
> > > looking at general fragment identifiers for the purpose of 
> > > identifying portions within a digital document (typically EPUB, 
> > > but also some future versions of it). The Selector structure of 
> > > the OA obviously gives a great model for various types of 
> > > anchors, mainly when combined with other, existing fragment id 
> > > definitions.
> > > 
> > > However, at present, the selectors are defined in terms of RDF 
> > > resources; to take an example from the spec:
> > > 
> > > "selector": {
> > >       "@id": "http://example.org/selector1",
> > >       "@type": "oa:DataPositionSelector",
> > >       "start": 4096,
> > >       "end": 4104
> > > }
> > > 
> > > To be usable for a fragment identification, this structure 
> > > should be turned into some sort of a, well, URI fragment. I 
> > > mean, it is probably relatively easy to do this, something like
> > > 
> > > http://www.example.org/#selector(type=DataPositionSelector,start=4096,end=4104)
> > > 
> > > 
> > > would do it but, of course, the ideal would be if that type of 
> > > fragment format would be defined at one place.
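
The serialization suggested above can be sketched in a few lines of Python. This is only an illustration of the idea, not a defined fragment syntax: `selector_to_fragment` is a hypothetical helper, it handles flat selectors only, it does no escaping of `,`, `=` or `)` in values, and the rest of the thread disputes whether such a fragment may be defined at all.

```python
def selector_to_fragment(selector: dict) -> str:
    """Serialize a flat selector into the 'selector(...)' form shown above."""
    # Drop the "oa:" prefix from the type, as in the example URI.
    sel_type = selector["@type"].split(":", 1)[-1]
    fields = [f"type={sel_type}"]
    for key, value in selector.items():
        if key.startswith("@"):
            continue  # "@id" and "@type" are JSON-LD keywords, not data
        fields.append(f"{key}={value}")
    return "selector(" + ",".join(fields) + ")"

selector = {
    "@id": "http://example.org/selector1",
    "@type": "oa:DataPositionSelector",
    "start": 4096,
    "end": 4104,
}
print("http://www.example.org/#" + selector_to_fragment(selector))
```

Nested or refined selectors would need a richer grammar, which is part of why defining the format in one place matters.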
> > > 
> > > The question is: has this ever been discussed previously on the 
> > > OA model? If it hasn't been done, should it be done? If it 
> > > should be done, should it be done by this WG, or some other 
> > > group?
> > > 
> > > Thanks
> > > 
> > > Ivan
> > > 
> > > 
> > > ----
> > > Ivan Herman, W3C
> > > Digital Publishing Activity Lead
> > > Home: http://www.w3.org/People/Ivan/
> > > mobile: +31-641044153
> > > ORCID ID: http://orcid.org/0000-0003-0782-2704
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Rob Sanderson
> > > Information Standards Advocate
> > > Digital Library Systems and Services
> > > Stanford, CA 94305
Received on Saturday, 18 April 2015 18:34:55 UTC