Re: URI fragments from Paolo Ciccarese on 2012-10-31 (public-openannotation@w3.org from October 2012)

From: Paolo Ciccarese <paolo.ciccarese@gmail.com>
Date: Tue, 30 Oct 2012 23:13:15 -0400
To: Bernhard Haslhofer <bernhard.haslhofer@cornell.edu>
Cc: Antoine Isaac <aisaac@few.vu.nl>, Robert Sanderson <azaroth42@gmail.com>, public-openannotation@w3.org
Message-ID: <CAFPX2kB_c8qQNzN5+14xMfvwq9_gpaiEOAF+ZF+KyUp6M3u7Uw@mail.gmail.com>
The use of fragment URIs is not forbidden by the current model,
as explained at the bottom of this page:
http://www.w3.org/community/openannotation/fragment-uris/

At this point, as Bernhard said, I would wait to see real-world
implementations before discussing this further.

Best,
paolo


On Tue, Oct 30, 2012 at 10:56 PM, Bernhard Haslhofer
<bernhard.haslhofer@cornell.edu> wrote:

> Hi,
>
> a while ago, I expressed similar concerns after implementing the Open
> Annotation model in our Maphub API (
> http://lists.w3.org/Archives/Public/public-openannotation/2012Jul/0067.html)
> and my opinion hasn't changed yet. I also still believe that the existing
> media fragment specifications are sufficient for a variety of use cases and
> that the Open Annotation specification should encourage people to use MF
> with annotations just as they use them in any other context. Some use cases
> will require more complex MF definitions and for those the Open Annotation
> model could provide some extension points.
>
> But my feeling is that we should postpone this discussion until we have
> some more real-world URIs returning annotations serialized in OA.
>
> Bernhard
>
>
> On Tuesday, October 30, 2012 at 6:31 PM, Antoine Isaac wrote:
>
> > Hi Rob,
> >
> > Sorry for letting the thread lie dormant for so long. First I was
> > hoping you'd get some feedback from the people you were calling on,
> > but it did not work. So maybe everyone is OK with it. Or the group
> > has not reached critical mass...
> >
> > Anyway, I'm a bit worried about shifting the burden from the
> > annotation creator to the consumer. I'm not sure I see it like you.
> > First, any barrier to data publishers can lead to a lack of data. If
> > you don't make it super-easy to publish stuff that people have,
> > publishers will look elsewhere.
> >
> > And it is quite likely that annotation tools for which fragment URIs
> > are enough won't seek to use anything else. It's a W3C
> > recommendation, after all. And data consumers seeking to re-use
> > pieces of software that locate regions in documents, who don't need
> > more than Media Fragments-supported features, will probably have
> > more, better software based on media fragments. And they may well
> > decide too that they don't want to go further.
> >
> > So you'd end up with quite a big part of the data being (1) isolated
> > from other cases (at least these publishers and consumers won't
> > build gateways) and (2) against the recommendation.
> >
> > Of course the question of relative mass will be key: who will need
> > only Media Fragments vs. who will need a more general solution.
> > Honestly, I'm not an expert on this myself. But I trust a W3C group
> > to have anticipated the optimal mass they could address with the
> > solution they were designing.
> >
> > Best,
> >
> > Antoine
> >
> >
> > > The drawback is that all client developers will have to implement both
> > > ways of representing the same information, which is highly undesirable
> > > for an interoperability specification. It shifts the burden from the
> > > annotation creator to the consumer. As there will hopefully be many
> > > more consuming applications than creating ones, this either creates a
> > > large amount of additional work or splits the adoption community into
> > > those systems that do one versus those that do the other.
> > >
> > > Third party systems could rewrite from the fragment uri to the
> > > selector version, or vice versa, and express the equivalence with
> > > prov:alternateOf. Clients would then rely on these systems, and not
> > > the original creators of the annotations, making them gatekeepers that
> > > new annotation systems would have to get adoption from.
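A minimal sketch of the rewriting step described above, assuming triples are held as plain Python tuples of strings and that prefixed names such as oa:hasSource stand in for full URIs. The function name to_selector_form and the node names SpTarget1 and FragSel1 are illustrative only; nothing here is prescribed by the specification.

```python
from urllib.parse import urldefrag


def to_selector_form(fragment_uri, target_id, selector_id):
    """Rewrite a fragment URI target into the Specific Resource form.

    Returns a list of (subject, predicate, object) string triples and
    links the new node back to the original fragment URI with
    prov:alternateOf.
    """
    # urldefrag splits "http://host/file#frag" into the URI without
    # the fragment and the fragment itself.
    source, fragment = urldefrag(fragment_uri)
    if not fragment:
        raise ValueError("URI has no fragment: " + fragment_uri)
    return [
        (target_id, "rdf:type", "oa:SpecificResource"),
        (target_id, "oa:hasSource", source),
        (target_id, "oa:hasSelector", selector_id),
        (selector_id, "rdf:type", "oa:FragmentSelector"),
        (selector_id, "rdf:value", fragment),
        # Equivalence between the two representations.
        (target_id, "prov:alternateOf", fragment_uri),
    ]


triples = to_selector_form(
    "http://www.example.com/example.ogv#t=10,20", "SpTarget1", "FragSel1"
)
```

A third-party service doing this would still have to mint stable identifiers for the new nodes, which is part of the gatekeeper concern raised above.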
> > >
> > > I personally would like to see further discussion, and in particular
> > > feedback from people who would not implement the specification as it
> > > stands with fragment targets being not recommended, given the
> > > rationale presented in the previous mails.
> > >
> > > Rob
> > >
> > >
> > > > On Fri, Oct 5, 2012 at 9:32 AM, Paolo Ciccarese
> > > > <paolo.ciccarese@gmail.com> wrote:
> > > > Antoine,
> > > > > thank you for the suggestions. I do agree with you that it would
> > > > > not make sense to forbid the direct use of fragment URIs. As you
> > > > > said, we don't currently do that in the spec... however, given the
> > > > > current wording, we clearly pushed in one direction.
> > > >
> > > > > In the last months, it turned out there are many users/groups out
> > > > > there - especially those belonging to the NLP community - that
> > > > > prefer the direct use of fragment URIs for good reasons, and it
> > > > > seems that is enough to achieve the interoperability they need -
> > > > > for instance for comparing the performance of different
> > > > > algorithms. In their shoes, even after understanding the
> > > > > drawbacks, I would probably do the same.
> > > >
> > > > > As the direct use of fragment URIs is already allowed anyway, I am
> > > > > in favor of both the spec-related changes you proposed.
> > > >
> > > > Best,
> > > > Paolo
> > > >
> > > >
> > > > > On Fri, Oct 5, 2012 at 9:56 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
> > > > >
> > > > > Hi Rob, all,
> > > > >
> > > > > > Thanks for the explanation! Like Nick, I had a lot of doubts
> > > > > > about the rationale for this. And I still have. Though now it is
> > > > > > not about the way it is expressed in your mail, but about the
> > > > > > way it is expressed in the spec :-)
> > > > >
> > > > > > All your arguments are indeed in favour of creating precise
> > > > > > structures to represent fragments. But still I think we all
> > > > > > agree that it is not a reason to forbid users from creating
> > > > > > simple representations using fragment URIs. Many people just
> > > > > > don't give a damn about Selectors, identifying-vs.-resolving
> > > > > > issues, and so on. For example because they have much simpler
> > > > > > annotation targets, which are entirely captured by what Media
> > > > > > Fragment URIs offer. It would be a pity to deter them from using
> > > > > > OA.
> > > > > > There's also the borderline case, where users of resolvable
> > > > > > media fragment URIs also use the patterns of OA to represent,
> > > > > > e.g., states. Would you disallow this?
> > > > >
> > > > > > I reckon that you write in the mail "it's only a recommendation
> > > > > > to use this approach". But reading the OA spec, especially the
> > > > > > parts on specific bodies and targets, I find this really not
> > > > > > obvious. Most of the text reads "you should use the precise
> > > > > > pattern", at least to me, as a newcomer (and non-native speaker
> > > > > > maybe). If only because of all the effort you put into
> > > > > > explaining why people should think of using selectors ;-)
> > > > >
> > > > > > So I'm all for keeping the current arguments in, but I strongly
> > > > > > suggest to:
> > > > > > - add one explicit sentence in section 5, which says that using
> > > > > > resolvable fragment URIs is allowed, even though it has
> > > > > > drawbacks in several cases.
> > > > > > - change the graphs in sections 5, 3.4 and 4.4, which indicate
> > > > > > that the Specific Resource is non-resolvable.
> > > > > > - change the text with "typically identified by a URN" in 3.4
> > > > > > and 4.4. It's maybe my poor English, but to me this text reads
> > > > > > like you're trying to state a law which should be generally
> > > > > > true.
> > > > >
> > > > > > In fact it may be a good idea, from an editorial perspective, to
> > > > > > factor out the discussion (on resolvable vs. non-resolvable
> > > > > > bodies and targets) from the definitions of the constructs from
> > > > > > the OA vocabulary for specific bodies. Maybe in a new section
> > > > > > 5.6 on "use of resolvable and non-resolvable identifiers for
> > > > > > bodies and targets"?
> > > > >
> > > > > Antoine
> > > > >
> > > > >
> > > > >
> > > > > > Hi Nick, and all,
> > > > > >
> > > > > > > There are two possibilities, listed below, for annotating parts
> > > > > > > of resources. We decided on the more expressive but more
> > > > > > > verbose second option for a variety of reasons, which I'll try
> > > > > > > to unpack a little from the specification.
> > > > > >
> > > > > > > Option (1) Fragment URI as target
> > > > > > > _:anno a oa:Annotation ;
> > > > > > >     oa:hasTarget <http://www.example.com/example.ogv#t=10,20> ;
> > > > > > >     oa:hasBody <http://www.example.org/comment1> .
> > > > > > >
> > > > > > >
> > > > > > > Option (2) Identity as target, description as a Selector
> > > > > > > _:anno a oa:Annotation ;
> > > > > > >     oa:hasTarget <SpTarget1> ;
> > > > > > >     oa:hasBody <http://www.example.org/comment1> .
> > > > > > >
> > > > > > > <SpTarget1> a oa:SpecificResource ;
> > > > > > >     oa:hasSelector <FragSel1> ;
> > > > > > >     oa:hasSource <http://www.example.com/example.ogv> .
> > > > > > >
> > > > > > > <FragSel1> a oa:FragmentSelector ;
> > > > > > >     rdf:value "t=10,20" .
> > > > > >
> > > > > > -------
> > > > > >
> > > > > > To try and decompress some of the reasoning in the specification:
> > > > > > ( http://www.openannotation.org/spec/core/#SelectorFragment )
> > > > > >
> > > > > > > * You can't search for http://www.example.com/example.ogv
> > > > > > > directly in the first model. Remember that URIs, for the
> > > > > > > purposes of SPARQL etc., are opaque, non-decomposable strings.
> > > > > > > Regardless of whether that string includes human-readable
> > > > > > > semantics or not, you can't discover annotations of form (1),
> > > > > > > and you can discover them in form (2) by querying the object of
> > > > > > > oa:hasSource.
> > > > > >
> > > > > > > * While Style specifiers are going away, form (1) is not
> > > > > > > compatible with States. If you need to refer to example.ogv at
> > > > > > > a particular point in time then you need a State, which cannot
> > > > > > > be attached to the Fragment URI, as that would break the global
> > > > > > > scope of statements in RDF. In other words, you would be saying
> > > > > > > that for all uses of that time range within the video, it was
> > > > > > > always based on the video resource as it was at a particular
> > > > > > > point in time (say at 2011-05-20), which would prevent other
> > > > > > > annotations from having a different point in time for that same
> > > > > > > time segment within the video.
> > > > > >
> > > > > > > * URIs provide identity. A URI with a fragment provides both
> > > > > > > identity for the segment and a description of how to resolve
> > > > > > > that segment given a (particular) representation of the
> > > > > > > resource. This has several issues:
> > > > > > > (a) There may be many ways to describe the same segment, used by
> > > > > > > different communities.
> > > > > > > (b) IETF Fragment specifications are tied to a specific MIME
> > > > > > > type. Your example of plain text fragments works only for
> > > > > > > text/plain resources, and no other. Thus the identity of the
> > > > > > > segment is tied to a specific representation in a specific
> > > > > > > format.
> > > > > > > (c) As Jeni Tennison points out, the same fragment can identify
> > > > > > > different things within the same resource. She gives the example
> > > > > > > of an SVG document with embedded RDF in Appendix B. (
> > > > > > > http://www.w3.org/TR/2012/WD-fragid-best-practices-20120726/ )
> > > > > > > So we think it's safer to create a new node in the graph to
> > > > > > > provide identity, and a separate node to provide the
> > > > > > > description. This safety, expressiveness and consistency comes
> > > > > > > at the expense of some extra bytes, but that's RDF for you.
> > > > > >
> > > > > > > * Fragment URIs are not expressive enough to cover the use cases
> > > > > > > that drive the Open Annotation specification. Non-rectangular
> > > > > > > sections of images are very important to be able to identify
> > > > > > > and describe, including simple circles as well as arbitrary
> > > > > > > paths. The worst case is annotating a diagonal road in a map
> > > > > > > from top left to bottom right of the image, where a rectangular
> > > > > > > box would encompass the whole image's content. Thus we need
> > > > > > > some other way to implement this, which results in form (2).
> > > > > >
> > > > > > > * Fragment URIs do not cover all media types, nor could they
> > > > > > > possibly hope to. If you wanted to annotate a selection of text
> > > > > > > within an MS Word document, you would need Microsoft to
> > > > > > > register a fragment description for .doc and .docx. Given that
> > > > > > > this is a useful sort of thing to do, and it can't be done with
> > > > > > > fragment URIs, we need a selector concept as in form (2).
> > > > > >
> > > > > > > * They are not extensible, especially Media Fragment URIs, for
> > > > > > > which we lobbied hard on this point before the specification
> > > > > > > was finalized, but to no avail. As soon as anyone needed
> > > > > > > something slightly richer or more expressive ... like a
> > > > > > > circular area rather than rectangular ... then we would be back
> > > > > > > in the situation where we needed a selector again. If they were
> > > > > > > extensible, this would have helped.
> > > > > >
> > > > > > > * Thus, there is a need for a Selector that describes the
> > > > > > > segment of a resource separately from its identity. Given that
> > > > > > > this is required, we felt it most consistent to always use a
> > > > > > > Selector, but to import the fragment description semantics into
> > > > > > > it. This solves all of the issues above at the expense of being
> > > > > > > somewhat more verbose.
> > > > > > > - You can always query oa:hasSource to find the URI of the target
> > > > > > > resource, without any segment information
> > > > > > > - You, or a third party, can always attach a State to give the
> > > > > > > time for the representation. The failure to consider the
> > > > > > > dynamic nature of web resources has been the downfall of many
> > > > > > > annotation systems in the past, and we fully intend to learn
> > > > > > > from their mistakes.
> > > > > > > - Multiple descriptions are possible, and can work across MIME
> > > > > > > types. There is no confusion about what the Specific Resource
> > > > > > > identifies, as it does not also try to describe it.
> > > > > > > - We can be as expressive as we like using a Selector, and remain
> > > > > > > consistent with a single model
> > > > > > > - We can have selectors for new and old media types, without the
> > > > > > > blessing of the IANA/IETF registries
> > > > > > > - Selectors are infinitely extensible
> > > > > > > - There is a single model, not two possibilities that everyone
> > > > > > > would need to implement both of or risk splitting their user
> > > > > > > base
> > > > > > > - We import the semantics of the fragment definitions, so are not
> > > > > > > re-inventing those. We simply split the fragment away from the
> > > > > > > URI of the full resource to gain the benefits of the above.
> > > > > > > - And finally we tried both ways and the consensus of the group
> > > > > > > was that the single selector model was the better approach
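The discoverability point in the list above can be illustrated with a small sketch. It assumes an in-memory list of string triples rather than a real triple store, and uses exact string comparison to mimic how a basic graph pattern matches URIs as opaque identifiers. The names FORM_1, FORM_2 and annotations_of are made up for this illustration.

```python
VIDEO = "http://www.example.com/example.ogv"

# Form (1): the fragment URI itself is the target.
FORM_1 = [
    ("_:anno1", "oa:hasTarget", VIDEO + "#t=10,20"),
    ("_:anno1", "oa:hasBody", "http://www.example.org/comment1"),
]

# Form (2): a Specific Resource with a source and a selector.
FORM_2 = [
    ("_:anno2", "oa:hasTarget", "SpTarget1"),
    ("_:anno2", "oa:hasBody", "http://www.example.org/comment1"),
    ("SpTarget1", "oa:hasSource", VIDEO),
    ("SpTarget1", "oa:hasSelector", "FragSel1"),
    ("FragSel1", "rdf:value", "t=10,20"),
]


def annotations_of(triples, resource):
    """Return annotations that target `resource` by exact URI match,
    either directly or through oa:hasSource on a Specific Resource."""
    direct = {s for s, p, o in triples
              if p == "oa:hasTarget" and o == resource}
    specific = {s for s, p, o in triples
                if p == "oa:hasSource" and o == resource}
    via_source = {s for s, p, o in triples
                  if p == "oa:hasTarget" and o in specific}
    return direct | via_source


found_form_1 = annotations_of(FORM_1, VIDEO)  # empty: URI strings differ
found_form_2 = annotations_of(FORM_2, VIDEO)  # finds _:anno2
```

A consumer could of course fall back to string-prefix filtering on the target URI for form (1), but that treats the URI as decomposable text, which is exactly what the argument above says a query should not have to do.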
> > > > > >
> > > > > > > After all of that, if you're still not convinced, then as Paolo
> > > > > > > says it's only a recommendation to use this approach. If you
> > > > > > > feel that some additional bytes in an already extremely verbose
> > > > > > > format is too high a cost for interoperability, expressiveness,
> > > > > > > consistency and the understandability of your annotations, then
> > > > > > > it is not forbidden to annotate a fragment URI directly. I
> > > > > > > hope, of course, that you're convinced otherwise by the
> > > > > > > arguments above :)
> > > > > >
> > > > > > Rob
> > > > > >
> > > > > > > On Thu, Oct 4, 2012 at 7:12 AM, Paolo Ciccarese
> > > > > > > <paolo.ciccarese@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Nick,
> > > > > > > > the idea is that you can use fragment URIs, but not directly as
> > > > > > > > they are.
> > > > > > >
> > > > > > > > Given the current structure of the OA model we *recommend*
> > > > > > > > splitting source and fragment for the reasons that are listed
> > > > > > > > in the specs. In other words, if you use a fragment URI
> > > > > > > > directly that might work for you, but we wanted to make clear
> > > > > > > > that, within this model, that would create problems in using
> > > > > > > > other features, querying, sharing and recording additional
> > > > > > > > provenance info.
> > > > > > >
> > > > > > > > As per http://www.ietf.org/rfc/rfc2119.txt, the adjective
> > > > > > > > "RECOMMENDED" means that there may exist valid reasons in
> > > > > > > > particular circumstances to ignore a particular item, but the
> > > > > > > > full implications must be understood and carefully weighed
> > > > > > > > before choosing a different course.
> > > > > > >
> > > > > > > > Therefore if you have
> > > > > > > > http://www.example.com/example.ogv#t=10,20
> > > > > > > > we recommend breaking it down:
> > > > > > >
> > > > > > > > <SpTarget1> a oa:SpecificResource ;
> > > > > > > >     oa:hasSelector <Selector1> ;
> > > > > > > >     oa:hasSource <http://www.example.com/example.ogv> .
> > > > > > > >
> > > > > > > > <Selector1> a oa:FragmentSelector ;
> > > > > > > >     rdf:value "t=10,20" .
> > > > > > >
> > > > > > > > And the Fragment URI may be reconstructed by concatenating the
> > > > > > > > oa:hasSource resource's URI, plus a '#', plus the value of the
> > > > > > > > Fragment Selector. As the OA model is a format for exchange,
> > > > > > > > the application consuming the annotation data is supposed to
> > > > > > > > perform the operation when necessary.
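The reconstruction just described is a one-line concatenation. The sketch below assumes the consuming application has already read the oa:hasSource URI and the selector's rdf:value as strings; the function name fragment_uri is hypothetical.

```python
from urllib.parse import urldefrag


def fragment_uri(source, selector_value):
    """Rebuild a fragment URI as source + '#' + selector value."""
    # Guard against a source that already carries a fragment, since
    # appending a second '#...' would not give a meaningful URI.
    if urldefrag(source).fragment:
        raise ValueError("source already has a fragment: " + source)
    return source + "#" + selector_value


rebuilt = fragment_uri("http://www.example.com/example.ogv", "t=10,20")
```

Splitting the result again with urldefrag gives back the same source and selector value, so the two representations can be converted in either direction without loss for this simple case.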
> > > > > > >
> > > > > > > > Of course, I agree the above set of triples does not look as
> > > > > > > > compact as http://www.example.com/example.ogv#t=10,20 is. But
> > > > > > > > in general terms, we know that approach causes side effects.
> > > > > > >
> > > > > > > Hope this helps,
> > > > > > > Paolo
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On Thu, Oct 4, 2012 at 7:00 AM, Nick White
> > > > > > > > <nick.white@durham.ac.uk> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > > I am very interested in the work OpenAnnotation is doing. It
> > > > > > > > > looks like it could be very useful indeed.
> > > > > > > >
> > > > > > > > > In reading the spec, I see that section 5.2.1 "Fragment Selector"
> > > > > > > > > <http://www.openannotation.org/spec/core/#SelectorFragment>
> > > > > > > > > recommends against using fragment URIs to identify segments.
> > > > > > > >
> > > > > > > > > I don't really understand the rationale for this. The
> > > > > > > > > language used in the spec is not easy for me to follow.
> > > > > > > > > Please could somebody clarify the reasons for me?
> > > > > > > >
> > > > > > > > > It seems to me (in my ignorance, no doubt) that standard URI
> > > > > > > > > fragment selectors are an obvious and good choice. I was
> > > > > > > > > planning to use RFC5147 to refer to sections of text, which
> > > > > > > > > is a nice, simple way of doing so. It's basic, but fine for
> > > > > > > > > my needs, and being human-readable and easily usable in other
> > > > > > > > > contexts has its advantages.
> > > > > > > >
> > > > > > > > Thanks for any guidance, and I look forward to exploring
> > > > > > > > OpenAnnotation more.
> > > > > > > >
> > > > > > > > Nick White
>
>
>


-- 
Dr. Paolo Ciccarese
http://www.paolociccarese.info/
Biomedical Informatics Research & Development
Instructor of Neurology at Harvard Medical School
Assistant in Neuroscience at Mass General Hospital
+1-857-366-1524 (mobile)   +1-617-768-8744 (office)

Received on Wednesday, 31 October 2012 03:13:44 UTC