
Re: URI fragments

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Fri, 5 Oct 2012 15:56:21 +0200
Message-ID: <506EE705.6020805@few.vu.nl>
To: <public-openannotation@w3.org>
Hi Rob, all,

Thanks for the explanation! Like Nick, I had a lot of doubts about the rationale for this. And I still have. Though now it is not about the way it is expressed in your mail, but about the way it is expressed in the spec :-)

All your arguments are indeed in favour of creating precise structures to represent fragments. But I think we all agree that this is no reason to forbid users from creating simple representations using fragment URIs. Many people just don't give a damn about Selectors, identifying-vs.-resolving issues, and so on, for example because they have much simpler annotation targets, which are entirely captured by what Media Fragment URIs offer. It would be a pity to deter them from using OA.
There's also the borderline case, where users of resolvable media fragment URIs also use the patterns of OA to represent e.g., states. Would you disallow this?

I recognize that you write in the mail "it's only a recommendation to use this approach". But reading the OA spec, especially the parts on specific bodies and targets, I find this really not obvious. Most of the text reads "you should use the precise pattern", at least to me, as a newcomer (and maybe as a non-native speaker). If only because of all the effort you put into explaining why people should think of using selectors ;-)

So I'm all for keeping the current arguments in, but I strongly suggest to:
- add one explicit sentence in section 5, which tells that using resolvable fragment URIs is allowed, even though it has drawbacks in several cases.
- change the graphs in section 5, 3.4 and 4.4, which indicate that the Specific resource is non-resolvable.
- change the text with "typically identified by a URN" in 3.4 and 4.4. It's maybe my poor English, but to me this text reads like you're trying to state a law which should be generally true.

In fact it may be a good idea, from an editorial perspective, to factor out the discussion (on resolvable vs. non-resolvable bodies and targets) from the definitions of the constructs from the OA vocabulary for specific bodies. Maybe in a new section 5.6 on "use of resolvable and non-resolvable identifiers for bodies and targets"?

Antoine



> Hi Nick, and all,
>
> There are two possibilities, listed below, for annotating parts of
> resources.  We decided on the more expressive but more verbose second
> option for a variety of reasons, that I'll try to unpack a little bit
> from the specification.
>
> Option (1) Fragment URI as target
> _:anno a oa:Annotation ;
>    oa:hasTarget <http://www.example.com/example.ogv#t=10,20> ;
>    oa:hasBody <http://www.example.org/comment1> .
>
>
> Option (2) Identity as target, description as a Selector
> _:anno a oa:Annotation ;
>    oa:hasTarget <SpTarget1> ;
>    oa:hasBody <http://www.example.org/comment1> .
>
> <SpTarget1> a oa:SpecificResource ;
>    oa:hasSelector <FragSel1> ;
>    oa:hasSource <http://www.example.com/example.ogv> .
>
> <FragSel1> a oa:FragmentSelector ;
>    rdf:value "t=10,20" .
>
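The relationship between the two options can be sketched in Python. This is only an illustration under stated assumptions: triples are modelled as plain (subject, predicate, object) tuples with the prefixed names used in the mail, and the function name split_fragment_target and its default identifiers are hypothetical, not part of the OA specification.

```python
from urllib.parse import urldefrag


def split_fragment_target(fragment_uri, target_id="SpTarget1", selector_id="FragSel1"):
    """Turn an Option (1) fragment URI into Option (2) style triples.

    The identifiers default to the ones used in the mail; any real
    application would mint its own.
    """
    # urldefrag separates "http://host/path#frag" into the URI and the fragment.
    source, fragment = urldefrag(fragment_uri)
    if not fragment:
        raise ValueError("URI has no fragment: " + fragment_uri)
    return [
        (target_id, "rdf:type", "oa:SpecificResource"),
        (target_id, "oa:hasSelector", selector_id),
        (target_id, "oa:hasSource", source),
        (selector_id, "rdf:type", "oa:FragmentSelector"),
        (selector_id, "rdf:value", fragment),
    ]


triples = split_fragment_target("http://www.example.com/example.ogv#t=10,20")
for triple in triples:
    print(triple)
```

The source URI and the fragment end up in separate triples, which is the split that the reasoning below relies on.
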
> -------
>
> To try and decompress some of the reasoning in the specification:
> ( http://www.openannotation.org/spec/core/#SelectorFragment )
>
> * You can't search for http://www.example.com/example.ogv directly in
> the first model.  Remember that URIs, for the purposes of SPARQL etc.,
> are opaque, non-decomposable strings.  Regardless of whether that
> string may include human-readable semantics or not, you can't discover
> annotations of form (1) by matching on the URI of the full resource,
> whereas you can discover those of form (2) by querying the object of
> oa:hasSource.
>
> * While Style specifiers are going away, form (1) is not compatible
> with States.  If you need to refer to example.ogv at a particular
> point in time then you need a State which cannot be attached to the
> Fragment URI, as that would break the global scope of statements in
> RDF.  In other words, you would be saying that for all uses of that
> time range within the video, it was always based on the video resource
> as it was at a particular point in time (say at 2011-05-20), which
> would prevent other annotations from having a different point in time
> for that same time segment within the video.
>
> * URIs provide identity. A URI with a fragment provides both identity
> for the segment and a description of how to resolve that segment
> given a (particular) representation of the resource. This has several
> issues:
> (a) There may be many ways to describe the same segment, used by
> different communities.
> (b) IETF Fragment specifications are tied to a specific MIME type.
> Your example of plain text fragments works only for text/plain
> resources, and no other. Thus the identity of the segment is tied to a
> specific representation in a specific format.
> (c) As Jeni Tennison points out, the same fragment can identify
> different things within the same resource.  She gives the example of
> an SVG document with embedded RDF in Appendix B. (
> http://www.w3.org/TR/2012/WD-fragid-best-practices-20120726/ )
> So we think it's safer to create a new node in the graph to provide
> identity, and a separate node to provide the description.  This
> safety, expressiveness and consistency comes at the expense of some
> extra bytes, but that's RDF for you.
>
> * Fragment URIs are not expressive enough to cover the use cases that
> drive the Open Annotation specification.  Non rectangular sections of
> images are very important to be able to identify and describe,
> including simple circles as well as arbitrary paths.  The worst case
> is annotating a diagonal road in a map from top left to bottom right
> of the image, where a rectangular box would encompass the whole
> image's content. Thus we need some other way to implement this, which
> results in form (2)
>
> * Fragment URIs do not cover all media types, nor could they possibly
> hope to.  If you wanted to annotate a selection of text within a MS
> Word document, you would need Microsoft to register a fragment
> description for .doc and .docx.  Given that this is a useful sort of
> thing to do, and it can't be done with fragment URIs, we need a
> selector concept as in form (2)
>
> * They are not extensible, and especially Media Fragment URIs which we
> lobbied hard for before the specification was finalized, but to no
> avail.  As soon as anyone needed something slightly richer or more
> expressive ... like a circular area rather than rectangular ... then
> we would be back in the situation where we needed a selector again.
> If they were extensible, this would have helped.
>
> * Thus, there is a need for a Selector that describes the segment of a
> resource separately from its identity.  Given that this is required,
> we felt it most consistent to always use a Selector, but to import the
> fragment description semantics into it.  This solves all of the issues
> above at the expense of being somewhat more verbose.
>   - You can always query oa:hasSource to find the URI of the target
> resource, without any segment information
>   - You, or a third party, can always attach a State to give the time
> for the representation.  The failure to consider the dynamic nature of
> web resources has been the downfall of many annotation systems in the
> past, and we fully intend to learn from their mistakes.
>   - Multiple descriptions are possible, and can work across mime types.
>   There is no confusion about what the Specific Resource identifies as
> it does not also try to describe it.
>   - We can be as expressive as we like using a Selector, and remain
> consistent with a single model
>   - We can have selectors for new and old media types, without the
> blessing of the IANA/IETF registries
>   - Selectors are infinitely extensible
>   - There is a single model, not two possibilities that everyone would
> need to implement both of or risk splitting their user base
>   - We import the semantics of the fragment definitions, so are not
> re-inventing those.  We simply split the fragment away from the URI of
> the full resource to gain the benefits of the above.
>   - And finally we tried both ways and the consensus of the group was
> that the single selector model was the better approach
>
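The discoverability point (querying oa:hasSource) can be illustrated with a small Python sketch. It assumes triples are plain tuples and that lookups compare URIs by exact string equality, as a basic triple pattern would. The annotation identifiers and the helper subjects_with are hypothetical names introduced only for this example.

```python
VIDEO = "http://www.example.com/example.ogv"

# Form (1): the fragment URI itself is the target.
form1 = [
    ("_:anno1", "oa:hasTarget", VIDEO + "#t=10,20"),
    ("_:anno1", "oa:hasBody", "http://www.example.org/comment1"),
]

# Form (2): a Specific Resource points at the source and at a selector.
form2 = [
    ("_:anno2", "oa:hasTarget", "SpTarget1"),
    ("_:anno2", "oa:hasBody", "http://www.example.org/comment1"),
    ("SpTarget1", "oa:hasSelector", "FragSel1"),
    ("SpTarget1", "oa:hasSource", VIDEO),
    ("FragSel1", "rdf:value", "t=10,20"),
]


def subjects_with(triples, predicate, obj):
    """Return subjects of triples matching (predicate, obj) exactly."""
    return [s for s, p, o in triples if p == predicate and o == obj]


# Exact match on the video URI finds nothing in form (1), because the
# stored target is a different (opaque) string that includes the fragment.
found_form1 = subjects_with(form1, "oa:hasTarget", VIDEO)

# In form (2), oa:hasSource carries the bare video URI, so two exact
# matches lead from the video to the annotation.
specific = subjects_with(form2, "oa:hasSource", VIDEO)
found_form2 = [a for sp in specific for a in subjects_with(form2, "oa:hasTarget", sp)]

print(found_form1, found_form2)
```

A consumer could still find form (1) annotations with string functions over the target URI, but that requires treating the URI as decomposable rather than opaque.
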
> After all of that, if you're still not convinced, then as Paolo says
> it's only a recommendation to use this approach.  If you feel that
> some additional bytes in an already extremely verbose format is too
> high a cost for interoperability, expressiveness, consistency and the
> understandability of your annotations, then it is not forbidden to
> annotate a fragment URI directly. I hope, of course, that you're
> convinced otherwise by the arguments above :)
>
> Rob
>
> On Thu, Oct 4, 2012 at 7:12 AM, Paolo Ciccarese
> <paolo.ciccarese@gmail.com>  wrote:
>> Hi Nick,
>> the idea is that you can use fragment URIs, but not directly as they
>> are.
>>
>> Given the current structure of the OA model we *recommend* to split source
>> and fragment for the reasons that are listed in the specs. In other words,
>> if you use a fragment URI directly that might work for you but we wanted to
>> make clear that, within this model, that would create problems in using
>> other features, querying, sharing and recording additional provenance info.
>>
>> As per http://www.ietf.org/rfc/rfc2119.txt, the adjective "RECOMMENDED" means
>> that there may exist valid reasons in particular circumstances to ignore a
>> particular item, but the full implications must be understood and carefully
>> weighed before choosing a different course.
>>
>> Therefore if you have
>> http://www.example.com/example.ogv#t=10,20
>> we recommend breaking it down:
>>
>> <SpTarget1> a oa:SpecificResource ;
>>      oa:hasSelector <Selector1> ;
>>      oa:hasSource <http://www.example.com/example.ogv> .
>>
>> <Selector1> a oa:FragmentSelector ;
>>      rdf:value "t=10,20" .
>>
>> The Fragment URI may then be reconstructed by concatenating the oa:hasSource
>> resource's URI, a '#', and the value of the Fragment Selector. As the OA
>> model is a format for exchange, the application consuming the annotation data
>> is supposed to perform this operation when necessary.
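
The reconstruction Paolo describes could look roughly like this on the consuming side. It is a minimal sketch, assuming the triples have already been parsed into plain tuples. The helpers object_of and rebuild_fragment_uri are hypothetical names and not part of any OA library.

```python
# The triples from the example above, as (subject, predicate, object) tuples.
TRIPLES = [
    ("SpTarget1", "rdf:type", "oa:SpecificResource"),
    ("SpTarget1", "oa:hasSelector", "Selector1"),
    ("SpTarget1", "oa:hasSource", "http://www.example.com/example.ogv"),
    ("Selector1", "rdf:type", "oa:FragmentSelector"),
    ("Selector1", "rdf:value", "t=10,20"),
]


def object_of(triples, subject, predicate):
    """Return the first object found for (subject, predicate)."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    raise KeyError((subject, predicate))


def rebuild_fragment_uri(triples, target_id):
    """Concatenate the source URI, '#', and the Fragment Selector value."""
    source = object_of(triples, target_id, "oa:hasSource")
    selector = object_of(triples, target_id, "oa:hasSelector")
    fragment = object_of(triples, selector, "rdf:value")
    return source + "#" + fragment


fragment_uri = rebuild_fragment_uri(TRIPLES, "SpTarget1")
print(fragment_uri)
```

The sketch assumes the selector is an oa:FragmentSelector. Other selector types would not necessarily map back to a fragment URI.
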
>>
>> Of course, I agree the above set of triples does not look as compact as
>> http://www.example.com/example.ogv#t=10,20 is. But in general terms, we know
>> that approach causes side effects.
>>
>> Hope this helps,
>> Paolo
>>
>>
>>
>>
>> On Thu, Oct 4, 2012 at 7:00 AM, Nick White <nick.white@durham.ac.uk> wrote:
>>>
>>> Hi,
>>>
>>> I am very interested in the work OpenAnnotation is doing. It looks
>>> like it could be very useful indeed.
>>>
>>> In reading the spec, section 5.2.1 "Fragment Selector"
>>> <http://www.openannotation.org/spec/core/#SelectorFragment>, it
>>> recommends against using fragment URIs to identify segments.
>>>
>>> I don't really understand the rationale for this. The language
>>> used in the spec is not easy for me to follow. Please could somebody
>>> clarify the reasons for me?
>>>
>>> It seems to me (in my ignorance, no doubt) that standard URI
>>> fragment selectors are an obvious and good choice. I was planning to
>>> use RFC5147 to refer to sections of text, which is a nice, simple
>>> way of doing so. It's basic, but fine for my needs, and being human-
>>> readable and easily usable in other contexts has its advantages.
>>>
>>> Thanks for any guidance, and I look forward to exploring
>>> OpenAnnotation more.
>>>
>>> Nick White
>>>
>>>
>>
>>
>>
>> --
>> Dr. Paolo Ciccarese
>> http://www.paolociccarese.info/
>> Biomedical Informatics Research & Development
>> Instructor of Neurology at Harvard Medical School
>> Assistant in Neuroscience at Mass General Hospital
>> +1-857-366-1524 (mobile)   +1-617-768-8744 (office)
>>
>> CONFIDENTIALITY NOTICE: This message is intended only for the addressee(s),
>> may contain information that is considered
>> to be sensitive or confidential and may not be forwarded or disclosed to any
>> other party without the permission of the sender.
>> If you have received this message in error, please notify the sender
>> immediately.
>>
>>
>
Received on Friday, 5 October 2012 14:37:02 GMT
