Re: Streamlining the OA Model from Robert Sanderson on 2012-08-01 (public-openannotation@w3.org from August 2012)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Wed, 1 Aug 2012 13:16:10 -0600
To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Cc: Paolo Ciccarese <paolo.ciccarese@gmail.com>, public-openannotation <public-openannotation@w3.org>
Message-ID: <CABevsUHV2nJN7yiB1Kap4cbJt4eri-7gG8uPTmV-Zpdb7mwXGA@mail.gmail.com>
To be clear, you know that this breaks the IETF restrictions on URI
fragments, right?
You can't just invent new fragment schemes for existing mime-types,
they MUST be specified in the mime type registration document.

This is only one reason we didn't go this way, but it's certainly a big one!
Others include that URIs should be opaque, the opportunity (if not
likelihood) for collision, the query-ability and so forth, as per the
media fragments discussion.
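
To make the registration point concrete, here is a minimal sketch, not anything from a spec or from our tooling. The table `REGISTERED_SCHEMES` and the function `fragment_is_registered` are hypothetical names, and the table is filled in by hand. It lists only what I am sure of: RFC 5147 defines the `char=` and `line=` schemes for text/plain. An invented `offset_` scheme has no such backing, and for HTML the same fragment could just as well be an ordinary element id, which is where the collision risk comes from.

```python
from urllib.parse import urldefrag

# Hypothetical, hand-written table for illustration only: the fragment
# schemes that a media type's registration actually defines.
# RFC 5147 defines "char=" and "line=" for text/plain.
REGISTERED_SCHEMES = {
    "text/plain": ("char=", "line="),
}


def fragment_is_registered(uri: str, media_type: str) -> bool:
    """Return True if the URI's fragment starts with a scheme listed for media_type."""
    fragment = urldefrag(uri).fragment
    # str.startswith accepts a tuple; an empty tuple never matches.
    return fragment.startswith(REGISTERED_SCHEMES.get(media_type, ()))


if __name__ == "__main__":
    print(fragment_is_registered("http://example.org/doc.txt#char=717,729", "text/plain"))
    print(fragment_is_registered("http://example.org/doc.txt#offset_717_729", "text/plain"))
```

The example.org URIs are placeholders. A real check would have to consult the media type registrations themselves rather than a local table.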

Rob

On Wed, Aug 1, 2012 at 9:31 AM, Sebastian Hellmann
<hellmann@informatik.uni-leipzig.de> wrote:
> Hello Paolo,
> let's separate the issues.
> Issue a) things you can represent with fragment selectors (expressivity)
> Issue b) syntax
>
> a) I am well aware of your use case. Do you have a benchmark that I could
> use for experiments? If you look at
> http://svn.aksw.org/papers/2012/NIF/EKAW_short_paper/public.pdf page 6, then
> you can see that NIF hash URIs are designed for robustness and to withstand
> changes made to Wikipedia. I am collecting a larger corpus currently, also
> including HTML. Do you have data sets or pages, which I could use?
> b) has nothing to do with a). The truth is, however, that current fragment IDs
> are not designed to suit many use cases, but this is a shortcoming of issue
> a) expressivity and not the fault of the syntax. Let's say you could encode
> all information of OA selectors into fragment-id syntax. What is the harm
> done?
>
> I would really like to have a look into this. Is there a list with available
> selectors?  I found:
> http://code.google.com/p/annotation-ontology/wiki/v2Selectors
> But it lists only 4 classes of selectors, and none of them is very powerful.
> Sebastian
>
> Am 01.08.2012 17:12, schrieb Paolo Ciccarese:
>>
>> Dear Sebastian,
>> I produce annotation on webpages that I cannot control and I work with the
>> DOM. I mainly annotate scientific content with
>> http://annotationframework.org
>>
>> One example of why the counting and XPointer might not work is the fact
>> that pages include sections like advertisements and news which change
>> often. There are even simpler examples, like the document displaying
>> today's date somewhere. These modifications can break selection
>> and counting, and that is why, three years ago, I started using different
>> mechanisms that are less affected by (though unfortunately not immune to)
>> the common changes in pages. At about the same time, the need emerged in
>> the OAC community as well.
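
The fragility of counting described above can be sketched as follows. This is only an illustration under stated assumptions: `select_by_offset` and `select_by_quote` are hypothetical helpers, the sample strings are made up, and the quote-plus-context lookup stands in for the general idea of a more change-tolerant mechanism. It is not the definition of any OA selector.

```python
from typing import Optional, Tuple


def select_by_offset(text: str, begin: int, end: int) -> str:
    """Select purely by character positions, as an offset fragment would."""
    return text[begin:end]


def select_by_quote(
    text: str, exact: str, prefix: str = "", suffix: str = ""
) -> Optional[Tuple[int, int]]:
    """Find the exact quote with its surrounding context.

    Returns the (begin, end) positions of the quote, or None if not found.
    """
    at = text.find(prefix + exact + suffix)
    if at < 0:
        return None
    begin = at + len(prefix)
    return begin, begin + len(exact)


if __name__ == "__main__":
    original = "Results: the protein binds DNA."
    # The page now shows today's date in front of the same content.
    changed = "1 August 2012. " + original

    print(select_by_offset(original, 13, 20))  # the annotated word
    print(select_by_offset(changed, 13, 20))   # same offsets, different text

    span = select_by_quote(changed, "protein", prefix="the ")
    if span is not None:
        print(changed[span[0]:span[1]])
```

The quote-based lookup still fails if the quoted text itself is edited or appears more than once with the same context, so it is less affected by page changes but not immune to them.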
>>
>> In general, Selectors also make sense considering the need for annotating
>> media types other than HTML. For instance, Media Fragments fall short in
>> many of the already implemented use cases of video annotation tools.
>>
>> Hope this helps,
>> Paolo
>>
>>
>> On Wed, Aug 1, 2012 at 2:43 AM, Sebastian Hellmann <
>> hellmann@informatik.uni-leipzig.de> wrote:
>>
>>> Dear Paolo,
>>> Why wouldn't this work well? It is based on RFC5147. Offset works for any
>>> string and therefore also HTML source. Problems arise when you interpret
>>> strings. They do not work well for DOM, of course, but this is where one
>>> would rather use XPointer (W3C). I guess it also wouldn't work well to
>>> use an OA text selector on an image, right?
>>> With fragments, you definitely gain:
>>> - compatibility with the web (which also means free implementations)
>>> - fewer triples
>>> - fewer generated UUIDs (if any at all)
>>>
>>> What do you gain when using selectors? I am not interested in
>>> theoretical/modelling issues. For me, the only things that count are
>>> those that help you succeed in a use case.
>>> Building a parser for URIs is something very easy to implement, much
>>> easier in fact than understanding and working with selectors.
>>> Sebastian
>>>
>>>
>>> Am 31.07.2012 19:51, schrieb Paolo Ciccarese:
>>>
>>>> Is the mechanism
>>>> http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729
>>>> really working in general?
>>>>
>>>> In my experience it does not with HTML pages in general. That would mean
>>>> having lots of ways of composing the URIs that would then need to be
>>>> parsed. That is why we designed more complex selection mechanisms
>>>> (http://www.openannotation.org/spec/core/#Selector)... and therefore more
>>>> triples.
>>>>
>>>> Paolo
>>>>
>>>
>>>
>>>
>>> --
>>> Dipl. Inf. Sebastian Hellmann
>>> Department of Computer Science, University of Leipzig
>>> Events:
>>>    * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>>>    * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>>> Projects: http://nlp2rdf.org , http://dbpedia.org
>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>> Research Group: http://aksw.org
>>>
>>>
>>
>
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Events:
>   * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>   * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
Received on Wednesday, 1 August 2012 19:16:39 UTC