Re: Streamlining the OA Model from Paolo Ciccarese on 2012-08-01 (public-openannotation@w3.org from August 2012)

From: Paolo Ciccarese <paolo.ciccarese@gmail.com>
Date: Wed, 1 Aug 2012 13:04:40 -0400
To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Cc: public-openannotation <public-openannotation@w3.org>, Robert Sanderson <azaroth42@gmail.com>
Message-ID: <CAFPX2kD7zJRQLbVaV0rx6DC1ZDN1rDp--h=0VMwysi499r_bfg@mail.gmail.com>
Thank you, I'll put the PDF in my high-priority reading pipeline.
Initially I thought of going in a similar direction, but I could not
come up with a nice scheme that could scale to everything we needed.

Just out of curiosity: in the case of a selection of multiple non-contiguous
(and non-identical) text spans, do you create a more complicated URI, or is
the idea to use multiple URIs together with some other mechanism? Is it
explained in the paper already?

The documentation of Annotation Ontology is outdated and I doubt I will
ever update it at this point. I have neither benchmarks nor organized content
that I can share for this specific task; the selectors I use in Domeo are
the result of trial and error, no big theory. Does anybody in this
group have something to share related to text selection benchmarking?

Currently, for selectors, we will try to keep this section up to date,
according to whatever the consensus is:
http://www.openannotation.org/spec/extension/#Selector

Best,
Paolo

On Wed, Aug 1, 2012 at 11:31 AM, Sebastian Hellmann <
hellmann@informatik.uni-leipzig.de> wrote:

> Hello Paolo,
> let's separate the issues.
> Issue a) things you can represent with fragment selectors (expressivity)
> Issue b) syntax
>
> a) I am well aware of your use case. Do you have a benchmark that I could
> use for experiments? If you look at page 6 of
> http://svn.aksw.org/papers/2012/NIF/EKAW_short_paper/public.pdf
> you can see that NIF hash URIs are designed for robustness and to
> withstand changes made to Wikipedia. I am currently collecting a larger
> corpus, also including HTML. Do you have data sets or pages that I
> could use?
> b) has nothing to do with a). The truth is, however, that current fragment
> IDs are not designed to suit many use cases, but this is a shortcoming of
> issue a) expressivity and not the fault of the syntax. Let's say you could
> encode all the information of OA selectors into fragment-ID syntax. What is
> the harm done?
>
> I would really like to have a look into this. Is there a list of
> available selectors? I found:
> http://code.google.com/p/annotation-ontology/wiki/v2Selectors
> But it only lists 4 classes of selectors, and none of them is very powerful.
> Sebastian
>
> On 01.08.2012 17:12, Paolo Ciccarese wrote:
>
>> Dear Sebastian,
>> I produce annotations on web pages that I cannot control and I work with
>> the DOM. I mainly annotate scientific content with
>> http://annotationframework.org
>>
>> One example of why counting and XPointer might not work is the fact
>> that pages include sections like advertisements and news, which change
>> often. There are even simpler examples, like a document that displays
>> today's date somewhere. These modifications can make selection and
>> counting fail, and that is why, three years ago, I started using different
>> mechanisms that are less affected by (though unfortunately not immune to)
>> the common changes in pages. At about the same time, the need emerged in
>> the OAC community as well.
>>
>> In general, Selectors also make sense considering the need for annotating
>> media types other than HTML. For instance, Media Fragments fall short in
>> many of the already implemented use cases of video annotation tools.
>>
>> Hope this helps,
>> Paolo
>>
>>
>> On Wed, Aug 1, 2012 at 2:43 AM, Sebastian Hellmann <
>> hellmann@informatik.uni-leipzig.de> wrote:
>>
>>> Dear Paolo,
>>> Why wouldn't this work well? It is based on RFC 5147. Offsets work for
>>> any string and therefore also for HTML source. Problems arise when you
>>> interpret strings. They do not work well for the DOM, of course, but this
>>> is where one would rather use XPointer (W3C). I guess it also wouldn't
>>> work well to use an OA text selector on an image, right?
>>> With fragments, you definitely gain:
>>> - compatibility with the web (which also means free implementations)
>>> - fewer triples
>>> - fewer generated UUIDs (if any at all)
>>>
>>> What do you gain when using selectors? I am not interested in
>>> theoretical/modelling issues. For me, the only things that count are
>>> those that help you succeed in a use case.
>>> Building a parser for URIs is very easy to implement, much easier in
>>> fact than understanding and working with selectors.
>>> Sebastian
>>>
>>>
>>> On 31.07.2012 19:51, Paolo Ciccarese wrote:
>>>
>>>> Is the mechanism
>>>> http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729
>>>> really working in general?
>>>>
>>>> In my experience it does not work with HTML pages in general. That would
>>>> mean having lots of ways of composing the URIs, which would then need to
>>>> be parsed. That is why we designed more complex selection mechanisms
>>>> (http://www.openannotation.org/spec/core/#Selector) and therefore more
>>>> triples.
>>>>
>>>> Paolo
>>>>
>>>>
>>>
>>>
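
The offset-style fragment discussed in this thread (for example `#offset_717_729`) can be illustrated with a minimal Python sketch. The `offset_<begin>_<end>` shape is taken from the example URI above. The function names and the zero-based, end-exclusive reading of the offsets are assumptions made for illustration only, not something the thread or the NIF paper is quoted as specifying here.

```python
import re
from urllib.parse import urldefrag

# Assumed fragment shape, taken from the example URI in this thread:
# offset_<begin>_<end>. Anything after the second number is ignored.
_OFFSET_RE = re.compile(r"^offset_(\d+)_(\d+)")


def parse_offset_fragment(uri):
    """Return (document_uri, begin, end) for an offset-style fragment URI."""
    document_uri, fragment = urldefrag(uri)
    match = _OFFSET_RE.match(fragment)
    if match is None:
        raise ValueError(f"not an offset fragment: {fragment!r}")
    begin, end = int(match.group(1)), int(match.group(2))
    if begin > end:
        raise ValueError("begin offset is after end offset")
    return document_uri, begin, end


def select(text, begin, end):
    """Slice text, assuming zero-based, end-exclusive character offsets."""
    if end > len(text):
        raise ValueError("offsets fall outside the text")
    return text[begin:end]


if __name__ == "__main__":
    uri = "http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729"
    print(parse_offset_fragment(uri))

    # Hypothetical page content: a date banner inserted in front of the
    # text shifts every offset, so the same fragment selects other characters.
    original = "Linked Data is about using the Web."
    changed = "1 Aug 2012 | " + original
    print(select(original, 0, 11))
    print(select(changed, 0, 11))
```

The sketch touches both sides of the discussion: parsing the fragment takes only a few lines, while a pure character offset selects different text as soon as content is inserted earlier in the page.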
Received on Wednesday, 1 August 2012 17:05:15 UTC