Re: Use Cases from Lutz Suhrbier on 2014-03-07 (public-openannotation@w3.org from March 2014)

From: Lutz Suhrbier <l.suhrbier@bgbm.org>
Date: Fri, 07 Mar 2014 19:24:01 +0100
To: public-openannotation@w3.org
Message-ID: <531A0EC1.8060106@bgbm.org>
Hi Bob,

I think your idea of defining a generally applicable oa:QuerySelector and 
including it in the Spec covers most use cases in data publication.
A URI identifying the query language, together with the query expression 
itself, should be enough to identify data elements in any kind of data 
resource.
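
To make that concrete, here is a minimal Python sketch of the idea: a
selector is just a pair of (query language URI, query expression), and a
consumer dispatches on the URI. The names QuerySelector, EVALUATORS and
select, as well as the urn:example: language URI, are hypothetical and not
taken from the OA spec or from oad:. The regular-expression evaluator is
only one possible language.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical URI standing in for "the regular-expression query language".
REGEX_LANG = "urn:example:query-language:regex"


@dataclass(frozen=True)
class QuerySelector:
    language: str    # URI identifying the query language
    expression: str  # the query text itself


def _eval_regex(expression: str, source: str) -> List[str]:
    """Return every non-overlapping match of the expression in the source."""
    return [m.group(0) for m in re.finditer(expression, source)]


# Registry mapping a query-language URI to a function that can evaluate it.
EVALUATORS: Dict[str, Callable[[str, str], List[str]]] = {
    REGEX_LANG: _eval_regex,
}


def select(selector: QuerySelector, source: str) -> List[str]:
    """Evaluate the selector against the source, dispatching on its language URI."""
    try:
        evaluator = EVALUATORS[selector.language]
    except KeyError:
        raise ValueError(f"no evaluator registered for {selector.language}") from None
    return evaluator(selector.expression, source)
```

A SPARQL or XPath evaluator would be registered under its own URI in the
same way; the selector itself would not need to change.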

Since this is so easy to address, are there any remaining obstacles to 
adding the "annotation of segments in any queryable data source" to the 
use cases? And, of course, to the next revision of the spec?

best regards
Lutz

> As mentioned in the thread you cite, we introduced an extension
> oad:QuerySelector to oa:Selector that allows the specification of a
> query that can be the object of a property oad:hasQuery.  Sometimes
> this property alone is enough to address your use case, but by
> subclassing oad:QuerySelector, a URI identifies the query
> language, which could be difficult to infer from the query text.
> Conversely, sometimes the query terms can just hang on the Selector
> and don't need formal identification as query text. Arguably, our
> solutions are mainly convenience terms and serve some common cases
> much as do the oa:Motivation common cases. That's possibly the
> direction that would evolve from Paolo's arguments in the thread.
>
> Using those notions, here's an example that probably is easily adapted
> to your case, perhaps by introducing a RegularExpressionQuerySelector.
> This one asserts that anything in the SpecificResource that asserts a
> latitude beyond the poles must be wrong.
>
> :anno a oa:Annotation ;
>      oa:hasTarget :sptarget ;
>      oa:hasBody <http://filteredpush.org/rangeViolation> ;
>      oa:hasBody :invalidLatitudeText .
>
> :sptarget a oa:SpecificResource ;
>      oa:hasSource <urn:lsid:biocol.org:col:15406> ;
>      oa:hasSelector :findInvalidLatitudes .
>
> :findInvalidLatitudes a oad:SparqlQuerySelector ;
>      oad:hasQuery :queryContent .
>
> :queryContent a cnt:ContentAsText ;
>     cnt:chars """SELECT DISTINCT ?x WHERE {
>        ?x a dwcFP:Occurrence .
>        ?x dwc:decimalLatitude ?lat .
>        FILTER (?lat > 90 || ?lat < -90)
>     }""" .
>
> :invalidLatitudeText a cnt:ContentAsText ;
>      cnt:chars  "The value of latitude is out of range and invalid" .
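
For readers less familiar with SPARQL, the FILTER in the query above
amounts to the range check sketched below in Python. This is only an
illustration: the record layout (dicts with "id" and "decimalLatitude"
keys) and the function name are assumptions, not part of the FilteredPush
or Darwin Core vocabularies.

```python
from typing import Dict, Iterable, List


def find_invalid_latitudes(records: Iterable[Dict[str, object]]) -> List[str]:
    """Return the ids of records whose latitude lies beyond the poles."""
    invalid = []
    for record in records:
        lat = record.get("decimalLatitude")
        # Records without a numeric latitude are skipped, much as the SPARQL
        # pattern would not bind ?lat for them.
        if isinstance(lat, (int, float)) and (lat > 90 or lat < -90):
            invalid.append(str(record["id"]))
    return invalid
```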
>
> An example a little closer in detail to your use case can be found at
> http://bit.ly/1jIJiOW , where the annotator has proposed a fix to a
> specific set of resources that are given by Key-Value pairs in the
> domain vocabulary. It avoids the oad:hasQuery predicate.  To that
> extent, it needn't subclass the Selector to oad:QuerySelector, and I'm
> not sure whether that is good or bad.
>
> Bob
>
> On Sat, Mar 1, 2014 at 1:04 PM, Tim Cole <t-cole3@illinois.edu> wrote:
>> So here's another idea for consideration for the Note on Annotation Use
>> Cases. It stems from discussions about methods we might use to curate
>> retrospectively digitized texts such as are found in the HathiTrust.
>> Reactions?
>>
>>
>>
>> Repeated Segment Annotations
>>
>>
>>
>> A user wishes to target for annotation segments of a resource that appear
>> more than once in that resource, termed here for convenience 'repeated
>> segments' (e.g., a string, node or name that appears multiple times in a
>> single digitized book). For the purposes of what's being expressed in the
>> annotation, the user does not need to know (has not determined a priori) the
>> exact number of times the repeated segment appears in the resource; the
>> interpretation of the annotation is understood to be independent of the
>> number of instances of the repeated segment in the resource. This use case
>> is defined to address situations where the body of the annotation relates in
>> the same way to each repeated segment instance. Similarly for a body
>> comprised of repeated segments.
>>
>>
>>
>> Examples
>>
>>
>>
>> ·         A copy editor creates an annotation proposing a correction to
>> replace all instances of the string "pleaf'd" with the string "pleas'd".
>> Essentially the annotation is proposing a search and replace operation of
>> the sort ubiquitous in modern word processing systems.
>>
>> ·         A manufacturer wishes to annotate all products of a certain class
>> in his products database with a note that these items will go on sale next
>> week for 15% off for 2 weeks only.
>>
>> ·         A publisher wishes to associate an annotation containing an
>> updated email address with all author nodes having the value "Jane A. Smith"
>> that appear in last year's journal volume.
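
The copy-editor example can be sketched in a few lines of Python. The point
is that the proposal names the segment and its replacement without
enumerating the instances; the count only becomes known when the proposal
is applied. ReplaceAllProposal and apply_proposal are hypothetical names
for illustration, not OA terms.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReplaceAllProposal:
    old: str  # the repeated segment being targeted
    new: str  # the proposed correction


def apply_proposal(text: str, proposal: ReplaceAllProposal) -> Tuple[str, int]:
    """Apply the correction to every instance; return the new text and the instance count."""
    count = text.count(proposal.old)
    return text.replace(proposal.old, proposal.new), count
```

A count of zero corresponds to the case where the annotation no longer has
anything to apply to.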
>>
>>
>>
>> Notes
>>
>>
>>
>> ·         In the absence of an oa:State triple, the annotation would be
>> assumed persistent even if some instances of the repeated segment are
>> removed from the resource or  if additional instances of the repeated
>> segment are added to the resource. The annotation is rendered inoperative
>> (nonsense) only if all instances of the repeated segment are removed.
>>
>> ·         While challenging to address, this use case should be addressed
>> since it can happen inadvertently as well as intentionally. Certain classes
>> of selectors will be prone to describing/identifying segments that occur
>> multiple times in a resource.  For example, in a lengthy text there is some
>> small but finite chance that there will be more than one match for the same
>> oa:exact, oa:prefix, and oa:suffix combination (e.g., the constituents of an
>> oa:TextQuoteSelector).  As has already come up, we can anticipate that
>> communities will want to begin using other kinds of selectors, e.g., CSS,
>> XPath/XPointer, SQL-based, SPARQL-based, etc. that have an even greater
>> probability of describing and identifying repeated segments in a lengthy
>> resource.
>>
>> ·         A further extension of this use case (or perhaps the slippery
>> slope reason not to allow) might be its potential use with multiplicity
>> constructs.
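
The point about oa:exact, oa:prefix and oa:suffix matching more than once
is easy to demonstrate. The sketch below is a naive matcher written for
this illustration only (match_text_quote is a hypothetical name, and real
implementations may normalise whitespace or match fuzzily); it returns the
start offset of every place where the exact text occurs with the given
context.

```python
from typing import List


def match_text_quote(text: str, exact: str, prefix: str = "", suffix: str = "") -> List[int]:
    """Return the start offset of each occurrence of exact that is
    immediately preceded by prefix and immediately followed by suffix."""
    needle = prefix + exact + suffix
    if not needle:
        return []
    offsets = []
    start = text.find(needle)
    while start != -1:
        # Report where the exact text begins, not where the prefix begins.
        offsets.append(start + len(prefix))
        start = text.find(needle, start + 1)
    return offsets
```

With the text "the cat sat; the cat sat again", exact "cat", prefix "the "
and suffix " sat", this returns two offsets, so the selector identifies a
repeated segment whether or not the annotator intended it. Lengthening the
suffix to " sat again" narrows it to one.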
>>
>>
>>
>> I suspect that the 'data selector' discussion that Bob, Paolo and others
>> have raised previously in other threads (most recently in
>> http://lists.w3.org/Archives/Public/public-openannotation/2014Feb/0011.html)
>> is relevant here in terms of how the current OA model might be applied or if
>> necessary extended to implement this use case. Though this suggestion was
>> stimulated by the copy edit example (which has come up most recently in the
>> context of the HathiTrust Research Center), there is perhaps in fact a lot
>> of overlap with various data query use cases.
>>
>>
>>
>> A relevant question (I think) is whether (in the context of RDF and OA) we
>> can unambiguously give identity as a single Resource (e.g., as an extension
>> of the oa:SpecificResource class)  to what is essentially a not yet
>> enumerated ad hoc aggregation of oa:SpecificResources?  Perhaps there's a
>> bit of a Schrödinger's Cat issue lurking here.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Tim Cole
>>
>> University of Illinois at UC
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> From: Robert Sanderson [mailto:azaroth42@gmail.com]
>> Sent: Monday, February 24, 2014 4:52 PM
>> To: public-openannotation
>> Subject: Use Cases
>>
>>
>>
>>
>>
>> Dear all,
>>
>>
>>
>> The W3C Digital Publishing Interest Group is going to publish a working
>> draft of a Note on Annotation use cases in the near future.  I have put a
>> pre-working draft (whatever that means :) ) of the text up at:
>>
>>
>>
>>    http://www.openannotation.org/usecases.html
>>
>>
>>
>> Any comments, corrections, additions, etc are very welcome!
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Rob
>>
>>
>>
>> P.S. Bob, unfortunately data annotation directly isn't in scope of the IG
>> work, but I've included it under the embedded resource use case to try and
>> promote the discussion.
>
>
Received on Friday, 7 March 2014 18:24:30 UTC