Re: Using oa:SpecificResource with oa:hasTarget from James Smith on 2012-07-06 (public-openannotation@w3.org from July 2012)

From: James Smith <jgsmith@gmail.com>
Date: Fri, 6 Jul 2012 09:32:45 -0400
To: <t-cole3@illinois.edu>
Cc: <public-openannotation@w3.org>
Message-Id: <D34E8F08-5E1F-4814-96D8-779B8CB0D0B4@gmail.com>
Here's my take on this based on my reading of the OA background and what I gather are the guiding principles of the OA data model. So this is as much about me finding out what I'm not understanding as it is me trying to reason through to a solution.

If we start with the premise that OAC relies on (or prefers) linked data methodologies, then I would look at the problem in terms of resources and things we say about those resources. OAC does for annotation what REST does for CRUD: it distinguishes between resources and things that operate on those resources. Nouns and verbs (or predicates).

We've had quite a bit of conversation about trying to keep selection out of the target URI. That's why we have fragment identifiers separate from the target URI. We've tried to determine a way to have a canonical URI that lets us find all annotations that might apply to a particular resource.

The problem is that djatoka seems to fight canonical URIs at every turn. There's a verb lurking in the djatoka URL (the svc_id=...getRegion is part of it), breaking any chance at it being REST or linked data. Don't get me started on the rest of the mess that makes up the URL. This doesn't mean that djatoka can't work with OAC, but we have to be careful about how we have the two work together.

Target URIs in OA are opaque strings. I should be able to take the target URI, perform a GET operation (or the equivalent for the protocol specified in the URI), and have in hand a representation of the resource appropriate for the annotation. I may need some hints if content negotiation is involved, as in a pure REST framework where the representation isn't part of the resource name: do I want the image/jpeg or the application/json representation?
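
To make the "hint" concrete, here is a minimal Python sketch of what I mean by retrieving a target with content negotiation. The example.org URI and the function names are made up for illustration, and nothing here is part of OA itself; it only shows that the client treats the URI as opaque and puts the representation preference in the Accept header.

```python
import urllib.request


def build_request(target_uri: str, media_type: str) -> urllib.request.Request:
    """Prepare a GET for an opaque target URI, asking for one representation."""
    return urllib.request.Request(
        target_uri, headers={"Accept": media_type}, method="GET"
    )


def fetch(target_uri: str, media_type: str):
    """Dereference the target; returns (content type served, body bytes)."""
    with urllib.request.urlopen(build_request(target_uri, media_type)) as resp:
        return resp.headers.get_content_type(), resp.read()


# Hypothetical target URI; the client never looks inside the string.
req = build_request("http://example.org/page-image-9", "image/jpeg")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Only build_request is exercised above; fetch would need a server that actually negotiates on Accept.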

If we use a djatoka URI that sends back part of an image, then the annotation will only target a representation of that part of the image. Two names (URIs) indicate two different resources. Shift out to a different zoom level and the annotation no longer applies. If we shift the requested region, we lose the annotation. To do otherwise requires the client to interpret the URIs beyond what is needed to retrieve the named resource.
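
A toy Python sketch of the problem, assuming a hypothetical annotation store that is keyed by the exact target URI string (the example.org URLs and the stored note are invented):

```python
# Hypothetical store: annotations indexed by the exact target URI string.
BASE = "http://example.org/resolver?rft_id=http://example.org/page9.jp2"

annotations_by_target = {
    BASE + "&svc.region=890,700,200,150": ["note on the emblem"],
}


def annotations_for(uri: str) -> list:
    """Look up annotations without interpreting the URI in any way."""
    return annotations_by_target.get(uri, [])


same_region = BASE + "&svc.region=890,700,200,150"
shifted_region = BASE + "&svc.region=900,700,200,150"

print(annotations_for(same_region))     # found: identical string
print(annotations_for(shifted_region))  # empty: a different name, so a different resource
```

The only way to make the second lookup succeed is to teach the client what svc.region means, which is exactly the interpretation of URIs I would like to avoid requiring.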

If we want to attach the annotation to the image and not just a region of the image (barring any subselection within the image), then the target URI must be the name of the unmodified resource (for djatoka, the resource in the original size and resolution without any of the processing instructions in the URI). Any selection of a region must be done through selecting which part of the target is being annotated. If a client is smart enough to recognize the URI as a djatoka resource (or the djatoka service is fronted by a REST service that provides the information), then the client can retrieve the subregion of the image that contains the area targeted by the annotation.

So I would construct the target by starting with the most general resource to which the annotation could apply and then scoping down until we have the part of the resource the annotation is targeting. If we want to target a rectangular region of an image, then the target URI would address the full image and a fragment selector could use the 'xywh=...' form to select the area within that image.
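
As a rough Python sketch of that scoping, here is one way a tool might split a djatoka getRegion URL into the full-image URI and an 'xywh=...' fragment value. The function name and the region_order parameter are my own inventions. By default the four svc.region numbers are copied through unchanged, as in Tim's example; whether djatoka really orders them that way is an assumption to check against the djatoka documentation, which is why the order is a parameter.

```python
from urllib.parse import urlsplit, parse_qs


def split_djatoka_target(djatoka_url: str, region_order: str = "xywh"):
    """Return (source URI, fragment value) for a djatoka getRegion URL.

    region_order names the four comma-separated svc.region numbers.
    The default copies them through unchanged (an assumption).
    """
    query = parse_qs(urlsplit(djatoka_url).query)
    source = query["rft_id"][0]          # the unmodified image
    region = query.get("svc.region")
    if region is None:
        return source, None              # whole image, no selector needed
    values = dict(zip(region_order, region[0].split(",")))
    fragment = "xywh=pixel:{x},{y},{w},{h}".format(**values)
    return source, fragment


url = (
    "http://djatoka.grainger.illinois.edu/adore-djatoka/resolver"
    "?url_ver=Z39.88-2004"
    "&rft_id=http://emblemimages.grainger.illinois.edu/duodeksemblema00saub"
    "/JP2Processed/duodeksemblema00saub_0009.jp2"
    "&svc_id=info:lanl-repo/svc/getRegion"
    "&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000"
    "&svc.format=image/jp2"
    "&svc.region=890,700,200,150"
)
source, fragment = split_djatoka_target(url)
print(source)
print(fragment)
```

The source would go in oa:hasSource and the fragment in the rdf:value of an oa:FragmentSelector. The sketch ignores any zoom level in the URL, which would also change what the pixel numbers mean.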

It is up to the client to apply the fragment selector to the resource to figure out what to display. OA isn't about specifying the rendering, but the relationships that can be used to create the rendering (similar to the difference between a procedural and a declarative programming language). It might be useful to flag a target image as being served by a djatoka server (something we would find useful in the Shelley-Godwin Archive), but I would leave the use of the djatoka RPC-style URIs to the client.
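
For the client side, a minimal Python sketch of going the other way: given the source URI and the fragment selector value from the annotation, build the RPC-style request. The parameter names are copied from the URL in Tim's message. The function name is hypothetical, the region numbers are passed through in x,y,w,h order as in that example (an assumption about djatoka's ordering), and the values are percent-encoded by urlencode, unlike the hand-written URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def djatoka_region_url(resolver: str, source: str, fragment: str,
                       fmt: str = "image/jpeg") -> str:
    """Build a getRegion URL for the area named by an 'xywh=...' fragment."""
    name, _, value = fragment.partition("=")
    if name != "xywh":
        raise ValueError("not a spatial media fragment: " + fragment)
    unit, sep, coords = value.partition(":")
    if not sep:
        coords = value                   # no unit prefix means pixels
    elif unit != "pixel":
        raise ValueError("only pixel regions are handled in this sketch")
    x, y, w, h = coords.split(",")
    params = {
        "url_ver": "Z39.88-2004",
        "rft_id": source,
        "svc_id": "info:lanl-repo/svc/getRegion",
        "svc_val_fmt": "info:ofi/fmt:kev:mtx:jpeg2000",
        "svc.format": fmt,
        "svc.region": ",".join([x, y, w, h]),  # order is an assumption
    }
    return resolver + "?" + urlencode(params)


region_url = djatoka_region_url(
    "http://djatoka.grainger.illinois.edu/adore-djatoka/resolver",
    "http://emblemimages.grainger.illinois.edu/duodeksemblema00saub"
    "/JP2Processed/duodeksemblema00saub_0009.jp2",
    "xywh=pixel:890,700,200,150",
)
print(parse_qs(urlsplit(region_url).query)["svc.region"])
```

A client that knows nothing about djatoka simply skips this step and fetches the source URI directly.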

This is probably the easiest way to get the OA linked data and djatoka RPC styles to work together. It also means that the annotations aren't as dependent on the djatoka implementation. If a client doesn't understand djatoka, they can still get the original (possibly large) image and show the annotation. They just can't retrieve a smaller part of the image or zoom in on it (except by doing all of the work djatoka would do, but on the client side).

My general unease with putting djatoka-specific stuff into OA is that djatoka is a specific implementation of a solution, not a general description.

With respect to specifying that a target is only a particular representation of a resource, would dc:format work? At least, we should have something like dc:format instead of something tied to HTTP. I can see a use for language if a document is available in multiple languages and they are considered variants of the same resource.

-- Jim


On Jul 5, 2012, at 5:40 PM, Tim Cole <t-cole3@illinois.edu> wrote:

> In our experimentation with the Open Annotation data model, some questions about best practices with regard to describing annotation targets have come up. These are (in part at least) about making choices between OA-compliant options for describing a given annotation. Which options are generally best in which circumstances? I'd be interested in comments, suggestions and feedback from others experimenting with the data model.
>  
> Consider the description of an annotation having as its target a region of a Web-accessible image. In this case the annotation target is a segment of page image 9 (counting from cover) from a digitized copy of "Emblematum Sacrorum Quorum..." (http://hdl.handle.net/10111/UIUCOCA:duodeksemblema00saub). Assume that the annotator/annotation tool is making use of our local djatoka service at Illinois to view the JP2 image and determine pixel box of the region of interest.  Ignoring details to do with the annotation body and with part of relationships between page image and the book as a whole, the description of such an annotation might be expressed as:
>  
> <http://example.org/myAnnotation1> a oa:Annotation ;
>         oa:hasBody <...> ;
>         oa:hasTarget <http://djatoka.grainger.illinois.edu/adore-djatoka/resolver?url_ver=Z39.88-2004&rft_id=http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jp2&svc.region=890,700,200,150> .
>  
> 1.       Providing oa:hasSource and oa:hasSelector:
>  
> The lengthy URI of the target is djatoka-service specific and so not all that helpful to non-djatoka applications that might encounter this annotation trolling for annotations of the page image involved. This suggests that at a minimum, it might be more useful to express this annotation target as an oa:SpecificResource having a W3C Media Fragment selector, e.g.:
>  
> <http://example.org/myAnnotation1> a oa:Annotation  ;
>         oa:hasBody <...>  ;
>         oa:hasTarget <urn:uuid:11111...> .
>  
> <urn:uuid:11111...> a oa:SpecificResource;
>         oa:hasSource <http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2> ;
>         oa:hasSelector <urn:uuid:22222...> .
>  
> <urn:uuid:22222...> a oa:FragmentSelector ;
>         rdf:value "xywh=pixel:890,700,200,150" .
>  
> First question – the current draft of OA core spec (section 4.4) says, "The Specific Target is typically identified by a URN, as an HTTP URI would imply that the exact nature of the Specific Target was available to retrieve by dereferencing the HTTP URI." In this case, of course, the region of the image wanted can be retrieved via the djatoka URL given above. This suggests the possibility of using the URL as the identifier of the target, i.e., instead of using <urn:uuid:11111...>. You would then get something like:
>  
> <http://example.org/myAnnotation1> a oa:Annotation  ;
>         oa:hasBody <...>  ;
>         oa:hasTarget <http://djatoka.grainger.illinois.edu/adore-djatoka/resolver?url_ver=Z39.88-2004&rft_id=http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jp2&svc.region=890,700,200,150> .
>  
> <http://djatoka.grainger.illinois.edu/adore-djatoka/resolver?url_ver=Z39.88-2004&rft_id=http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jp2&svc.region=890,700,200,150>  a oa:SpecificResource;
>         oa:hasSource <http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2> ;
>         oa:hasSelector <urn:uuid:22222...> .
>  
> <urn:uuid:22222...> a oa:FragmentSelector ;
>         rdf:value "xywh=pixel:890,700,200,150" .
>  
> But this raises questions. <http://djatoka.grainger.illinois.edu/adore-djatoka...region=890,700,200,150> is an oa:SpecificResource in the context of my annotation, but are that statement and the oa:hasSource and oa:hasSelector statements about the image fragment okay in regard to this resource in other contexts?  I would argue (tentatively) yes, but it gets a little fuzzier if someone reuses this same djatoka URL as the URI of an oa:SpecificResource target in another annotation and attaches an oa:hasStyle triple not appropriate in the context of my annotation. There's also the possibility that some consuming applications might not bother to examine the oa:hasSource and oa:hasSelector statements (okay for some purposes, but not all). Having a readily de-referenceable URI as the object of oa:hasTarget seems convenient, but in the case of an oa:SpecificResource, is it viable even in this very basic example?
>  
> 2.       Adding oa:hasState
>  
> Of course, a user / client application trying to view the annotation given above might prefer to fetch the image fragment not as a fragment of an image/jp2 file, but rather as fragment of an image/jpeg. Djatoka is perfectly able to accommodate:
>  
> <http://example.org/myAnnotation1> a oa:Annotation ;
>         oa:hasBody <...> ;
>         oa:hasTarget <http://djatoka.grainger.illinois.edu/adore-djatoka/resolver?url_ver=Z39.88-2004&rft_id=http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009.jp2&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jpeg&svc.region=890,700,200,150> .
>  
> So, my second question is about best practice for adding oa:hasState property to my oa:SpecificTarget – this is left open (i.e., to be dealt with later) in the current draft of the OA extension namespace. For this example, assume that the page image in question is available from a service that can do content negotiation, e.g.:
>  
> Assume that de-referencing the URI:
>   http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009
>  
> can get you different representations of the page image – e.g., image/jp2, image/jpeg, or image/png – in accord with how you have your Accept: header (Content-Type) set.
>  
> The user or client application might want to know this, or at least might want to know that the annotator was annotating the image/jpeg rather than the image/jp2. How should this State information be expressed? Is specifying the MIME type header on its own reasonable, or when talking about State do we expect implementers to be more specific? In other words is State mostly about oa:when and specifying values for HTTP Accept headers, or something more? Assuming proper binding of ex: to a namespace, consider this variant of the above illustration:
>  
> <http://example.org/myAnnotation1> a oa:Annotation  ;
>         oa:hasBody <...>  ;
>         oa:hasTarget <urn:uuid:11111...> .
>  
> <urn:uuid:11111...> a oa:SpecificResource;
>         oa:hasSource <http://emblemimages.grainger.illinois.edu/duodeksemblema00saub/JP2Processed/duodeksemblema00saub_0009> ;
>         oa:hasSelector <urn:uuid:22222...> ;
>         oa:hasState <urn:uuid:33333...> .
>  
> <urn:uuid:22222...> a oa:FragmentSelector ;
>         rdf:value "xywh=pixel:890,700,200,150" .
>  
> <urn:uuid:33333...> a oa:State ;
>         oa:cachedSource <...> ;
>         ex:acceptContent-Type "image/jpeg" .
>  
> The main question here has to do with the scope of oa:hasState and what is the right way to express that you are annotating a representation of a particular content-type that should be retrieved using content negotiation.
>  
> What are the issues to adding semantics to the OA extension namespace to handle the standard HTTP Accept headers (Accept, Accept-Language, Accept-Encoding, Accept-Charset, leaving off the Accept-Datetime as too close to oa:when)? Does the scope of oa:hasState need to be broader?
>  
> 3.       Pseudo-Fragment Identifiers
>  
> Finally there's a potential 3rd issue here of whether a community that makes extensive use of djatoka would be well advised to extend the OA namespaces with a property(ies) in the community namespace that basically just treats the djatoka OpenURL query string (excluding rft_id value) as analogous to a kind of fragment identifier. Wouldn't be informative to any applications not familiar with djatoka, but it's so obvious, I wonder if it's going to be suggested fairly quickly. If not for djatoka then for something even better suited to this approach, e.g., the IIIF Image API (http://library.stanford.edu/iiif/image-api/). Not altogether coincidentally, the IIIF approach is 1:1 mappable to djatoka OpenURL model. Should this be encouraged or discouraged?
>  
> Thanks for any comments / feedback.
>  
> Tim Cole
> University of Illinois at UC
>
Received on Friday, 6 July 2012 13:33:21 UTC