Re: Streamlining the OA Model from James Smith on 2012-07-31 (public-openannotation@w3.org from July 2012)

From: James Smith <jgsmith@gmail.com>
Date: Tue, 31 Jul 2012 13:01:13 -0400
To: public-openannotation <public-openannotation@w3.org>
Message-Id: <1E4D377C-155E-49FD-BAF4-D1F142C6CB41@gmail.com>
On Jul 31, 2012, at 12:02 PM, Bernhard Haslhofer <bernhard.haslhofer@cornell.edu> wrote:

> Hi, 
> 
> I spent the past couple of weeks implementing the Maphub API (http://maphub.github.com/api/) using the Open Annotation model and found that the model is expressive enough for our use cases. However, I believe that some tweaks in the OA specifications could streamline the model and make the developer's life (= the "users" of the specification) easier, both on the server- and the client-side.
> 
> Here is a summary of my thoughts:
> 
> 1.) Direct Relationship between Annotation and the Source
> 
> "Give me all annotations for resource X" is probably one of the most important queries that needs to be answered. X could be an image URI, the URI of a video, whatever. Since the Target of an annotation may be a resource with its own dereferenceable URI OR a Specific Target with a UUID node, you need to consider this when formulating a query and end up with a SPARQL UNION query or some conditional node traversal code when using an RDF API.
> 
> Technically, it is of course possible to do that, but given the importance of that query, I would argue that the solution is not very intuitive and maybe also not very efficient. I believe that this can easily be fixed by introducing a direct relationship property (e.g., oa:annotates, oa:hasTargetSource) between the Annotation and the Source resource.
> 

I don't see a problem with oa:annotates (or a similar property) linking the oa:Annotation node directly to the resource given by oa:hasSource, if it makes reasoning simpler in most cases. The other option for reasoners is to go through the RDF store, create this link in a single pass (where it doesn't exist), and then use it later as needed. Adding this is an optimization. It doesn't enable or break anything that wasn't already enabled or broken.
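
As a rough illustration of that preprocessing step, here is a minimal sketch in plain Python. It assumes triples are held as (subject, predicate, object) tuples rather than in a real RDF store; the example URIs and the helper names materialize_annotates and annotations_for are made up for illustration, and oa:annotates is only the property proposed in this thread, not something the spec defines.

```python
# Triples are plain (subject, predicate, object) tuples; all identifiers
# below are illustrative and prefixed names are left unexpanded.
HAS_TARGET = "oa:hasTarget"
HAS_SOURCE = "oa:hasSource"
ANNOTATES = "oa:annotates"  # proposed shortcut property, not part of the spec


def materialize_annotates(triples):
    """Return a new set of triples with a direct annotation -> source link."""
    # Map each Specific Target node to the source resource it refines.
    source_of = {s: o for (s, p, o) in triples if p == HAS_SOURCE}
    result = set(triples)
    for s, p, o in triples:
        if p == HAS_TARGET:
            # A target without oa:hasSource is itself the annotated resource.
            result.add((s, ANNOTATES, source_of.get(o, o)))
    return result


def annotations_for(triples, resource):
    """Answer "give me all annotations for resource X" with a single pattern."""
    return sorted(s for (s, p, o) in triples if p == ANNOTATES and o == resource)


graph = {
    # anno1 targets the image directly.
    ("ex:anno1", HAS_TARGET, "http://example.org/map.png"),
    # anno2 targets a Specific Target node that points at the same image.
    ("ex:anno2", HAS_TARGET, "urn:uuid:target-2"),
    ("urn:uuid:target-2", HAS_SOURCE, "http://example.org/map.png"),
}
linked = materialize_annotates(graph)
print(annotations_for(linked, "http://example.org/map.png"))
```

Once the links are materialized, the lookup no longer needs to distinguish between the two target shapes, which is the whole of the gain; the original triples are left untouched.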

> 2.) Fragment URIs as Targets
> 
> In our API (the GeoReference part) we followed the OA recommendation and used a Specific Resource and a Fragment Selector to express that a URI annotates an XY point on a raster image. We could express the same information by using W3C Media Fragments and thereby reduce the verbosity and complexity of the resulting serialization. API consumers then don't even need to know about OA-specific "Specific Resources", "Fragment Selectors", etc.
> 
> The Open Annotation model currently does NOT RECOMMEND the use of fragment URIs for identifying segments of Targets or Bodies for three reasons (see 5.2.1):
> 
> - "cannot query the source directly": I think this could and should be solved by considering (1.)
> - "they are not compatible with State and Style Specifiers; many annotations may have the same segment of interest, but have different States and Styles": from previous emails and discussions I understood that Styles should be directly attached to the Annotation, which also means that they are contextualized and no longer an argument against fragment URIs. I think that something similar can be done with "State", which would also result in a more consistent model and allow for fragment URIs
> - "Fragment URIs conflate the identity and the description of the segment of interest by including the description inline within the identity": I am not sure if I get the point of this argument right; however, I believe that for very practical reasons the OA model should reuse what other specifications (Web Architecture, Media Fragment RFCs) already define; this brings modularity and flexibility and avoids the risk of re-designing what others already did elsewhere.
> 
> I think the benefits of reusing (Media) Fragment URIs in OA outweigh the arguments against using them and therefore I propose to RECOMMEND the use of Fragment URIs and only fall back on OA-specific Selectors if Fragment URIs are not expressive enough.

-1 from me for this. URIs are opaque. They have no semantics in the context of RDF. Clients should only have to use them as part of a protocol request (e.g., an HTTP GET).

What OA does is break the URI+Fragment into two pieces: the URI, which can be dereferenced, and the Fragment, which the client can parse and use. The target is broken into a server component and a client component, since the server should never see a fragment identifier, and the client should never parse the URI. Putting the two together requires the client to parse the URI and the server to know that it has to ignore anything in the URI that follows the hash (#). This could break some servers (testing required, but servers don't see a hash+Fragment from browsers, so they might be designed not to expect it).
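
To make the split concrete, here is a minimal sketch of what a client has to do when it is handed a combined fragment URI. The example URI and the helper names split_target and parse_xywh are assumptions for illustration only; the parser handles just the plain pixel form of the Media Fragments xywh dimension, not the pixel: or percent: prefixes.

```python
from urllib.parse import parse_qs, urldefrag


def split_target(uri):
    """Split a fragment URI into the dereferenceable part and the fragment."""
    source, fragment = urldefrag(uri)
    return source, fragment


def parse_xywh(fragment):
    """Parse a plain pixel 'xywh=x,y,w,h' fragment into a region dict."""
    params = parse_qs(fragment)
    x, y, w, h = (int(v) for v in params["xywh"][0].split(","))
    return {"x": x, "y": y, "w": w, "h": h}


# Illustrative target: a 1x1 region at (10, 20) on a raster image.
source, fragment = split_target("http://example.org/map.png#xywh=10,20,1,1")
print(source)                 # the part sent to the server
print(parse_xywh(fragment))   # the part interpreted by the client
```

With the OA approach the two values arrive already separated, so the client only ever runs the second step.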

> 3.) Simple Literal Body Shortcut
> 
> I understand that an OA annotation is a relationship between resources (the body and the target) and that inline bodies are represented using the Content in RDF specification (see 6.1.). However, our own demonstrator and also the majority of use cases demonstrated in the OAC meeting last week showed that many annotation bodies are simply strings, which could be represented as literals.
> 
> Therefore I am proposing to introduce a "shortcut" property between the Annotation and the "content" Literal (e.g., hasLiteralBody). This allows people to express simple annotations in what is, in my opinion, a more straightforward way and doesn't contradict the current oa:hasBody approach.
> 

-1 from me. I don't see how this isn't a premature optimization. The examples at the OAC meeting used the simplest possible body, so I don't expect them to be representative of usage "in the wild." With more data from live, publicly available projects where the optimization represents a significant gain, I might be persuaded otherwise.

In my own experience, refactoring should make literals easy regardless of how many triples are needed in the RDF. Same for retrieving literals from the RDF. OA is an exchange format. I expect application writers to create libraries that wrap the RDF nature of the model so they can easily get/set the information they are interested in.
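
A minimal sketch of the kind of wrapper meant here, assuming triples are kept as plain (subject, predicate, object) tuples and the inline body follows the Content in RDF pattern (cnt:ContentAsText with cnt:chars). The Annotation class, its method names, and the "/body" node naming are hypothetical, not part of any existing library.

```python
class Annotation:
    """Hypothetical wrapper that hides the body triples behind get/set calls."""

    def __init__(self, uri, triples=None):
        self.uri = uri
        self.triples = set(triples or ())

    def _objects(self, subject, predicate):
        # All objects for a given subject/predicate pair.
        return [o for (s, p, o) in self.triples if s == subject and p == predicate]

    def set_body_text(self, text):
        """Replace the annotation body with an inline text body."""
        body = self.uri + "/body"  # illustrative identifier for the body node
        # Drop any previous body link and any triples about the inline node.
        self.triples = {
            (s, p, o)
            for (s, p, o) in self.triples
            if s != body and not (s == self.uri and p == "oa:hasBody")
        }
        self.triples |= {
            (self.uri, "oa:hasBody", body),
            (body, "rdf:type", "cnt:ContentAsText"),
            (body, "cnt:chars", text),
        }

    def get_body_text(self):
        """Return the inline body text, or None if there is no text body."""
        for body in self._objects(self.uri, "oa:hasBody"):
            chars = self._objects(body, "cnt:chars")
            if chars:
                return chars[0]
        return None


anno = Annotation("ex:anno1")
anno.set_body_text("This is the old harbour.")
print(anno.get_body_text())
```

The caller deals with one string in and one string out; the three triples behind it only matter when the annotation is exchanged.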

> 4.) Style Attached directly to the Annotation
> 
> We don't express style information in our serializations because I believe that styling information and data representation should be separated. However, I understand that there are use cases that require this feature and I prefer the approach of optionally attaching style directly to the annotation over attaching it to the Specific Target.
> 

+1 from me. I think it makes more sense for the style to affect the annotation than the target.

> 5.) JSON (-LD) Serialization Recommendation
> 
> At the moment the spec recommends that RDF/XML is used as the default serialization language. We haven't implemented it yet, but I'd consider JSON (-LD) at least as an alternate "default" serialization format to open the door for JS clients.

I'm on the fence with this. I definitely think JSON is outgrowing XML as a serialization standard on the web. JSON is designed for data structures. XML is designed for documents. RDF is more a data structure than a document. Any JavaScript libraries I produce will work with RDF/JSON and let the user add in any RDF/XML support they might need. My server implementations will produce RDF/JSON and probably RDF/XML (one is easy once you have the other). In my own shared canvas work recently, I'm producing RDF as JSON, XML, and Turtle, as well as a HATEOAS-oriented JSON. In the world of linked data and REST, the important thing for OA is the data model, not the serialization format.

-- Jim
Received on Tuesday, 31 July 2012 17:01:47 UTC