Re: Annotation Concept vs Document (was Level 1 comments) from Herbert Van de Sompel on 2013-01-09 (public-openannotation@w3.org from January 2013)

From: Herbert Van de Sompel <hvdsomp@gmail.com>
Date: Wed, 9 Jan 2013 08:01:53 -0700
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: public-openannotation <public-openannotation@w3.org>
Message-Id: <128C0D6F-3C31-4BD5-B9F2-594FD9F9266D@gmail.com>
Antoine,

Just an FYI aside:

ORE ResourceMaps were directly inspired by Named Graphs. Actually, versions of the spec prior to the final one explicitly referred to Named Graphs. And Chris Bizer, one of the NG gurus, was involved in helping us design things the way they are in ORE. References to NG were taken out because readers found them to be yet another complication in a spec they already found complicated. Remember this was 2008 and for many, especially in our initial target community, even the notion of RDF-based modeling was a big new step.

As Rob indicated, the 303 stuff was a complication our target community found odd and alien. Now the pattern is widespread, but it is still rather controversial (httpRange-14).

As I said, this was just an aside. I think you make an interesting point re format migration ...

Cheers

Herbert

Sent from my iPad

On Jan 9, 2013, at 7:44, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi Rob,
> 
> Ouch. We're coming to a comment I'd have made only while reviewing 3.2.3!
> 
> I really don't like that an annotation resource is in fact denoting a serialization.
> This puts a big burden on recognizing an annotation after it has passed a data conversion step, which will happen quite often in the kind of interoperability scenarios you're after.
> There is a need for representing an annotation as a more abstract business object, which is "created" by people or smart agents. Of course I understand the need for requirements on the provenance of documents and data sources, but that seems quite distinct (and to me, rather less important).
> 
> I still think that respecting the one-to-one principle is important in these matters: attributing statements (e.g., oa:serializedBy and oa:hasTarget) to one URI, while they belong to different levels, can be very confusing in a paradigm (the Semantic Web one) that expects this kind of mixture not to happen.
> 
> Actually I'd be curious to hear about the feedback you received on ORE ResourceMap. I personally don't think it was technically so bad. My guess is that the negative feedback might have been motivated by the very act of trying to meet a very general requirement (data sources) within a vocabulary designed for a more specific requirement (aggregations). Especially at a time when there were other approaches (SPARQL named graphs) being devised. OK, NGs were not a standard then and still are not. But I understand the wish of some people to prevent another proposal, possibly difficult to reconcile with NGs, from emerging while they were embarking on bringing NGs to the next level.
> 
> 
> To me a good way for handling solution 1 would be for OA to just coin the properties serializedAt and serializedBy and *defer to other 'data provenance' proposals* (NGs, ResourceMaps, PROV...) for how to use them, i.e., on which resource exactly to attach them. Of course we could provide a couple of examples as guidance.
> I suppose you will not like it, but it's quite legitimate given that the solutions at hand are not mature or consensual yet. The community could sort out later which is the best solution.
> It could also be that different (sub) communities stick to different options. But that can be ok as well: perhaps there is one solution which is perfect for RDF but horrible for another...
> 
> 
> 
> On option 2 or 3: I trust that if there's one resource, then it should mainly denote the more abstract annotation, not the serialization. I think this has fewer pitfalls for interoperability between applications. If you're searching for a justification: just imagine the kind of horrible questions data consumers will ask about the semantics of oa:equivalent! (whether or not higher-level statements like oa:hasBody or oa:hasTarget should be propagated across equivalent annotations -- I believe they should).
> 
> And we could keep the current pattern but update the semantics of serializedBy to mean something like
> "this resource [which is an 'abstract' annotation] has been serialized by X"
> as opposed to "this serialization was carried out by X" as I understand the meaning of serializedBy now.
> This property would become a kind of 'shortcut':
> anAbstractAnnotation -serializedBy-> X
> standing for the hypothetical path
> anAbstractAnnotation -hasSerialization-> anAnnotationSerialization -createdBy-> X
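
The shortcut reading above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples rather than a real RDF graph, hasSerialization and createdBy are the hypothetical properties named in the message (given an assumed "ex:" prefix here), and expand_shortcut and the example identifiers are invented for the sketch, not part of OA.

```python
from itertools import count

# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
SERIALIZED_BY = "oa:serializedBy"
HAS_SERIALIZATION = "ex:hasSerialization"  # hypothetical property from the message
CREATED_BY = "ex:createdBy"                # hypothetical property from the message


def expand_shortcut(triples):
    """Rewrite each serializedBy shortcut into the two-step path it stands for."""
    fresh = count(1)
    expanded = []
    for s, p, o in triples:
        if p == SERIALIZED_BY:
            # Introduce a placeholder node for the otherwise implicit serialization.
            node = f"_:serialization{next(fresh)}"
            expanded.append((s, HAS_SERIALIZATION, node))
            expanded.append((node, CREATED_BY, o))
        else:
            expanded.append((s, p, o))
    return expanded


graph = [
    ("ex:anno1", "oa:hasTarget", "ex:page"),
    ("ex:anno1", SERIALIZED_BY, "ex:agentX"),
]
for triple in expand_shortcut(graph):
    print(triple)
```

Statements such as oa:hasTarget stay on the abstract annotation; only the serialization provenance moves to the intermediate node.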
> 
> 
> Side question: I'd be curious to hear whether
> oa:Annotation rdfs:subClassOf ore:Aggregation
> holds for you (for me it does!)
> 
> 
> 
> Cheers,
> 
> Antoine
> 
> 
>> 
>> Dear all,
>> 
>> To pick up on one of Antoine's comments in particular:
>> 
>> On Sun, Jan 6, 2013 at 8:47 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>> 
>> 
>>    2. "An Annotation is the expression of a relationship between two or more resources in the form of a serialized graph."
>>    I find this confusing. Serialization is a representation in one syntax. This hints that an annotation serialized in RDF/XML is not the same as an annotation serialized in Turtle... I would remove "in the form of a serialized graph".
>> 
>> 
>> That is actually the exact intent. An Annotation is a document, which necessarily has a serialization. Therefore the RDF/XML serialization of a graph, from URI-A, is a different "Annotation" from the same graph serialized in JSON-LD from URI-B.
>> 
>> This is to avoid having to have multiple nodes, one identifying the Annotation and the other identifying the serialization. This was met with large rounds of disdain from the Linked Data community when it was done in the Open Archives ORE spec (Conceptual Aggregation vs Resource Map Document) and necessitates the use of the 303 redirect paradigm.
>> 
>> The two options considered:
>> 
>> (1) Have multiple nodes. One for the serialization, one for the Annotation concept.
>> Costs:
>> * Have to mint and maintain two identifiers. People don't like doing this. Look at the "textual body" discussion!
>> * Have to have a 303 redirection service
>> * Have to include both in the graph in order to have the serializedBy/serializedAt information
>> * Have to have specific instructions as to what to refer to, Concept or Serialization, in further Annotations
>> 
>> 
>> (2) Have a single identity that represents the serialization
>> Costs:
>> * Have to either explain the issue in detail to people who probably don't care, or gloss over it and hope Antoine doesn't notice :)
>> * Have to have serializedBy/At and annotatedBy/At to properly maintain the provenance information
>> 
>> We figured that option (2) was the lesser of the two evils.
>> 
>> The hypothetical option (3) is to have a single identity that represents the concept, but that would be much harder to justify as to why you got a representation from a concept.
>> 
>> Our proposed solution is to keep the text in the introduction as is, but explain the situation further in the Provenance section for people who care about it.
>> 
>> Rob & Paolo
>> 
>> 
> 
>
Received on Wednesday, 9 January 2013 15:02:30 UTC