- From: Doug Schepers <schepers@w3.org>
- Date: Tue, 18 Aug 2015 19:26:51 -0400
- To: Robert Sanderson <azaroth42@gmail.com>
- Cc: t-cole3 <t-cole3@illinois.edu>, Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Hi, Rob–

I'm going to propose a compromise. I don't think either of us will love it, but I hope we can both live with it.

I'm going to use "object"/"property" terminology, but feel free to reformulate into RDF terminology.

1) We allow (but don't require) an "id" property on each object, to make it addressable;

2) We strive for a single consistent structure that applies equally to Body, Target, Tag, and so on, modulo the proposal in #3.

3) We make the nested "source" object a SHOULD, while the empty-node construct is only a MAY, and only allowed for text-literal resources, and (maybe?) only in a non-Linked Data context; we define a clear equivalence mapping.

For example, these statements would be equivalent:

  "body" : {
    "role" : "tagging",
    "source" : { "value" : "+1" }
  }

  "body" : {
    "role" : "tagging",
    "value" : "+1"
  }

But these would not:

  "body" : {
    "role" : "linking",
    "source" : { "type": "Image", "id": "http://example.com/image.png" }
  }

  "body" : {
    "role" : "linking",
    "type": "Image",
    "id": "http://example.com/image.png"
  }

4) We define the default type of "value" to be "text". We require explicit "type" values for all other datatypes.

For example, these statements would be equivalent:

  "body" : {
    "role" : "commenting",
    "source" : { "type" : "text", "value" : "This reminds me of a meme…" }
  }

  "body" : {
    "role" : "commenting",
    "source" : { "value" : "This reminds me of a meme…" }
  }

  "body" : {
    "role" : "commenting",
    "value" : "This reminds me of a meme…"
  }

5) We continue to discuss property names that might be more intuitive. For example, I find "source" less clear than "content", and I'd like to see different proposals for the terms "EmbeddedContent" and "SpecificResource".

Thoughts?
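To make the equivalence mapping in points 3 and 4 concrete, here is a minimal sketch of how a consumer might normalize a body object before comparing it. It is only an illustration under the rules stated above, not a proposed algorithm: the function names `normalize_body` and `equivalent` and the constant `DEFAULT_TYPE` are hypothetical, and it assumes each body has already been parsed into a plain dictionary.

```python
from copy import deepcopy

# Point 4: a "value" with no explicit "type" is treated as type "text".
DEFAULT_TYPE = "text"


def normalize_body(body):
    """Return a copy of `body` in the nested "source" form where point 3 allows it.

    Hypothetical helper: only the flat text-literal shorthand is rewritten;
    every other structure is returned unchanged.
    """
    result = deepcopy(body)

    # Point 3: the flat form is equivalent only for text literals, i.e. a
    # "value" with no "id" and no non-text "type" on the body itself.
    is_text_shorthand = (
        "source" not in result
        and "value" in result
        and "id" not in result
        and result.get("type", DEFAULT_TYPE) == DEFAULT_TYPE
    )
    if is_text_shorthand:
        source = {"value": result.pop("value")}
        result.pop("type", None)
        result["source"] = source

    # Point 4: make the implied default type explicit so that plain
    # dictionary comparison is enough to test equivalence.
    source = result.get("source")
    if isinstance(source, dict) and "value" in source:
        source.setdefault("type", DEFAULT_TYPE)

    return result


def equivalent(a, b):
    """True when two body objects normalize to the same structure."""
    return normalize_body(a) == normalize_body(b)
```

With this sketch, the two "tagging" bodies and the three "commenting" bodies above compare as equivalent, while the two "linking" bodies do not, because a body carrying an "id" or a non-text "type" is never treated as shorthand.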
Regards–
–Doug

On 8/18/15 3:57 PM, Robert Sanderson wrote:
>
> On Tue, Aug 18, 2015 at 12:01 PM, Doug Schepers <schepers@w3.org
> <mailto:schepers@w3.org>> wrote:
>
> Hi, Rob–
> On 8/17/15 2:41 PM, Robert Sanderson wrote:
>
> On Mon, Aug 17, 2015 at 10:13 AM, Timothy Cole wrote:
> Now that resource has two roles, tagging and commenting.
>
> Can you please describe again (I feel you've mentioned it before)
> the use case for this 'body' reuse? Especially in the case where the
> body is a text literal?
>
> Sure.
>
> As a reviewer using an annotation tool to comment on a paper, I want my
> review to be persistent and referenced. It might refer to other papers
> beyond the one I'm commenting on, for example to point out plagiarism or
> to suggest other sources, but the review is of the target paper.
>
> As a paper author, I want to link that review to my cited paper as a
> justification for its value. E.g. the same content reviews one paper
> and provides support for another paper.
>
> The review starts off as a block of text in an annotation client. It is
> then transferred via the protocol to a server. The server creates a URI
> for it.
> The second annotation takes that URI and uses it as the body, with a
> different role.
>
> And another:
>
> I post on twitter a comment noting a typo on a wikipedia page.
> A system then uses that more specifically as the justification for an
> annotation that also suggests the change, both using more specific
> motivations.
>
> And another:
>
> I post on medium my thoughts about a particular politically charged
> topic. It's a comment on a wikipedia page.
> People on both sides of the topic take the same post different ways and
> use it as support for their view and a dismissal of the opposition.
>
> And another:
>
> I transcribe a quote from a book as part of a crowd-sourcing platform.
> I then use that quote as a comment on the museum exhibit that it is
> talking about.
>
> I can go on if needed.
>
> Is this use case common, or is it an edge case?
>
> Common.
>
> I'm having a hard time imagining a large-scale annotation
> application that would reuse body literals, rather than simply
> having multiple instances of similar bodies, each contained in its
> own annotation. The user experience and workflow aren't clear to me.
>
> When the authorship of the body is important. Which is almost always.
> Note that the author of the body is not necessarily the author of the
> annotation, as per the examples above, bar the last one.
>
> I totally understand that multiple annotations might use the same
> external resource (e.g. a picture or video) as a body, but that's a
> different case with a different object structure (and a different
> UX/workflow).
>
> All of the above *start* as plain text, so the same UX for the first
> part. The second annotation doesn't need to re-type the text, rather
> than selecting existing content. So I think I agree that there is a
> different workflow, even if the same UI might allow both.
>
> However I disagree that there must be a different structure. Having a
> consistent structure for both uses -of the same body- seems important,
> as clients and servers will otherwise need to implement both, depending
> on the otherwise arbitrary order in which the annotations were created.
>
> At some point, if you're pointing to 2 different external resources,
> it seems like it would be hard to delineate between an annotation
> with multiple targets (or bodies), rather than a clear body-target
> relationship, and I don't see what kind of annotation client would
> structure things that way.
>
> I don't understand this, sorry.
>
> I assume that your annotation client does something like this… can
> you tell us how that works?
>
> And I'm not sure what you're asking for here.
>
> Rather than consistently using the Specific resource pattern:
>
> "body": {
>   "role": "tagging",
>   "source": {
>     "id": "http://repo.org/bodies/1",
>     "value": "+1"
>   }
> }
>
> Which will always work at the (IMO minimal) cost of slightly
> more structure.
> It's also clearer without the explicit types, as role can only be on
> SpecificResource.
>
> Is this structure allowed, or required? If it's simply allowed,
> then we agree. If it's required, then I'm a bit less comfortable.
>
> It would be required for resources with URIs. I would prefer to require
> it also for Embedded content for consistency, and to keep the separation
> of concerns per my response to Tim.
>
> When we extrapolate to multiple bodies (which is really what we're
> talking about), the extra code becomes more obvious:
>
> "body" : [
>   { "role" : "tagging", "value" : "+1"},
>   { "role" : "commenting", "value" : "This reminds me of a meme…" },
>   { "role" : "linking", "source" : "http://example.com/image.png" }
> ]
>
> (Fixed and compacted inline)
>
> versus:
>
> "body" : [
>   {
>     "role" : "tagging",
>     "source" : { "value" : "+1" }
>   },
>   {
>     "role" : "commenting",
>     "source" : { "value" : "This reminds me of a meme…" }
>   },
>   {
>     "role" : "linking",
>     "source" : "http://example.com/image.png"
>   }
> ]
>
> (Fixed and compacted inline)
>
> But in most cases:
>
> "body" : [
>   {
>     "role" : "tagging",
>     "source" : { "value" : "+1" }
>   },
>   {
>     "role" : "commenting",
>     "source" : { "type" : "text", "value" : "This reminds me of a meme…" }
>   },
>   {
>     "role" : "linking",
>     "source" : { "type": "Image", "id": "http://example.com/image.png" }
>   }
> ]
>
> At that point, it's not clear what this structure buys us, though
> I'll admit that it adds a uniformity of structure between constructs
> of different types that might make it easier to always do the right thing.
>
> Uniformity in data structures is good, rather than constantly having to
> test for the existence of different structures. Also in terms of making
> it easier to do the right thing, and the actual complexity of the
> structure, if you have to explain one thing well, that's easier than
> explaining two things well plus when you would choose to use one or the
> other.
>
> Especially when you have to understand and implement either both anyway,
> or just one.
>
> That is why I'm +0, rather than -1. I can live with it if needed, but I
> think there's a better way that separates the two concerns:
>
> EmbeddedContent: Transfer content of any type for any resource, URI or
> no, in the serialized annotation. (Which is why we talked about it in
> the Serialization section in the CG docs)
> SpecificResource: Make annotation specific assertions about a Body or
> Target resource. (Until now, that has been selector, state, style and
> scope ... we're just adding another specifier of role)
>
> Perhaps the terms "EmbeddedContent" and "SpecificResource" are
> throwing me off a bit. Are those terms used in LD/RDF, or are they
> terms we've introduced?
>
> We introduced both.
>
> We (the WG) introduced EmbeddedContent to replace the defunct
> ContentAsText work, after many failed efforts to get the people
> responsible for it to take it forwards.
> http://www.w3.org/TR/Content-in-RDF10/
>
> And (as earlier in the thread) we (the Open Annotation Collaboration,
> pre CG) introduced Specific Resource based on Tim Berners-Lee's notion
> of Specific vs Generic resources in the web architecture, previously
> called Constrained resources.
>
> Hope that helps,
>
> Rob
>
> --
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305
Received on Tuesday, 18 August 2015 23:26:55 UTC