Re: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Ivan Herman on 2015-08-19 (public-annotation@w3.org from August 2015)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 19 Aug 2015 07:14:22 +0200
To: Doug Schepers <schepers@w3.org>
Cc: Robert Sanderson <azaroth42@gmail.com>, Tim Cole <t-cole3@illinois.edu>, W3C Public Annotation List <public-annotation@w3.org>
Message-Id: <DA85B7EE-517E-4B73-8981-F96FB3DC13F6@w3.org>

> On 19 Aug 2015, at 06:06 , Doug Schepers <schepers@w3.org> wrote:
> 
> Hi, Rob–
> 
> I'm glad we seem to be converging on some sort of agreement. I hope others feel the same.
> 
> 

+1

> I'd like to be explicit about one more related thing from the spec; as we've tossed around these examples, I don't know if there was an understanding about "EmbeddedContent" that was elided.
> 
> Currently, the spec says [1]:
> 
> [[
> If the requirements for a simple textual Body are not met, and the representation of the Body is to be embedded within the Annotation's serialization, then the Body MUST be a resource and MUST have the class oa:EmbeddedContent. The content of the Body is recorded as the value of the rdf:value property, and additional properties such as dc:format and dc:language SHOULD be given if known.
> ]]
> 
> And here are some examples from the spec:
> 
>  "body": {
>    "@type" : "oa:EmbeddedContent",
>    "value" : "content",
>    "format" : "text/plain",
>    "language" : "en"
>  }
> 
> 
>  "body": {
>    "@type": [ "oa:Tag", "oa:EmbeddedContent" ],
>    "value": "paris"
>  }
> 
> Does the "type":"EmbeddedContent" really need to be explicit? Or can it be inferred and left out? I guess the same question goes for "SpecificResource".
> 
> 

I was asking a similar question in the GitHub issue [1]. But it becomes even more of a question if we adopt the approach in your proposal. If I understand your proposal correctly, a body SHOULD always be structured as a nested resource; put another way, I could regard (almost) all bodies as being Specific Resources. I.e., it may indeed not be necessary to call out that typing explicitly. I believe the same holds for Embedded Content.


[1] https://github.com/w3c/web-annotation/issues/61
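
To make the "can it be inferred" question concrete, here is a rough Python sketch of how a consumer might infer the class of a body when "@type" is left out. The rules (a "value" key implies oa:EmbeddedContent, a "source" key implies oa:SpecificResource) and the function name infer_body_types are assumptions for discussion only; nothing in the current spec text defines them.

```python
# Hypothetical sketch: infer the class of a body from its shape when "@type"
# is omitted. The inference rules are assumptions, not spec text.

def infer_body_types(body):
    """Return the list of types for a body, inferring one if none is given."""
    explicit = body.get("@type")
    if explicit is not None:
        # JSON-LD allows a single string or a list; normalise to a list.
        return [explicit] if isinstance(explicit, str) else list(explicit)
    if "value" in body:
        # The content is embedded in the annotation's serialization.
        return ["oa:EmbeddedContent"]
    if "source" in body:
        # A nested resource that points at the actual content.
        return ["oa:SpecificResource"]
    # Nothing to go on; leave the body untyped.
    return []


embedded = {"value": "content", "format": "text/plain", "language": "en"}
specific = {"role": "commenting", "source": "http://example.org/image.jpg"}
tagged = {"@type": ["oa:Tag", "oa:EmbeddedContent"], "value": "paris"}

print(infer_body_types(embedded))
print(infer_body_types(specific))
print(infer_body_types(tagged))
```

Note that an explicit type always wins in this sketch, so a producer that wants to be unambiguous can still state it.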


> In general, I'd like to trim out everything that can be trimmed out while still making sense.
> 

+1. I think this is an action we should take, i.e., go through the model systematically and trim whatever can be trimmed.

> 
> As another example of this, if we have a selector defined in the data model, do we really need to include the "conformsTo" property in a Selector object? Or is that something that is only defined in the @context, or better yet, in the spec? Is there such a thing as "implied" properties (like defaults) in JSON-LD/RDF?
> 
> 
> I'd like for us to clearly define what properties each type of object requires, which are allowed, and which (if any) are implied.
> 

+1
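
As an illustration of what such a definition could look like once written down, here is a small Python sketch of a per-type table of required, allowed, and implied properties, with a check that fills in the implied ones. The table entries are purely illustrative assumptions (in particular the "text/plain" default for "format", loosely following proposal 4), and the names PROPERTY_RULES and check_and_expand are hypothetical.

```python
# Hypothetical per-type property table. The entries are illustrative
# assumptions for discussion, not taken from the Web Annotation model.
PROPERTY_RULES = {
    "oa:EmbeddedContent": {
        "required": {"value"},
        "allowed": {"format", "language", "role"},
        "implied": {"format": "text/plain"},  # assumed default, cf. proposal 4
    },
    "oa:FragmentSelector": {
        "required": {"value"},
        "allowed": {"conformsTo"},
        "implied": {},
    },
}

# Keys that any object may carry and that are not checked against the table.
IGNORED = {"@type", "type", "id"}


def check_and_expand(obj, type_name):
    """Validate obj against the rules for type_name and add implied values."""
    rules = PROPERTY_RULES[type_name]
    present = set(obj) - IGNORED
    missing = rules["required"] - present
    unknown = present - rules["required"] - rules["allowed"]
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    # Implied properties act as defaults; explicitly given values win.
    return {**rules["implied"], **obj}


print(check_and_expand({"value": "+1"}, "oa:EmbeddedContent"))
```

Whether implied values belong in the @context, in the spec prose, or in a table like this one is exactly the open question; the sketch only shows that the three categories can be stated separately.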

Ivan

> [1] http://www.w3.org/TR/annotation-model/#embedded-textual-body
> 
> Regards–
> –Doug
> 
> 
> On 8/18/15 8:06 PM, Robert Sanderson wrote:
>> 
>> On Tue, Aug 18, 2015 at 4:26 PM, Doug Schepers <schepers@w3.org
>> <mailto:schepers@w3.org>> wrote:
>> 
>>    1) We allow (but don't require) an "id" property on each object, to
>>    make it addressable;
>> 
>> 
>> +1. This isn't a change, and we can't require that resources do not have
>> identity, so I don't see that there's any other option.
>> 
>>    2) We strive for a single consistent structure that applies equally
>>    to Body, Target, Tag, and so on, modulo the proposal in #3.
>> 
>> 
>> +1. One simple way for everything is better than two simple ways, but
>> one simple way for some things and one complex way for the minority of
>> cases is better than one complex way for everything that then doesn't
>> get used.
>> 
>>    3) We make the nested "source" object a SHOULD, while the empty-node
>>    construct is only a MAY, and only allowed for text-literal
>>    resources, and (maybe?) only in a non-Linked Data context; we define
>>    a clear equivalence mapping.
>> 
>> 
>> Sure. +0.9
>> 
>> Sub proposals:
>> 
>> 3.1) I'd go further to be explicit that the nesting is for consistency
>> to make developers' lives easier.  [one way > two ways, per 2]
>> 
>> 3.2) We could also define a subClass of
>> the-class-currently-known-as-EmbeddedContent for Bodies that allows
>> roles to be associated with it, which would make it clear that embedded
>> stylesheets do not get roles.  In the JSON, they could then have
>> different type property values for clarity, if needed.  [Be explicit and
>> rigorous in the model, hide the details from developers unless needed]
>> 
>> 3.3) We should explicitly state that systems MAY translate between
>> equivalent structures as desired, similar to the ability to translate
>> from a literal body to a body as a resource... so even though you sent
>> in one form, the receiving system may process it into another form.
>> 
>> Essentially the "Role Attached to SpecificResource or EmbeddedContent"
>> option, with a preference for using SpecificResource.
>> 
>> 
>>    4) We define the default type of "value" to be "text". We require
>>    explicit "type" values for all other datatypes.
>> 
>> 
>> If there was a new EmbeddedTextualBody class per 3.2, it wouldn't be
>> necessary to have a default.  Systems could infer that specific class
>> for bodies that have a value property.
>> 
>> 4.1) If we wanted to be very explicit, we could define our own property
>> just for this that wasn't "value".
>> 
>> {
>>  "body": {
>>     "role": "tagging",
>>     "content": { "text" : "+1" }
>>   }
>> }
>> 
>> As opposed to value for things that aren't embedded representations of
>> resources, such as in FragmentSelector:
>> 
>> "body" : {
>>     "role": "commenting",
>>     "selector": {"type": "FragmentSelector", "value": "xywh=0,0,100,100"},
>>     "content": "http://some.url/image.jpg"
>> }
>> 
>> Reuse when it makes sense ... and maybe it doesn't make sense here. The
>> original ContentAsText spec invented two such properties (bytes and
>> chars) so we would still be improving the situation!
>> 
>> So I'm +0 on 4 ... because I think it's unnecessary given 3.2 and 4.1?
>> But let us know what you think.
>> 
>> 
>>    5) We continue to discuss property names that might be more
>>    intuitive. For example, I find "source" less clear than "content",
>>    and I'd like to see different proposals for the terms
>>    "EmbeddedContent" and "SpecificResource".
>> 
>> 
>> Sure. Naming, as opposed to correct structure and semantics, is the
>> least of my concerns :)  You can call it a
>> F5D28594-021D-426A-B169-A1E8167D5BA6 if you want :) [As editor, I'd
>> prefer you didn't tho!] The advantage of JSON-LD is that others who
>> really prefer a different name can still call it whatever they want too,
>> by defining a different context for the same properties and relationships.
>> 
>> Thank you very much for the concrete proposal Doug, it's greatly
>> appreciated, and I hope that the above agreement is encouraging to everyone.
>> 
>> Rob
>> 
>> --
>> Rob Sanderson
>> Information Standards Advocate
>> Digital Library Systems and Services
>> Stanford, CA 94305
> 


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Wednesday, 19 August 2015 05:14:33 UTC