RE: Examples towards Embedded Body discussion

In some of the CG discussions about the issue of embedded bodies in annotations, additional facets were raised that may have a bearing here:

 

1.      For consuming applications –  If literal annotation bodies (and targets) are supported, is there any advantage for consuming applications in having semantics that differentiate between bodies that are literals and bodies that are not – e.g., two distinct predicates, hasBody and hasLiteralBody?  I think the CG was leaning towards this not being a good idea, but the question was never fully resolved, since the CG decided to rely on the Content in RDF draft to handle embedded literals.

2.      For creating applications – If literal annotation bodies are allowed, do we still need a way for annotation authoring applications to sometimes embed a literal in such a way that properties can be attached to it, e.g., as in Rob’s example 7 (or even potentially his examples 2, 3, 5, and 6)?   Given that the Content in RDF spec has not advanced, if these are valid use cases we may still need to provide semantics for this, even if the more common methods allowed and used by publishers are along the lines of what Ivan suggested.

 

As Jacob suggests, once this is settled for bodies and targets, we may need to revisit other properties.

 

Tim Cole

University of Illinois at UC

 

 

 

From: jgjett@gmail.com [mailto:jgjett@gmail.com] On Behalf Of Jacob Jett
Sent: Tuesday, October 28, 2014 8:37 AM
To: Ivan Herman
Cc: Robert Sanderson; W3C Public Annotation List
Subject: Re: Examples towards Embedded Body discussion

 

While it isn't perfectly synonymous with Ivan's example, the 'hasStyle' predicate, i.e., the highlighting use case, already does something like this (the old "body-less" annotation). If we relax the annotation requirements in the way that Ivan suggests, is it possible to also collapse the 'hasStyle' predicate into a similar solution? That is, is there a way to get JSON to deliver a payload of CSS using something like @value?

 

My apologies if this seems like a naive question. I've just begun my explorations of the JSON family of standards.

 

Regards,

 

Jacob

 




_____________________________________________________

Jacob Jett
Research Assistant
Center for Informatics Research in Science and Scholarship
The Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
(217) 244-2164
jjett2@illinois.edu

 

On Tue, Oct 28, 2014 at 7:32 AM, Ivan Herman <ivan@w3.org> wrote:

Rob,

See comments below...


On 27 Oct 2014, at 21:11, Robert Sanderson <azaroth42@gmail.com> wrote:

>
> Examples towards a discussion on the topic tomorrow morning at TPAC:
>
> 1.  Just a string literal:
>     {"hasBody": "This is the comment"}
>
> 2.  String literal and Language
>     {"hasBody": {"@value" : "This is the comment",
>                          "@language": "en"}}
>
> 3.  String literal and format as data type
>     {"hasBody": {"@value": "<span>This is the comment</span>",
>                          "@type": "rdf:HTML"}}
>
> 4.  Embedded string, as a resource:
>     {"hasBody": {"rdf:value": "This is the comment"}}
>
> 5.  Embedded String and format as media type
>     {"hasBody": {"rdf:value": "<span>This is the comment</span>",
>                          "dc:format": "text/html"}}
>
> 6.  Embedded String and language as property rather than tag
>   {"hasBody": {"rdf:value": "This is the comment",
>                        "dc:language": "en"}}
>
> 7.  Embedded String, format and language
>    {"hasBody": {"rdf:value": "<span>This is the comment</span>",
>                         "dc:format": "text/html",
>                         "dc:language": "en"}}
>
> 8a.  Simple URI when string literals are not allowed (4-7)
>     {"hasBody" : "http://example.org/index.html"}
>
> 8b.  Simple URI when string literals are allowed (1-3)
>     {"hasBody" : {"@id": "http://example.org/index.html"}}
>
>
> Notes:
> * 7 cannot be done with @value/@type/@language, as RDF does not allow datatype and language tag on the same literal.  Thus 7 is the only possible model for when all three are required at once.
> * 3 requires a URI for the format, whereas 5 and 7 require a media type registration.  Some content may have neither, such as Markdown.
> * For 1,2 and 3 the body is a literal. For 4, 5, 6 and 7 the body is a resource.  Literals cannot have other properties associated with them, such as creator, created date, or other provenance, and thus these must use the resource pattern.  When the literal is used by itself (1-3), *it has no provenance information* beyond that of the graph, which is likely not correct.
> * If the value of the string in 1-3 was a URI, then it is *not* the resource identified by that URI, it is just a string that happens to look like a URI.  For the URI case, it would have to be as per 8b.
>
>
> The consideration is, in my opinion:
>
> Does the simplicity of 1 outweigh the complexity of having to deal with all of the options, and especially requiring the structure always be present when the body is a resource with its own URI? If String literals are not allowed, then the consistent pattern is that of 4 through 8a.  If they are, then the client must deal with all 1-7 plus 8b.
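
To make the cost to clients concrete, here is a minimal sketch (in Python, over already-parsed JSON) of how a consumer might tell the patterns above apart. The function name `classify_body` and the labels it returns are assumptions made up for this illustration; they are not part of any OA draft.

```python
def classify_body(body):
    """Return a coarse label for the shape of a parsed hasBody value.

    Hypothetical helper: the labels are invented for this sketch.
    """
    if isinstance(body, str):
        # Pattern 1 or 8a: the JSON alone cannot say whether the string
        # is a literal or a URI; that depends on which rule is adopted.
        return "bare-string"
    if isinstance(body, dict):
        if "@value" in body and "@type" in body and "@language" in body:
            # A literal cannot carry both a datatype and a language tag.
            return "invalid"
        if "@value" in body:
            return "literal"            # patterns 2 and 3
        if "@id" in body:
            return "uri-resource"       # pattern 8b
        if "rdf:value" in body:
            return "embedded-resource"  # patterns 4 to 7
    return "unknown"


examples = [
    "This is the comment",
    {"@value": "This is the comment", "@language": "en"},
    {"rdf:value": "This is the comment", "dc:language": "en"},
    {"@id": "http://example.org/index.html"},
]
labels = [classify_body(b) for b in examples]
```

The "bare-string" branch is where the 8a/8b distinction bites: the client has to know in advance which reading of a plain string applies.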

I am arguing for the necessity to allow for pattern #1. Putting my CSVW WG member's hat on:-): I think the issue is that the annotation may be human edited. Let me also clarify/describe the use case for those who are not familiar with the background. Sorry if it is a bit longish.

The CSVW WG defines metadata for CSV files. Ie, alongside the CSV content proper, CSV data publishers would/could produce a separate file that describes the data, providing information like creation dates, structure of the data (column and, possibly, row names, data types for columns or rows or for individual cells, etc.). The metadata is a JSON file. If you are interested, the latest version is at [1]. What is important to note here is that the file is not (necessarily) machine generated, but written by humans, possibly using a simple text editor.

The metadata includes a term "notes". This may include information like, say, the name of a statistical method used to generate a particular row, the name of the satellite that produced the meteorological information in a column, that sort of thing. These are clearly annotations, and we would like to make them OA compatible.

There is an RFC for fragment IDs in CSV files, so anchoring is well defined (although it is not robust, but let us put that aside for now). The current OA would mean something like:

"notes" : [{
                "hasTarget" : "URIforCSV#row=1234",
                "hasBody"   : { "@value" : "My favourite stats method is used for this" }
        },
        ...
        ]

While, of course, there may be annotations that require more complex bodies, for which the structure above is o.k., it would be really hard to convince people to use the structure above instead of the more obvious:

"notes" : [{
                "hasTarget" : "URIforCSV#row=1234",
                "hasBody"   : "My favourite stats method is used for this"
        },
        ...
        ]

I am almost sure that most data publishers will get this wrong and will simply do it the simple way which, let us face it, is the 'usual' JSON way (ie, to have either a string or an object in such a situation). I also believe this is not specific to CSV metadata: the same issue will arise wherever the annotation/note is produced by a human and not by some sort of an application. Hence the need, in the CSVW WG's view at least, to make that type of structure o.k. for OA as well.
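
On the consuming side, accepting both forms need not be expensive. A minimal sketch in Python, assuming the metadata has already been parsed from JSON; `normalize_note` is a hypothetical helper name, and wrapping the string in "@value" is just one possible internal convention:

```python
import json


def normalize_note(note):
    """Return a copy of a note whose hasBody is always an object.

    Hypothetical helper: a bare string body is wrapped as {"@value": ...};
    anything else is passed through unchanged.
    """
    body = note.get("hasBody")
    if isinstance(body, str):
        return {**note, "hasBody": {"@value": body}}
    return dict(note)


metadata = json.loads("""
{"notes": [
    {"hasTarget": "URIforCSV#row=1234",
     "hasBody": "My favourite stats method is used for this"},
    {"hasTarget": "URIforCSV#row=1234",
     "hasBody": {"@value": "My favourite stats method is used for this"}}
]}
""")
normalized = [normalize_note(n) for n in metadata["notes"]]
```

Note that this treats every bare string as a literal, which is exactly the choice that the 8a/8b distinction in Rob's examples turns on.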

(The term 'hasBody' in this context is not that intuitive either, b.t.w.; we may think of using some alias.)

As for "simplicity of 1 outweigh the complexity of having to deal with all of the options": the real question is whether the "simplicity of 1 _for users_ outweighs the complexity _for implementers_ of having to deal with all the options". Put this way, the answer seems clear to me: we should definitely allow for option 1...

B.t.w., to turn more technical: from an RDF point of view, it means relaxing the requirements, ie, that 'oa:hasBody' should not be defined as an object property (which is an OWL notion anyway, RDF does not have this). Meaning that its value is simply defined as an RDF Resource (a Literal is also an RDF Resource). The only consequence is that the OA data would not be OWL DL compliant (OWL DL requires a strict separation of object and data properties, and even OWL 2 DL's new punning features do not help in that). The question is whether it is a requirement that OA data should be usable for DL reasoners. Personally, I do not think that should be a requirement, and we can also simply make it clear in the documentation that if somebody uses that type of punning (ie, 'hasBody' with literal value) then the data is not DL compatible. But it should still be 'legal' OA data.
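
As a rough illustration of what "making it clear in the documentation" could look like in tooling: a sketch that flags data using 'oa:hasBody' with a literal value, so that a consumer knows the data falls outside OWL DL. The `IRI` and `Literal` classes and the `literal_bodies` function are stand-ins invented for this example (a real application would use its RDF library's term types), and the example.org IRIs are placeholders.

```python
from dataclasses import dataclass

OA_HAS_BODY = "http://www.w3.org/ns/oa#hasBody"


@dataclass(frozen=True)
class IRI:
    """Stand-in for an RDF IRI term (hypothetical, for this sketch only)."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Stand-in for an RDF literal term (hypothetical, for this sketch only)."""
    lexical: str


def literal_bodies(triples):
    """Return the triples that use oa:hasBody with a literal object."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if p == IRI(OA_HAS_BODY) and isinstance(o, Literal)
    ]


triples = [
    (IRI("http://example.org/anno1"), IRI(OA_HAS_BODY),
     Literal("This is the comment")),
    (IRI("http://example.org/anno2"), IRI(OA_HAS_BODY),
     IRI("http://example.org/index.html")),
]
# Any hit means the property is used with a literal value, so the data
# would not be DL compatible under the reading described above.
not_dl_compatible = bool(literal_bodies(triples))
```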

Another possibility (I am making this up while writing this...) is that the current

        annotation->body->hasBody->@value
        annotation->target->hasTarget->

'triangle' could be relaxed, conceptually, for the simple case where the body is really just a literal, into something like

        annotation->@value
        annotation->target->hasTarget->

ie, that the separate resource for a body may be missing altogether, replaced by a direct value. In JSON terms, our example would then become something like:

"notes" : [{
                "hasTarget" : "URIforCSV#row=1234",
                "@value"    : "My favourite stats method is used for this"
        },
        ...
        ]
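
If this relaxed shape were allowed, a consumer could map it back onto the body structure before further processing. A sketch under that assumption; `lift_direct_value` is a hypothetical name, and the rule "a top-level @value stands for the body" is only the reading floated here, not anything defined by OA:

```python
def lift_direct_value(note):
    """Rewrite a note with a top-level "@value" into the hasBody form.

    Notes that already have a hasBody, or no "@value", are copied as-is.
    """
    if "@value" not in note or "hasBody" in note:
        return dict(note)
    rest = {k: v for k, v in note.items() if k != "@value"}
    rest["hasBody"] = {"@value": note["@value"]}
    return rest


note = {
    "hasTarget": "URIforCSV#row=1234",
    "@value": "My favourite stats method is used for this",
}
lifted = lift_direct_value(note)
```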

I am sorry if this turned out to be a bit long. Unfortunately, I cannot be at the F2F later today...

Thanks!

Ivan

[1] http://w3c.github.io/csvw/metadata/index.html


>
>
> Thanks, and see many of you tomorrow :)
>
> Rob
>
> --
> Rob Sanderson
> Technology Collaboration Facilitator
> Digital Library Systems and Services
> Stanford, CA 94305


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me






 

Received on Tuesday, 28 October 2014 15:40:59 UTC