Re: [web-annotation] Define json-ld profile URI for OA serialization context and structure from Ivan Herman on 2015-04-12 (public-annotation@w3.org from April 2015)

From: Ivan Herman <ivan@w3.org>
Date: Sun, 12 Apr 2015 16:15:16 +0200
To: Robert Sanderson <azaroth42@gmail.com>
Cc: Ivan Herman via GitHub <sysbot+gh@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-Id: <80910E13-E4AF-4761-801E-13E108A0ACC9@w3.org>
Thanks Rob, I understand.

Using the JSON-LD mechanism is of course doable and probably the right way to do it.

However, let me raise a (small :-) red flag here: imho we should not really encourage the creation of such profiles unless they are really necessary for a community. If we have a proliferation of those profiles, and systems want to exchange annotation data, then those systems would need to have a full JSON-LD implementation at their disposal. There aren't that many of those around yet, and I think it would be good if various implementations could rely on a flavor that we define and which, though fully compatible with JSON-LD, does not require the use of a JSON-LD tool. (As you say yourself, many algorithms in the JSON-LD world are not (yet) normative; I have no idea whether they are stable, for example.)

Cheers

Ivan

P.S. We had this issue in the CSVW WG for the metadata. The way the group defines the metadata (in JSON) tries to do what I described: be compatible with JSON-LD, i.e., the metadata can be used by an RDF toolkit, but it can also be processed "directly", so to say, without using a JSON-LD parser. I have the impression that our model document does the same, in fact. Whether we define it as a different profile or a different media type is, in this respect, a secondary issue.
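
To illustrate what processing "directly" could look like, here is a minimal sketch in Python that uses only a plain JSON parser. The key names ("@type", "hasBody", "hasTarget"), the type value and the example URLs are assumptions made for the illustration; they are not taken from the model document.

```python
import json

def read_annotation(text):
    """Read an annotation with a plain JSON parser, no JSON-LD tooling.

    This only works if the serialization has one agreed, fixed shape;
    the key names used here are assumed for the example.
    """
    data = json.loads(text)
    if data.get("@type") != "oa:Annotation":
        raise ValueError("not an annotation in the expected shape")
    return data["hasBody"], data["hasTarget"]

# Hypothetical document in the assumed fixed shape.
example = json.dumps({
    "@context": "http://example.org/annotation-context.jsonld",
    "@type": "oa:Annotation",
    "hasBody": "http://example.org/body1",
    "hasTarget": "http://example.org/page1",
})

body, target = read_annotation(example)
```

The point of the sketch is that the client never looks inside "@context": it relies on the keys and the nesting being fixed in advance.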


> On 10 Apr 2015, at 17:25 , Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> 
> Hi Ivan,
> 
> On Fri, Apr 10, 2015 at 12:52 AM, Ivan Herman via GitHub <sysbot+gh@w3.org> wrote:
> I would like to understand it more. Do you mean that there will be
> different flavors of JSON-LD using the annotation model, or that the
> JSON-LD used *for* the encoding of the annotation model is, *by
> itself* a flavor of JSON-LD and we have to allow for content
> negotiations to find that out?
> 
> Both are possible, and hence identification is important.
> 
> In terms of the flavors of JSON-LD, there are several dimensions:
> 
> * Different context documents will result in different keys being used in the serialization to represent the same predicates in the vocabulary. As it's a trivial transformation to move between contexts, one that's already implemented in all of the JSON-LD libraries (AFAIK), systems can easily switch contexts to the one that's native to them.
> As a concrete example, the IIIF community wanted a less RDF-y set of JSON keys for using annotations to link images [1], along the lines of issue #12 [2].
> So we reused the same model and vocabulary, but included new mappings in the IIIF context.
> 
> So for a system that supports the model and vocabulary, it's easy to support multiple communities' serializations if those serializations are just context switching.  However the system needs to know which of those serializations to use on any given request, including creation, updating and retrieval.  This can be done by introspection on the content to look at the @context, but that doesn't determine the shape of the JSON, just the keys used.
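
As a rough illustration of the context switching described above, here is a minimal sketch in Python. It is not the JSON-LD expansion or compaction algorithm; it only renames top-level keys. The two contexts and their term names are hypothetical simplifications, not copies of any published context document.

```python
OA = "http://www.w3.org/ns/oa#"

# Hypothetical, heavily simplified contexts: term -> predicate IRI.
CONTEXT_A = {"hasBody": OA + "hasBody", "hasTarget": OA + "hasTarget"}
CONTEXT_B = {"resource": OA + "hasBody", "on": OA + "hasTarget"}

def expand_keys(doc, context):
    """Replace each known term with the IRI it maps to."""
    return {context.get(key, key): value for key, value in doc.items()}

def compact_keys(doc, context):
    """Replace each known IRI with the term the context assigns to it."""
    reverse = {iri: term for term, iri in context.items()}
    return {reverse.get(key, key): value for key, value in doc.items()}

def switch_context(doc, source, target):
    """Move a document from one set of keys to another via the IRIs."""
    return compact_keys(expand_keys(doc, source), target)

doc_a = {"hasBody": "http://example.org/body1",
         "hasTarget": "http://example.org/page1"}
doc_b = switch_context(doc_a, CONTEXT_A, CONTEXT_B)
```

Note that only the keys change: the nesting of the document is untouched, which is why the context alone does not determine the shape of the JSON.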
> 
> * Different structures for the same context are also possible.  The (non-normative) framing part of JSON-LD can specify the structure, or like in the model document, it can be done at the human documentation level.  For us it makes sense that Annotation is the root node in the JSON tree, but from an agent identity management system's perspective, the Annotator should be the root node, and the annotation is just something that the person has created.
> 
> * Different levels of embedding are possible.  If you want JUST the annotation node in the graph, and only references to the body, target and other linked resources, that could still be the same context and structure, just a different way of partitioning the graph.  Embedding or not can also be expressed in frames at the granularity of resources, but not individual properties.
> 
> * There are also several document forms defined in the specification, including:
>   + Expanded -- take the context and resolve all of the mappings so everything is a URI
>   + Compacted -- the reverse of expanded, without a specification about the structure
>   + Flattened -- instead of a nested tree, every resource is separated into a list.  The equivalent of striped RDF/XML
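
To show the difference between a nested tree and the flattened form, here is a simplified sketch in Python. It is not the JSON-LD flattening algorithm: it assumes every nested resource is a dict carrying an "@id", and it ignores arrays and blank nodes. The key names and URLs are made up for the example.

```python
def flatten(node, out=None):
    """Split a nested tree into a list of nodes linked by "@id" references."""
    if out is None:
        out = []
    flat = {}
    out.append(flat)
    for key, value in node.items():
        # A dict with an "@id" plus other keys is an embedded resource:
        # move it to the list and leave only a reference behind.
        if isinstance(value, dict) and "@id" in value and len(value) > 1:
            flatten(value, out)
            flat[key] = {"@id": value["@id"]}
        else:
            flat[key] = value
    return out

nested = {
    "@id": "http://example.org/anno1",
    "body": {"@id": "http://example.org/body1", "chars": "A comment"},
    "target": {"@id": "http://example.org/page1"},
}
nodes = flatten(nested)
```

In this sketch the target is already just a reference, so it stays where it is; only the embedded body becomes a separate entry in the list.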
> 
> 
> To resolve this (as per the links in the issue), the media type for JSON-LD defines a parameter called "profile" that is a whitespace-separated list of URIs.  Each URI is a profile that specifies some set of constraints on the above.  By putting it in a parameter, clients can still key off of the application/ld+json media type to understand that what they have is JSON-LD, but specific clients can use the profile information for potentially more efficient or different processing.  It's a solution that we didn't have in the XML world, to our great detriment, for determining the schema and its version; everything is just application/xml to avoid proliferating the number of media type registrations.  Imagine if there were an application/schema-version+xml for every version of every XML schema!
> 
> By having it on the media type, it allows the client to use content negotiation to request a particular flavor of JSON-LD serialization of the same resource, or to specify the flavor that it's using to POST/PUT when creating/updating a resource.  It allows the server to specify on the response what flavor it is returning.  And it doesn't necessarily involve registering anything with IANA, though I think that we should do that as good information standards community citizens.
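
A minimal sketch in Python of how a client or server might build and read such a header value. The function names are hypothetical, and the second profile URI is a placeholder, not a URI anyone has defined; the parsing relies on the standard library's email.message module.

```python
from email.message import Message

JSONLD = "application/ld+json"

def media_type_with_profiles(profiles):
    """Build a value usable in Accept or Content-Type headers."""
    return '{}; profile="{}"'.format(JSONLD, " ".join(profiles))

def parse_profiles(header_value):
    """Return (media type, list of profile URIs) from a header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    raw = msg.get_param("profile", failobj="")
    # The parameter is one quoted string holding space-separated URIs.
    return msg.get_content_type(), raw.split()

header = media_type_with_profiles([
    "http://www.w3.org/ns/json-ld#compacted",
    "http://example.org/profiles/annotation",  # placeholder profile URI
])
media_type, profiles = parse_profiles(header)
```

A client that does not care about profiles can look at the media type alone and ignore the parameter; one that does can check whether the profile it expects is in the list.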
> 
> [1] http://iiif.io/api/presentation/2.0/#image-resources
> [2] https://github.com/w3c/web-annotation/issues/12
> 
> 
> If the latter: note that the CSVW Working Group defines a [metadata
> format for CSV files](http://w3c.github.io/csvw/metadata/index.html)
> which also restricts JSON-LD in some sense. I.e., it is a different
> 'flavor', so to say. The decision we took was to simply define a
> different media type, derived from JSON; that may be more flexible for
> clients that want JSON and they do not care about the relationship to
> JSON-LD. I am not saying this is what we should do, but it may be a
> data point to consider.
> 
> -1 to defining new media types when a solution already exists.  Browsers and other clients would then need to process n different media types that are all really just JSON, and really just JSON-LD.  There's already a big enough issue with application/json and application/ld+json in terms of browser and web server support (e.g. mime magic mapping of .json and .jsonld).  Consider XML schemas, or all of the many many many plain JSON API formats out there already -- if everyone were to take that approach, the media type list would be inflated many-fold and implementers would throw up their hands in despair.
> 
> Rob
> 
> --
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Sunday, 12 April 2015 14:15:25 UTC