Re: [web-annotation] Define json-ld profile URI for OA serialization context and structure from Robert Sanderson on 2015-04-10 (public-annotation@w3.org from April 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Fri, 10 Apr 2015 08:25:30 -0700
To: Ivan Herman via GitHub <sysbot+gh@w3.org>
Cc: Web Annotation <public-annotation@w3.org>
Message-ID: <CABevsUH+Rzd_mmM2-2otu-qN-2S501MWdiHqQedNEYYjQCoDFw@mail.gmail.com>

Hi Ivan,

On Fri, Apr 10, 2015 at 12:52 AM, Ivan Herman via GitHub <sysbot+gh@w3.org>
wrote:

> I would like to understand it more. Do you mean that there will be
> different flavors of JSON-LD using the annotation model, or that the
> JSON-LD used *for* the encoding of the annotation model is, *by
> itself* a flavor of JSON-LD and we have to allow for content
> negotiations to find that out?
>

Both are possible, and hence identification is important.

In terms of the flavors of JSON-LD, there are several dimensions:

* Different context documents will result in different keys being used in
the serialization to represent the same predicates in the vocabulary. As
it's a trivial transformation to move between contexts, one that's already
implemented in all of the JSON-LD libraries (AFAIK), systems can easily
switch contexts to the one that's native to them.
As a concrete example, the IIIF community wanted a less RDF-y set of JSON
keys for using annotations to link images [1], along the lines of issue #12
[2].
So we reused the same model and vocabulary, but included new mappings in
the IIIF context.

So for a system that supports the model and vocabulary, it's easy to
support multiple communities' serializations if those serializations differ
only in their contexts.  However, the system needs to know which of those
serializations to use on any given request, including creation, updating
and retrieval.  This can be done by introspection on the content to look at
the @context, but that determines only the keys used, not the shape of the
JSON.

* Different structures for the same context are also possible.  The
(non-normative) framing part of JSON-LD can specify the structure, or, as
in the model document, the structure can be described in human-readable
documentation.  For us it makes sense that Annotation is the root node in
the JSON tree, but from an agent identity management system's perspective,
the Annotator should be the root node, and the annotation is just something
that the person has created.

* Different levels of embedding are possible.  If you want JUST the
annotation node in the graph, and only references to the body, target and
other linked resources, that could still be the same context and structure,
just a different way of partitioning the graph.  Embedding or not can also
be expressed in frames at the granularity of resources, but not individual
properties.

* There are also several document forms defined in the specification,
including:
  + Expanded -- take the context and resolve all of the mappings so
everything is a URI
  + Compacted -- the reverse of expanded, without a specification about the
structure
  + Flattened -- instead of a nested tree, every resource is separated into
a list.  The equivalent of striped RDF/XML


To resolve this (as per the links in the issue), the media type for JSON-LD
defines a parameter called "profile" that is a whitespace-separated list of
URIs.  Each URI is a profile that specifies some set of constraints on the
above.  By putting it in a parameter, clients can still key off of the
application/ld+json media type to understand that what they have is JSON-LD,
but specific clients can use the profile information for potentially more
efficient or different processing.  It's a solution that we didn't have in
the XML world, to our great detriment, for determining the schema and its
version; everything is just application/xml to avoid proliferating the
number of media type registrations.  Imagine if there were
application/schema-version+xml for every version of every XML schema!

By having it on the media type, it allows the client to use content
negotiation to request a particular flavor of JSON-LD serialization of the
same resource, or to specify the flavor that it's using to POST/PUT when
creating/updating a resource.  It allows the server to specify on the
response what flavor it is returning.  And it doesn't necessarily involve
registering anything with IANA, though I think that we should do that as
good information standards community citizens.
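
As a rough sketch of what that looks like on the wire, the Python below
builds a media type value carrying a profile parameter and parses one back
out, using only the standard library.  The annotation profile URI is a
hypothetical placeholder (no such URI has been defined yet), and the helper
names are mine; the "compacted" URI is one of the profiles defined by the
JSON-LD media type registration.

```python
from email.message import Message

JSONLD = "application/ld+json"

# Hypothetical profile URI, used only for illustration.
ANNO_PROFILE = "http://example.org/profiles/annotation"
# Profile URI defined by the JSON-LD media type registration.
COMPACTED = "http://www.w3.org/ns/json-ld#compacted"


def with_profiles(profiles):
    """Build a media type value whose profile parameter lists the given URIs."""
    return '{}; profile="{}"'.format(JSONLD, " ".join(profiles))


def parse_profiles(header_value):
    """Return (media type, list of profile URIs) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    profile = msg.get_param("profile", "")
    return msg.get_content_type(), profile.split()


# A client could send this as Accept (retrieval) or Content-Type (POST/PUT).
header = with_profiles([ANNO_PROFILE, COMPACTED])
media_type, profiles = parse_profiles(header)

# A client that ignores profiles still sees plain JSON-LD.
plain_type, no_profiles = parse_profiles(JSONLD)
print(header)
```

A generic client only needs `media_type`; a profile-aware client or server
can additionally dispatch on the entries in `profiles`.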

[1] http://iiif.io/api/presentation/2.0/#image-resources
[2] https://github.com/w3c/web-annotation/issues/12


>
> If the latter: note that the CSVW Working Group defines a [metadata
> format for CSV files](http://w3c.github.io/csvw/metadata/index.html)
> which also restricts JSON-LD in some sense. Ie, it is a different
> 'flavor', so to say. The decision we took was to simply define a
> different media type, derived from JSON; that may be more flexible for
>  clients that want JSON and they do not care about the relationship to
>  JSON-LD. I am not saying this is what we should do, but it may be a
> data point to consider.
>

-1 to defining new media types when a solution already exists.  Browsers
and other clients would then need to process n different media types that
are all really just JSON, and really just JSON-LD.  There's already a big
enough issue with application/json and application/ld+json in terms of
browser and web server support (e.g., mime magic mapping of .json and
.jsonld).  Consider XML schemas, or all of the many, many plain JSON API
formats out there already: if everyone were to take that approach, the
media type list would be inflated many-fold and implementers would throw up
their hands in despair.

Rob

-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Friday, 10 April 2015 15:26:03 UTC