Re: JSON-LD serialization and linked data support from Benjamin Young on 2015-08-14 (public-annotation@w3.org from August 2015)

From: Benjamin Young <bigbluehat@hypothes.is>
Date: Fri, 14 Aug 2015 09:27:38 -0400
To: Ivan Herman <ivan@w3.org>
Cc: James M Snell <jasnell@gmail.com>, Frederick Hirsch <w3c@fjhirsch.com>, W3C Public Annotation List <public-annotation@w3.org>, Tim Cole <t-cole3@illinois.edu>, Rob Sanderson <azaroth@stanford.edu>
Message-ID: <CAE3H5FL-p2MMU0hw+UVupS-zQ1ossLez4UwoW08AnueWmjpXWg@mail.gmail.com>
On Fri, Aug 14, 2015 at 1:07 AM, Ivan Herman <ivan@w3.org> wrote:

>
> > On 13 Aug 2015, at 20:34 , Benjamin Young <bigbluehat@hypothes.is>
> wrote:
> >
> > On Thu, Aug 13, 2015 at 12:10 PM, Ivan Herman <ivan@w3.org> wrote:
> >
> > > On 13 Aug 2015, at 17:06 , James M Snell <jasnell@gmail.com> wrote:
> > >
> > > If I can interject a few thoughts from the sidelines... I faced a
> > > similar decision with regards to Activity Streams 2.0 -- only I came
> > > at it from the opposite point of view. That is, we had a pure JSON
> > > syntax to start and moved to a Vocabulary model with a JSON-LD syntax.
> > > One of the key goals of this move, however, has been to make sure that
> > > developers who wish to ignore the JSON-LD processing model can do so
> > > if they wish -- albeit at a cost of some features.
> > >
> > > The short version of the story is that Activity Streams 2.0 builds on
> > > JSON-LD but requires only a subset of what JSON-LD provides. For
> > > instance, the data format *requires* JSON-LD compact form
> > > serialization, it requires use of a normative JSON-LD @context
> > > definition that ensures consistent serialization, it strongly
> > > recommends that certain JSON-LD features are avoided, and -- perhaps
> > > most importantly -- does not require that developers implement the
> > > full RDF world view in order to make sense of the data.
> > >
> >
> > FWIW, we have arrived at something similar in the CSV on the Web Working
> Group. That WG defines metadata for CSV data; the format is JSON-LD
> compatible but we, essentially, defined a subset that should be manageable
> without JSON-LD tools. We actually pushed back on features that would
> have required such tools.
> >
> > Do you have a link for that? I'd like to read up. :)
>
> Well, it is difficult to give a specific link to the discussion proper.
> The result is the four documents that the group published recently:
>
> - The abstract 'annotated tabular data model':
>   http://www.w3.org/TR/2015/CR-tabular-data-model-20150716/
>
> (The term annotation is used in the 'traditional' sense, not in the Web
> Annotation sense, meaning a bunch of information added to various elements
> of tabular data)
>
> - One way of expressing metadata, that can be mapped to the abstract
> tabular data model:
>   http://www.w3.org/TR/2015/CR-tabular-metadata-20150716/
>
> The metadata itself is expressed as a slightly specialized JSON-LD. It is
> a JSON expression of the metadata with very few JSON-LD specific features.
> As James said, it is usable and interpretable without a JSON-LD processor,
> but with the suitable @context it can be used as such, ie, it can be
> considered as RDF metadata of a specific CSV content
>
> - Mapping of the general annotated tabular data model to JSON:
>   http://www.w3.org/TR/2015/CR-csv2json-20150716/
>
> No mention of anything JSON-LD. The structure is close to the RDF mapping
> but not strictly so; the goal was to provide something meaningful for JSON
> users
>
> - Mapping of the general annotated tabular data model to RDF:
>   http://www.w3.org/TR/2015/CR-csv2rdf-20150716/
>
> Note that the mapping is on *RDF*, and is agnostic to any specific
> serialization.
>
> I hope it helps
>

It did! Thank you, Ivan.

Here are the most interesting links I gleaned (given our topic):

http://www.w3.org/TR/2015/CR-csv2json-20150716/#json-ld-to-json
"This section defines a mechanism for transforming the [json-ld
<http://www.w3.org/TR/2015/CR-csv2json-20150716/#bib-json-ld>] dialect
<http://www.w3.org/TR/2015/CR-tabular-metadata-20150716/#json-ld-dialect>
used for non-core annotations
<http://www.w3.org/TR/2015/CR-csv2json-20150716/#dfn-non-core-annotations>
and notes <http://www.w3.org/TR/2015/CR-csv2json-20150716/#dfn-notes>
originating from the processing of metadata (as defined in [tabular-metadata
<http://www.w3.org/TR/2015/CR-csv2json-20150716/#bib-tabular-metadata>])
into JSON."

Essentially, turning {"@value": "content"} into just "content". The output
JSON is (therefore) simpler, but at the cost of:
a) needing to know which you have: JSON or JSON-LD
b) possibly parsing for both--just in case someone else didn't do the
conversion
c) losing @language and @type--which means it's a lossy transition...

This is our "bodies as string literals" discussion in someone else's spec.
;)
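
To make the lossiness concrete, here is a minimal sketch of that flattening
in Python. It is not the algorithm from the csv2json spec, just an
illustration of the idea: the function name `flatten_value_objects` and the
sample "body" key are made up for this example.

```python
from typing import Any


def flatten_value_objects(node: Any) -> Any:
    """Recursively replace JSON-LD value objects with their bare @value.

    Hypothetical helper for illustration only; not taken from any spec.
    """
    if isinstance(node, dict):
        if "@value" in node:
            # Value object: keep only the literal. Any @language or @type
            # sitting next to @value is dropped here, which is the lossy part.
            return node["@value"]
        return {key: flatten_value_objects(val) for key, val in node.items()}
    if isinstance(node, list):
        return [flatten_value_objects(item) for item in node]
    # Plain strings, numbers, booleans and null pass through unchanged.
    return node


# Illustrative input: two bodies that differ only by language tag.
annotation = {
    "body": [
        {"@value": "content", "@language": "en"},
        {"@value": "contenu", "@language": "fr"},
    ]
}

plain = flatten_value_objects(annotation)
print(plain)
```

After the conversion both bodies are bare strings, and nothing in the output
says which one was English and which one was French.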

As you scroll down from that link (and you should!), you'll find examples
of "just JSON" and JSON-LD.

As ever, one is more "legible" and the other one more "valuable" (as it
contains more information and ways to understand what it's saying in the
first place).

For now, I'd be +1 on our digging deeper into the multiple bodies scenario
and trying (some more) to see how simple we can get it--in both "just JSON"
and JSON-LD.

I'm not sure our other examples are so painful as to warrant throwing out
the `-LD` bits, and I'm also not sure that this multiple-bodies thing
warrants doing that either as we've only just begun throwing examples out.

Let's get back to crafting code and content, and see what bridges we can
build. :)


>
> Ivan
>
>
>
> >
> > Cheers!
> > Benjamin
> >
> >
> > Ivan
> >
> >
> > > A similar approach can be applied here. By defining a normative
> > > JSON-LD @context and requiring compact serialization using that
> > > @context, and by limiting the JSON-LD specific features you depend on,
> > > you can place practical limits on those various JSON-LD idiosyncrasies
> > > that everyone loves to hate.
> > >
> > > - James
> > >
> > > On Thu, Aug 13, 2015 at 6:16 AM, Ivan Herman <ivan@w3.org> wrote:
> > >> Frederick, I put Tim and Rob into the Cc list just to make it clear
> that this is not a direct answer to this mail but, rather, the three mails
> in this thread ([1,2]), and also Rob's separate mail[3].
> > >>
> > >> (Apologies if parts of what I write are obvious to some of the people
> in the group. It may not be for others…)
> > >>
> > >> The annotation model is *not* in JSON-LD. Nor is it in Turtle, for
> that matter. It is in RDF. RDF is defined in terms of abstract concepts
> (IRI-s as identifiers, literals, blank nodes, triples, etc.) defined in the
> RDF1.1 Concept document[4]; that document is *serialization agnostic*.
> (<digress> it has been one of the biggest mistakes ever in the history of
> RDF that the concept and a particular serialization in XML, ie, RDF/XML,
> have been conflated in the story line. This has done more harm to RDF than
> anything else!</digress>). There are quite a number of serialization
> syntaxes (Turtle, JSON-LD, RDFa, N-Triples, RDF/XML, there is even a simple
> JSON serialization, though not as a Rec).
> > >>
> > >> I believe that, at this point, nobody (including Paolo) is
> considering moving away from the model. It is a model in RDF and, so far,
> it has served us well. In other words, we are firmly in the domain of
> Linked Data. We should get this issue off the table.
> > >>
> > >> RDF can be serialized. We already use two of those in our document:
> Turtle and JSON-LD. Other people may use other serializations for OA: RDFa
> or, (God forbid!) RDF/XML. The model is oblivious to that and we cannot
> even forbid that to happen.
> > >>
> > >> In my *personal* opinion, Semantic Web people would use Turtle, which
> is a simple, straightforward representation of the model. But it is an
> alien syntax to most, so we decided to push JSON to the fore. To achieve
> that, we are looking at a particular *serialization* of RDF, which is
> JSON-LD. We are hoping that this works for us, including those among us who
> do not care about RDF. But JSON-LD has its idiosyncrasies that some may
> live with, but others do not. It has the advantage of being a generic RDF
> serialization, but it also has the disadvantage of being a generic RDF
> serialization:-)
> > >>
> > >> Here comes Paolo's proposal (at least the way I understand it): let
> us *replace* the JSON-LD serialization with a dedicated JSON serialization
> of our model. Ie, we drop the -LD *from the syntax* (but that does not mean
> dropping Linked Data) and we may replace it with -OA to yield something
> like JSON-OA. What a JSON-LD processor does is to map a generic JSON-LD
> file to the abstract RDF model; well, we can define a processor that does
> the same *to a very restricted JSON syntax* that is defined for the
> annotation model only. There is no real interoperability issue: we drop
> JSON-LD, and we require JSON-OA to be the interchange format; for Linked
> Data aware systems there is a processor that maps this to the internal
> representation of RDF, whereas non-Linked Data aware systems can use that
> particular JSON dialect only.
> > >>
> > >> In fact, this is not so far off from what Rob proposed in [1]:
> > >>
> > >> [[[
> > >> * Define the model to fully encapsulate all of the requirements
> without taking into consideration any serialization or convenience.
> > >> * The on-the-wire bits are the JSON-LD serialization of that model.
> We can discuss later whether we need to require a specific crystalization
> or whether we can just say JSON-LD.
> > >> * We provide implementations that take that serialization and further
> compact it into whatever structure is most useful, but those are
> non-normative. They're code that we can write to make developers' lives
> easier.
> > >> ]]]
> > >>
> > >> But, I think:
> > >>
> > >> * Per point 1: we have the model, and we should not change it
> > >> * Per point 2: we can, actually, use JSON-OA as the on-the-wire
> bits as a serialization of that model (yeah, I know, this is a bit touchy
> with the definition of LDP, let us see whether we can solve that)
> > >> * Per point 3: JSON-OA *may* be the normative serialization and we
> ditch JSON-LD altogether
> > >>
> > >> This approach may or may not work. Tim may be right that the proper
> modeling of the problem area would lead us to a certain level of
> complication anyway, and the whole thing may not lead to a real
> simplification compared to JSON-LD. In which case we declare this a dead
> end and we may be stuck with JSON-LD. But let us not pretend that by trying
> to do that we create more interoperability problems (we don't, because there
> is a plethora of RDF serializations out there already) or that we drop
> Linked Data approach from our model (we don't because we touch only a
> particular serialization of the model).
> > >>
> > >> Ivan
> > >>
> > >> P.S. a different remark: yes, JSON-LD is included in schema.org, ie,
> Google think it is ready and easy for… webmasters! Not developers in
> general…
> > >>
> > >>
> > >> [1]
> http://www.w3.org/mid/CABevsUFyszpujiZq2qGd-wUQVvzzBgHY6K9sAKcatyjdj16PUA@mail.gmail.com
> > >> [2]
> http://www.w3.org/mid/009201d0d585$696b9810$3c42c830$@illinois.edu
> > >> [3]
> http://www.w3.org/mid/CABevsUGMeisPtx3xgxv1Dy52nmnUuoaRwWfi2Q10X5QJhr-0JA@mail.gmail.com
> > >> [4] http://www.w3.org/TR/rdf11-concepts/
> > >>
> > >>
> > >>> On 13 Aug 2015, at 24:15 , Frederick Hirsch <w3c@fjhirsch.com>
> wrote:
> > >>>
> > >>> On today's call the topic of serializations came up and a question
> seemed to be raised over whether JSON-LD should be used (perhaps I heard
> incorrectly)
> > >>>
> > >>> There are some strong reasons to continue to require JSON-LD as a
> mandatory serialization, the abstract argument being the value of linked
> data on the back end.
> > >>>
> > >>> A specific concrete example of the value of linked data in
> combination with annotations might be "CATCH: Common Annotation, Tagging,
> and Citation at Harvard"
> > >>>
> > >>> [[
> > >>>
> > >>> It is designed to interoperate with third-party annotation tools to
> aggregate and associate contextualized annotation metadata from various
> pedagogical and research tools with reference to persistent digital media
> in repositories, such as the Harvard Library DRS. - See more at:
> https://osc.hul.harvard.edu/liblab/projects/catch-common-annotation-tagging-and-citation-harvard#sthash.fr7L4qa3.dpuf
> > >>>
> > >>> ]]
> > >>>
> > >>> Do we have other concrete examples of how the linked data aspect of
> the Open Annotation model adds value to annotations? Pointers would be
> welcome.
> > >>>
> > >>> I'm concerned about specifying multiple serializations as we have to
> be more careful of interoperability in this case, specifically is
> round-tripping without information loss despite the serialization a
> potential issue? More serializations also mean more testing.
> > >>>
> > >>> In a related thought, is directly embedding JSON-LD in HTML (
> http://www.w3.org/TR/json-ld/#embedding-json-ld-in-html-documents ) a
> viable option? What is the status of browser support for this? If it is
> supported (or is in progress) what is the case for HTML serialization as an
> alternative? Would it be more productive to focus on generic support for
> JSON-LD in browsers rather than a specific annotation serialization?
> > >>>
> > >>> The fundamental issue I heard us discuss is that even with all our
> efforts to simplify the JSON-LD serialization, there will remain some
> aspects that do not appear 'natural' to JSON developers.  The next question
> I have is whether these aspects can be managed with suitable libraries etc.
> > >>>
> > >>> Thanks
> > >>>
> > >>> regards, Frederick
> > >>>
> > >>> Frederick Hirsch
> > >>>
> > >>> www.fjhirsch.com
> > >>> @fjhirsch
> > >>>
> > >>>
> > >>
> > >>
> > >> ----
> > >> Ivan Herman, W3C
> > >> Digital Publishing Activity Lead
> > >> Home: http://www.w3.org/People/Ivan/
> > >> mobile: +31-641044153
> > >> ORCID ID: http://orcid.org/0000-0003-0782-2704
> > >>
> > >>
> > >>
> > >>
> > >
> >
> >
>
>
Received on Friday, 14 August 2015 13:28:11 UTC