Re: Turtle Trouble (was: Re: [protocol] Patch formats) from Benjamin Young on 2015-06-10 (public-annotation@w3.org from June 2015)

From: Benjamin Young <bigbluehat@hypothes.is>
Date: Wed, 10 Jun 2015 10:13:45 -0400
To: Ivan Herman <ivan@w3.org>
Cc: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>, Robert Sanderson <azaroth42@gmail.com>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CAE3H5FLmW==DRQ-02QOhehQLgp+CrrmW0Ab3B61Cf3sGobkM-w@mail.gmail.com>
On Wed, Jun 10, 2015 at 9:11 AM, Ivan Herman <ivan@w3.org> wrote:

>
> > On 10 Jun 2015, at 15:03 , Stian Soiland-Reyes <
> soiland-reyes@cs.manchester.ac.uk> wrote:
> >
> > While I understand the desire to simplify server development, I don't
> > see how it can be a big hurdle for a server implementation to generate
> > Turtle from JSON-LD or vice versa as there is a plethora of RDF
> > support for almost any practical programming language (which one have
> > you got in mind?),
>
> Stian,
>
> I believe the issue is that this statement may not be true. I am happy to
> see Java is covered (I suspected that would be the case) and so is Ruby or
> Python. But my experience with support in Javascript is not that good
> (although it would be feasible to write a server in node.js). I am happy if
> I am proven wrong, though.
>


The core point is less about "are there tools / libraries available" and
more about "how hard is it for developers to build a server or client."

Right now, it looks like if they're building a server they'll need to (at
least):
a) know what Turtle is
b) find a tool for their language to transform it into something they can
store
c) know how to transform it back to Turtle (for those who ask)
d) know if they've done any of that correctly (which assumes they
understand graphs, transformations, etc)

For folks building a client, it's less complex:
a) know how to send a proper Accept header
b) know that JSON-LD can be treated as "just JSON"
c) know how to follow links found in HTTP headers

The client ones are hopefully pretty painless for anyone who's done "AJAX"
in the last half decade. ;)
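
A minimal sketch of those three client steps, in Python with only the
standard library. The function names and example URLs are hypothetical, and
the Link parser is deliberately simplified: it only handles the common
`<url>; rel="name"` form.

```python
import json
import re
import urllib.request

JSONLD = "application/ld+json"


def build_request(url):
    """Build a GET request that asks the server for JSON-LD."""
    return urllib.request.Request(url, headers={"Accept": JSONLD})


def parse_link_header(value):
    """Split an HTTP Link header value into (target, rel) pairs.

    Simplified: other link parameters (title, type, ...) are ignored.
    """
    links = []
    for match in re.finditer(r'<([^>]*)>\s*;\s*rel="?([^",;]+)"?', value):
        links.append((match.group(1), match.group(2)))
    return links


def read_annotation(body):
    """Treat the JSON-LD payload as "just JSON"."""
    return json.loads(body)
```

Passing the request to `urllib.request.urlopen` would perform the actual
fetch; the response's `Link` header can then go through `parse_link_header`.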

The server ones, though, likely don't map to "most" (...I've not got a
ruler handy...) developers--especially those who are not working (and/or
have not worked) with anything that thinks in graphs.

In the NoSQL world (for one place), there are *loads* of databases that
speak JSON on the wire and can store JSON-LD without any additional setup
or work (Apache CouchDB, Basho's Riak, and MongoDB among them). CouchDB (at
least) also has semantics that nearly match LDP's and, were it not for the
Turtle requirement, could be very simply wrapped to accept JSON-LD from an
LDP client, store it, send it back when asked, and generate a container
listing.

I started down such a road--building a CouchApp that lives inside CouchDB
with its in-database JS engine (based on SpiderMonkey).
https://github.com/BigBlueHat/ldp-on-couchdb

All was well, until I hit the Turtle requirement, and then I got sucked into
the undertow of data transformations. :-/

I intend to revisit that project soon--ignoring the Turtle requirement for
now--and see how far I get. It won't be an LDP server (...so its name will
eventually change...), but it will likely be a nearly matching server that
would work for an Annotation API, "cost" developers little in terms of
know-how to see what it's doing ("JSON goes in; JSON comes out"), and still
be kind-a-sort-a close to the LDP spec (or at least as close as I can get
it). :)
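
To make "JSON goes in; JSON comes out" concrete, here is a rough sketch in
Python. It is not the ldp-on-couchdb code: the `JsonContainer` class is
hypothetical, an in-memory dict stands in for the JSON store, and the keys
in the listing are only loosely modelled on LDP rather than a conformant
container representation.

```python
import json
import uuid


class JsonContainer:
    """In-memory stand-in for a JSON document store (hypothetical)."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/") + "/"
        self.docs = {}

    def post(self, body):
        """Store a JSON-LD document as plain JSON and return its new URL."""
        doc = json.loads(body)
        url = self.base_url + uuid.uuid4().hex
        doc["@id"] = url
        self.docs[url] = doc
        return url

    def get(self, url):
        """Return the stored document, serialized back to JSON."""
        return json.dumps(self.docs[url])

    def listing(self):
        """Return a container listing that names every stored document."""
        return json.dumps({
            "@id": self.base_url,
            "@type": "ldp:BasicContainer",
            "contains": sorted(self.docs),
        })
```

No graph model is involved at any point: the store never needs to know what
the `-LD` part means.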

Building LDP-based Annotation API servers on any of these other JSON stores
will be similar. If there's a Turtle requirement, the implementer will have
to a) care and b) know how to Do It Right (...both directions). I'm not
sure that's most developers...

Making Turtle optional would solve that problem (afaik). Perhaps the
Annotation API looks like a limited subset of LDP. Perhaps it looks like
an API that copied all the easy answers out of LDP's textbook.
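
One way "Turtle optional" could look on the server side, sketched in Python.
This is an assumption about a possible design, not anything the group has
agreed: the `choose_media_type` function is hypothetical, it ignores
q-values, and it answers with JSON-LD when the Accept header is absent
(where LDP says a server SHOULD send Turtle).

```python
JSONLD = "application/ld+json"
TURTLE = "text/turtle"


def choose_media_type(accept, supported=(JSONLD,)):
    """Pick a response media type, or None for "406 Not Acceptable".

    Simplified: q-values are ignored and the first media type the
    client lists that the server supports wins.
    """
    if not accept:
        # Assumption: default to JSON-LD rather than Turtle.
        return supported[0]
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None
```

A JSON-only server keeps the default `supported` tuple; one that does speak
Turtle passes `supported=(JSONLD, TURTLE)` and nothing else changes.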

Regardless, I do feel there's a good way forward, and that this group will
find it. :)

Thanks!
Benjamin
--
Developer Advocate
http://hypothes.is/



>
> Ivan
>
>
> > with as you are mentioning here, the option to call
> > out to other binaries.
> >
> > As for generating, remember that N-Triples is valid Turtle and pretty
> > easy to make.  The clients don't need to deal with both formats as
> > they can just pick and stick with one of them.
> >
> >
> >
> >
> > Jena includes the "riot" command line tool which can be used
> > independently for say Turtle to JSON-LD or JSON-LD to Turtle, e.g.:
> >
> > stain@biggie-utopic:~/Downloads$ riot --output=jsonld void.ttl.gz
> > {
> >  "@graph" : [ {
> >    "@id" : "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#",
> >    "@type" : "void:DatasetDescription",
> >    "description" : "This is the VoID description for a ChEMBL-RDF
> dataset",
> >    "issued" : "2015-01-14T00:00:00.000Z",
> >    "title" : "ChEMBL-RDF VoID Description",
> >    "createdBy" : "http://orcid.org/0000-0002-8011-0300",
> >    "createdOn" : "2009-10-28T00:00:00.000Z",
> >    "lastUpdateOn" : "2015-01-14T00:00:00.000Z",
> >    "previousVersion" : "
> http://rdf.ebi.ac.uk/dataset/chembl/19.0/void.ttl#",
> >    "primaryTopic" : ":chembl_rdf_dataset"
> >  }, {
> > ...
> >
> > The auto-generated @context and @prefix are usually quite nice and
> > make readable JSON-LD, except that there is a bug in that it permits
> > "" as a namespace prefix.  (fixed for the next release -
> > https://issues.apache.org/jira/browse/JENA-934)
> >
> >
> > You can combine this with the JSON-LD Java tool "jsonldplayground" to
> > force a custom context or frame.
> > https://github.com/jsonld-java/jsonld-java
> >
> > It can also read/write Turtle directly - but you would have to
> > manually provide a --context to style it.
> >
> > stain@biggie-utopic:~/src/jsonld-java$ ./jsonldplayground
> > Missing required option(s) [inputFile, process]
> > Option (* = required)                  Description
> > ---------------------                  -----------
> > --base <base URI>                      (default: )
> > --context <File: The context>
> > --format [RDFFormat: The output file
> >  format to use. Defaults to nquads.
> >  Valid values are: [turtle, rdfjson,
> >  rdfxml, trig, nquads, jsonld, trix,
> >  ntriples]]
> > --help
> > * --inputFile <File: The input file>
> > --outputForm [The way to output the    (default: expanded)
> >  results from fromRDF. Defaults to
> >  expanded. Valid values are:
> >  [compacted, expanded, flattened]]
> > * --process <The processing to
> >  perform. Valid values are: [expand,
> >  compact, frame, normalize, flatten,
> >  fromrdf, tordf]>
> >
> > On 10 June 2015 at 06:01, Ivan Herman <ivan@w3.org> wrote:
> >> The answer I have got is less upbeat than what I hoped for. Although
> there are tools that either exist or can be done, the quality, mainly in
> terms of human readability, is not equal. It seems that Gregg's ruby is the
> only one that tries to create a reasonable default context using prefixes
> defined in Turtle (or other serialization).
> >>
> >> In general, there aren't any purpose-built Turtle-to-JSON-LD libraries,
> just as you don’t find RDFa-to-Turtle libraries, it’s usually done as part
> of a system which includes multiple components. This is the case of the
> solution I have outlined: RDFLib is a large (Python) RDF library; it has a
> Turtle and a JSON-LD parser and serializer, ie, it is possible to use it as
> a transformer. I would suspect (but I am not sure) that Jena has something
> like that for Java, for example.
> >>
> >> However… coming back to the original issue, and also reflecting on
> Robert Casties, we may have to think (maybe together with the LDP group)
> whether it is acceptable to relax the requirements somehow, and turn the
> Turtle version into an optional feature.
> >>
> >> Ivan
> >>
> >>
> >> [[
> >> Mine is the only one I’m aware of that tries to create a reasonable
> default context using prefixes defined in Turtle (or other serialization).
> As you know, the algorithm doesn’t describe a way to construct a context
> automatically, but neither do any other RDF serializations, which focus on
> parsing rather than generating. Jena may do something.
> >>
> >> Also note that the Linked Open Vocabularies [1] group maintain a
> JSON-LD context with prefix definitions for all the vocabularies they
> maintain, which can be used with any JSON-LD toolchain by compacting the
> result using one or more context URLs, but won’t get vocabulary-specific
> term definitions. Schema’s can be used for schema.org, which is probably
> the best recommendation there. There’s a list of other supported context
> here [2].
> >>
> >> Note that you won’t find many purpose-built Turtle-to-JSON-LD
> libraries, just as you don’t find RDFa-to-Turtle libraries, it’s usually
> done as part of a system which includes multiple components. In my case,
> it’s possible to gather prefix definitions when parsing and forward them to
> the serializer, which is what the distiller does. Other libraries may
> provide a similar facility.
> >>
> >> In the case of an Annotation API server, I suspect they’re using a
> particular vocabulary, or at least control what they use, so they’re
> probably in the best position to create a context to apply to the JSON-LD
> serializer, just as they likely manage the prefixes used in the Turtle
> serialization; this will allow more control of the way the JSON-LD is
> shaped, as would using it as a JSON-LD Frame, and the Framing algorithm. I
> also have a tool to construct a context given an RDFS/OWL vocabulary [3].
> >>
> >> Probably best to suggest he ask on #jsonld, StackOverflow or
> public-linked-json@w3.org for other input.
> >>
> >>
> >>
> >>> On 08 Jun 2015, at 20:21 , Benjamin Young <bigbluehat@hypothes.is>
> wrote:
> >>>
> >>> On Mon, Jun 8, 2015 at 1:01 PM, Ivan Herman <ivan@w3.org> wrote:
> >>>
> >>>> On 08 Jun 2015, at 16:21 , Benjamin Young <bigbluehat@hypothes.is>
> wrote:
> >>>>
> >>>> This email's been in my head for awhile (too long probably) and this
> patch thread tipped it to the point of my fingers typing what's in my
> brain. :) So here goes...
> >>>>
> >>>> First (and foremost, maybe), I actually really like Turtle. :)
> >>>>
> >>>> However....it requires a way of thinking that the prevailing systems
> (non-graph databases, browsers, JS runtimes, etc) don't currently think in.
> >>>>
> >>>> We've addressed that in the data model by preferring / promoting the
> JSON-LD representation in examples--while still providing the Turtle
> representation for those that support it.
> >>>>
> >>>> Where things fall down for me (at least) are with the protocol
> specification--which, being based on LDP (which is otherwise quite
> fabulous) comes with the requirement that:
> >>>> http://www.w3.org/TR/ldp/#h4_ldprs-HTTP_GET
> >>>>> ...MUST respond with a Turtle representation...when the request
> includes an Accept header specifying text/turtle
> >>>>> ...SHOULD respond with a text/turtle...whenever the Accept request
> header is absent.
> >>>>> ...MUST respond with a application/ld+json representation...when the
> request includes an Accept header specifying application/ld+json
> >>>>
> >>>> What this means practically (afaik) is that an Annotation API server
> MUST be able to transform their stored info into both Turtle and JSON-LD
> (regardless of which was sent in).
> >>>>
> >>>> There aren't (that I've found) terribly many Turtle-to-JSON-LD
> transformation libraries. I've used this one (recently relicensed to Apache
> License 2.0) with varied success:
> >>>> https://github.com/warpr/turtle-to-jsonld
> >>>
> >>> I will ask around. I think if we allow for non-Javascript converters,
> too, then there are more. I know there is a JSON-LD module to RDFLib, so it
> is fairly easy to write a Python program to convert from one format to the
> other. Gregg Kellogg has a similar tool for Ruby. I think both are fairly
> good; the JSON-LD part was written by people from the JSON-LD group itself.
> I may get more info (from Gregg)
> >>>
> >>> (Note that I am mostly offline tomorrow, so the info may come on
> Wednesday only)
> >>>
> >>> Thanks, Ivan!
> >>>
> >>> Sorry if I misrepresented the "coverage area" of the tooling. The
> focus on JS was mostly browser-driven--which is where I currently expect
> most annotation clients live. Server-side stuff is more amenable of course.
> :)
> >>>
> >>> Thanks in advance for the links. It will be a useful list to reference
> whatever else we decide.
> >>>
> >>> Cheers!
> >>> Benjamin
> >>>
> >>>
> >>> ivan
> >>>
> >>>>
> >>>> However, that (plus its dependencies) provides a transformation to a
> JSON-LD format that may actually not be what one wants in the end, and then
> requires yet-more transformation and more understanding of the "meta model"
> by the API server (and/or database) to move between the formats.
> >>>>
> >>>> Here's where this hits the PATCH format options....
> >>>>
> >>>> On Sat, Jun 6, 2015 at 12:57 AM, Ivan Herman <ivan@w3.org> wrote:
> >>>>
> >>>>> On 05 Jun 2015, at 20:51 , Robert Sanderson <azaroth42@gmail.com>
> wrote:
> >>>>>
> >>>>>
> >>>>> In reading back through the discussion at the face to face about the
> protocol draft, it was noted that there are many possible patch formats,
> including LDPatch, JSON Patch, Sparql Update, diff and so on.  All would be
> possible to use, and some easier in different circumstances.
> >>>>>
> >>>>> Do we want to:
> >>>>>
> >>>>> a)  Specify one as a requirement (MUST) and let the others be usable
> (MAY)
> >>>>> b)  Not specify any as a requirement and just remain silent on which
> one to use.
> >>>>>
> >>>>> If B is the preference, then we would need to decide how the server
> advertises which of the PATCH formats it implements so that clients can
> determine how (if at all) they can interact.
> >>>>>
> >>>>> My preference is A, and to pick LDPatch (by reference) as part of
> the LDP stable of specifications, but what do people think?
> >>>>
> >>>> My preference is also A, although the issue of advertising may still
> be relevant. ('may'. We may decide not to address this issue.)
> >>>>
> >>>> If the database or API server you are building supports a graph-based
> "meta model" then supporting the transformation between Turtle and JSON-LD
> or LDPatch or anything else triple-based is "just some more code." :)
> >>>>
> >>>> However, if your database does not (most databases don't...even if
> they "speak" JSON), then handling LDPatch is an even farther reach than
> supporting Turtle.
> >>>>
> >>>> Here's an LDPatch example for those who are curious:
> >>>> http://www.w3.org/TR/ldpatch/#full-example
> >>>>
> >>>>
> >>>>> Benjamin suggested at the F2F a preference for JSON Patch, for
> example.
> >>>>
> >>>> Good memory, Rob! :)
> >>>>
> >>>> The preference is completely along these lines:
> >>>> - most available tooling supports JSON
> >>>> - "understanding" the `-LD` bit of JSON-LD is at some level
> "optional" (at least for storage, basic parsing, and transportation)
> >>>> - if a PATCH format is chosen, it should be equally "dumb" (in the
> best possible way) ;)
> >>>>
> >>>> Because:
> >>>> - if developers can deal with annotations as JSON (+/- the `-LD`
> knowhow), they can start using annotation data now with very little
> additional effort added to their stack
> >>>> - if developers *want* to use PATCH, having the option of a patch
> format equally as "dumb" (just dealing with keys and values, not triples),
> means (again) that they can start with (nearly) what they already know and
> have.
> >>>>
> >>>> Here's the list of examples from the JSON Patch (RFC 6902) spec:
> >>>> http://tools.ietf.org/html/rfc6902#appendix-A
> >>>>
> >>>>
> >>>> I do not have a strong feeling on whether it is json patch or
> ldpatch, not really familiar with the details and certainly no experience.
> I have a slight preference for JSON, however; as far as I can see,
> LDPatch is based on a turtle syntax, and we did make a decision to put
> JSON-LD forward as our primary syntax in the model (in view of our
> constituency). In this respect JSON patch seems to be more in line with the
> rest.
> >>>>
> >>>> I suppose it comes down to "cutting with the grain" of what's already
> in place--with the option to "be smarter" if you know how to be. :)
> >>>>
> >>>> I'm all for having Turtle as an *option* and (if I have that) also
> having LDPatch as an *option.*
> >>>>
> >>>> However, if these become mandatory, I fear we're cutting off a large
> part of the potential integration, implementation, and consuming developers.
> >>>>
> >>>> I'd love to see annotation data as widely used as feeds were "back in
> the day."
> >>>>
> >>>> I think it's possible, but (at least right now) I think that means
> keeping the "smarter" graph stuff as optional bits and not required
> defaults--as they are in LDP.
> >>>>
> >>>> Ideally, we find a way to spec our Annotation API that is at once
> "simple" and also LDP compatible.
> >>>>
> >>>> Is that feasible?
> >>>>
> >>>> Did any of this make sense? :)
> >>>>
> >>>> Thanks for listening regardless. ;)
> >>>>
> >>>> Cheers,
> >>>> Benjamin
> >>>> --
> >>>> Developer Advocate
> >>>> http://hypothes.is/
> >>>>
> >>>>
> >>>> (Maybe there is an Abis possibility: require JSON and LDPatch? Or is
> that too much?)
> >>>>
> >>>> Ivan
> >>>>
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Rob
> >>>>>
> >>>>> --
> >>>>> Rob Sanderson
> >>>>> Information Standards Advocate
> >>>>> Digital Library Systems and Services
> >>>>> Stanford, CA 94305
> >>>>
> >>>>
> >>>> ----
> >>>> Ivan Herman, W3C
> >>>> Digital Publishing Activity Lead
> >>>> Home: http://www.w3.org/People/Ivan/
> >>>> mobile: +31-641044153
> >>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> ----
> >>> Ivan Herman, W3C
> >>> Digital Publishing Activity Lead
> >>> Home: http://www.w3.org/People/Ivan/
> >>> mobile: +31-641044153
> >>> ORCID ID: http://orcid.org/0000-0003-0782-2704
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >> ----
> >> Ivan Herman, W3C
> >> Digital Publishing Activity Lead
> >> Home: http://www.w3.org/People/Ivan/
> >> mobile: +31-641044153
> >> ORCID ID: http://orcid.org/0000-0003-0782-2704
> >>
> >>
> >>
> >>
> >
> >
> >
> > --
> > Stian Soiland-Reyes, eScience Lab
> > School of Computer Science
> > The University of Manchester
> > http://soiland-reyes.com/stian/work/
> http://orcid.org/0000-0001-9842-9718
>
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704
>
>
>
>
>
Received on Wednesday, 10 June 2015 14:14:16 UTC