Re: Turtle Trouble (was: Re: [protocol] Patch formats) from Ivan Herman on 2015-06-10 (public-annotation@w3.org from June 2015)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 10 Jun 2015 15:11:56 +0200
To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Cc: Benjamin Young <bigbluehat@hypothes.is>, Robert Sanderson <azaroth42@gmail.com>, W3C Public Annotation List <public-annotation@w3.org>
Message-Id: <AF6A6D95-27B7-4425-A3D8-9EA774E0A08F@w3.org>
> On 10 Jun 2015, at 15:03 , Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
> 
> While I understand the desire to simplify server development, I don't
> see how it can be a big hurdle for a server implementation to generate
> Turtle from JSON-LD or vice versa as there is a plethora of RDF
> support for almost any practical programming language (which one have
> you got in mind?),

Stian,

I believe the issue is that this statement may not be true. I am happy to see Java is covered (I suspected that would be the case) and so are Ruby and Python. But my experience with support in JavaScript is not that good (although it would be feasible to write a server in Node.js). I would be happy to be proven wrong, though.

Ivan


> with as you are mentioning here, the option to call
> out to other binaries.
> 
> As for generating, remember that N-Triples is valid Turtle and pretty
> easy to make.  The clients don't need to deal with both formats as
> they can just pick and stick with one of them.
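
To illustrate the point about N-Triples being easy to produce, here is a minimal Python sketch using only the standard library. The names `escape_literal` and `to_ntriples`, and the `("literal", value)` tuple convention, are made up for this example. It assumes absolute, already-valid IRIs and plain string literals only; blank nodes, datatypes and language tags are left out.

```python
def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples requires escaping in a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one statement per line.

    Subjects and predicates are absolute IRIs. An object is either an IRI
    string or a ("literal", value) tuple for a plain string literal.
    """
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, tuple):
            obj_text = '"' + escape_literal(obj[1]) + '"'
        else:
            obj_text = "<" + obj + ">"
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj_text))
    return "\n".join(lines) + "\n"


# Illustrative data only; the example.org IRI is hypothetical.
print(to_ntriples([
    ("http://example.org/anno1",
     "http://purl.org/dc/terms/title",
     ("literal", "A first annotation")),
]), end="")
```

Since every N-Triples document is also a Turtle document, output like this could be served as text/turtle, although without prefixes it is not very readable.
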
> 
> 
> 
> 
> Jena includes the "riot" command line tool which can be used
> independently for say Turtle to JSON-LD or JSON-LD to Turtle, e.g.:
> 
> stain@biggie-utopic:~/Downloads$ riot --output=jsonld void.ttl.gz
> {
>  "@graph" : [ {
>    "@id" : "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#",
>    "@type" : "void:DatasetDescription",
>    "description" : "This is the VoID description for a ChEMBL-RDF dataset",
>    "issued" : "2015-01-14T00:00:00.000Z",
>    "title" : "ChEMBL-RDF VoID Description",
>    "createdBy" : "http://orcid.org/0000-0002-8011-0300",
>    "createdOn" : "2009-10-28T00:00:00.000Z",
>    "lastUpdateOn" : "2015-01-14T00:00:00.000Z",
>    "previousVersion" : "http://rdf.ebi.ac.uk/dataset/chembl/19.0/void.ttl#",
>    "primaryTopic" : ":chembl_rdf_dataset"
>  }, {
> ...
> 
> The auto-generated @context and @prefix are usually quite nice and
> make for readable JSON-LD, except that there is a bug in that it permits
> "" as a namespace prefix.  (fixed for the next release -
> https://issues.apache.org/jira/browse/JENA-934)
> 
> 
> You can combine this with the JSON-LD Java tool "jsonldplayground" to
> force a custom context or frame.
> https://github.com/jsonld-java/jsonld-java
> 
> It can also read/write Turtle directly - but you would have to
> manually provide a --context to style it.
> 
> stain@biggie-utopic:~/src/jsonld-java$ ./jsonldplayground
> Missing required option(s) [inputFile, process]
> Option (* = required)                  Description
> ---------------------                  -----------
> --base <base URI>                      (default: )
> --context <File: The context>
> --format [RDFFormat: The output file
>  format to use. Defaults to nquads.
>  Valid values are: [turtle, rdfjson,
>  rdfxml, trig, nquads, jsonld, trix,
>  ntriples]]
> --help
> * --inputFile <File: The input file>
> --outputForm [The way to output the    (default: expanded)
>  results from fromRDF. Defaults to
>  expanded. Valid values are:
>  [compacted, expanded, flattened]]
> * --process <The processing to
>  perform. Valid values are: [expand,
>  compact, frame, normalize, flatten,
>  fromrdf, tordf]>
> 
> On 10 June 2015 at 06:01, Ivan Herman <ivan@w3.org> wrote:
>> The answer I have got is less upbeat than what I hoped for. Although there are tools that either exist or can be written, the quality, mainly in terms of human readability, is not equal. It seems that Gregg's Ruby tool is the only one that tries to create a reasonable default context using prefixes defined in Turtle (or other serialization).
>> 
>> In general, there aren't any purpose-built Turtle-to-JSON-LD libraries, just as you don’t find RDFa-to-Turtle libraries; it’s usually done as part of a system which includes multiple components. This is the case for the solution I have outlined: RDFLib is a large (Python) RDF library; it has a Turtle and a JSON-LD parser and serializer, i.e., it is possible to use it as a transformer. I would suspect (but I am not sure) that Jena has something like that for Java, for example.
>> 
>> However… coming back to the original issue, and also reflecting on Robert Casties, we may have to think (maybe together with the LDP group) whether it is acceptable to relax the requirements somehow, and turn the Turtle version into an optional feature.
>> 
>> Ivan
>> 
>> 
>> [[
>> Mine is the only one I’m aware of that tries to create a reasonable default context using prefixes defined in Turtle (or other serialization). As you know, the algorithm doesn’t describe a way to construct a context automatically, but neither do any other RDF serializations, which focus on parsing rather than generating. Jena may do something.
>> 
>> Also note that the Linked Open Vocabularies [1] group maintain a JSON-LD context with prefix definitions for all the vocabularies they maintain, which can be used with any JSON-LD toolchain by compacting the result using one or more context URLs, but won’t get vocabulary-specific term definitions. Schema’s can be used for schema.org, which is probably the best recommendation there. There’s a list of other supported contexts here [2].
>> 
>> Note that you won’t find many purpose-built Turtle-to-JSON-LD libraries, just as you don’t find RDFa-to-Turtle libraries, it’s usually done as part of a system which includes multiple components. In my case, it’s possible to gather prefix definitions when parsing and forward them to the serializer, which is what the distiller does. Other libraries may provide a similar facility.
>> 
>> In the case of an Annotation API server, I suspect they’re using a particular vocabulary, or at least control what they use, so they’re probably in the best position to create a context to apply to the JSON-LD serializer, just as they likely manage the prefixes used in the Turtle serialization; this will allow more control of the way the JSON-LD is shaped, as would using it as a JSON-LD Frame, and the Framing algorithm. I also have a tool to construct a context given an RDFS/OWL vocabulary [3].
>> 
>> Probably best to suggest he ask on #jsonld, StackOverflow or public-linked-json@w3.org for other input.
>> 
>> 
>> 
>>> On 08 Jun 2015, at 20:21 , Benjamin Young <bigbluehat@hypothes.is> wrote:
>>> 
>>> On Mon, Jun 8, 2015 at 1:01 PM, Ivan Herman <ivan@w3.org> wrote:
>>> 
>>>> On 08 Jun 2015, at 16:21 , Benjamin Young <bigbluehat@hypothes.is> wrote:
>>>> 
>>>> This email's been in my head for awhile (too long probably) and this patch thread tipped it to the point of my fingers typing what's in my brain. :) So here goes...
>>>> 
>>>> First (and foremost, maybe), I actually really like Turtle. :)
>>>> 
>>>> However....it requires a way of thinking that the prevailing systems (non-graph databases, browsers, JS runtimes, etc) don't currently think in.
>>>> 
>>>> We've addressed that in the data model by preferring / promoting the JSON-LD representation in examples--while still providing the Turtle representation for those that support it.
>>>> 
>>>> Where things fall down for me (at least) are with the protocol specification--which, being based on LDP (which is otherwise quite fabulous) comes with the requirement that:
>>>> http://www.w3.org/TR/ldp/#h4_ldprs-HTTP_GET
>>>>> ...MUST respond with a Turtle representation...when the request includes an Accept header specifying text/turtle
>>>>> ...SHOULD respond with a text/turtle...whenever the Accept request header is absent.
>>>>> ...MUST respond with a application/ld+json representation...when the request includes an Accept header specifying application/ld+json
>>>> 
>>>> What this means practically (afaik) is that an Annotation API server MUST be able to transform their stored info into both Turtle and JSON-LD (regardless of which was sent in).
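
The three LDP rules quoted above can be sketched as a small content-negotiation function. This is a simplified Python illustration, not the LDP algorithm: `parse_accept` and `choose_representation` are names invented here, an empty Accept header is treated like an absent one, wildcards are mapped to a format by assumption, and the media-range specificity rules of HTTP are ignored.

```python
TURTLE = "text/turtle"
JSON_LD = "application/ld+json"
SUPPORTED = (TURTLE, JSON_LD)


def parse_accept(header: str):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header=None):
    """Pick the response media type, or None if nothing acceptable is on offer."""
    if not accept_header or not accept_header.strip():
        return TURTLE  # LDP: SHOULD answer with Turtle when Accept is absent
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type in ("*/*", "text/*"):
            return TURTLE  # assumption: wildcards fall back to Turtle
        if media_type == "application/*":
            return JSON_LD
    return None
```

A server would call `choose_representation(request_headers.get("Accept"))` (with a hypothetical `request_headers` mapping) and answer 406 Not Acceptable when it returns None. Whichever format it picks, the server still has to be able to produce both, which is the burden described here.
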
>>>> 
>>>> There aren't (that I've found) terribly many Turtle-to-JSON-LD transformation libraries. I've used this one (recently relicensed to Apache License 2.0) with varied success:
>>>> https://github.com/warpr/turtle-to-jsonld
>>> 
>>> I will ask around. I think if we allow for non-JavaScript converters, too, then there are more. I know there is a JSON-LD module for RDFLib, so it is fairly easy to write a Python program to convert from one format to the other. Gregg Kellogg has a similar tool for Ruby. I think both are fairly good; the JSON-LD part was written by people from the JSON-LD group itself. I may get more info (from Gregg).
>>> 
>>> (Note that I am mostly offline tomorrow, so the info may come on Wednesday only)
>>> 
>>> Thanks, Ivan!
>>> 
>>> Sorry if I misrepresented the "coverage area" of the tooling. The focus on JS was mostly browser-driven--which is where I currently expect most annotation clients live. Server-side stuff is more amenable of course. :)
>>> 
>>> Thanks in advance for the links. It will be a useful list to reference whatever else we decide.
>>> 
>>> Cheers!
>>> Benjamin
>>> 
>>> 
>>> ivan
>>> 
>>>> 
>>>> However, that (plus its dependencies) provides a transformation to a JSON-LD format that may actually not be what one wants in the end, and then requires yet-more transformation and more understanding of the "meta model" by the API server (and/or database) to move between the formats.
>>>> 
>>>> Here's where this hits the PATCH format options....
>>>> 
>>>> On Sat, Jun 6, 2015 at 12:57 AM, Ivan Herman <ivan@w3.org> wrote:
>>>> 
>>>>> On 05 Jun 2015, at 20:51 , Robert Sanderson <azaroth42@gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> In reading back through the discussion at the face to face about the protocol draft, it was noted that there are many possible patch formats, including LDPatch, JSON Patch, Sparql Update, diff and so on.  All would be possible to use, and some easier in different circumstances.
>>>>> 
>>>>> Do we want to:
>>>>> 
>>>>> a)  Specify one as a requirement (MUST) and let the others be usable (MAY)
>>>>> b)  Not specify any as a requirement and just remain silent on which one to use.
>>>>> 
>>>>> If B is the preference, then we would need to decide how the server advertises which of the PATCH formats it implements so that clients can determine how (if at all) they can interact.
>>>>> 
>>>>> My preference is A, and to pick LDPatch (by reference) as part of the LDP stable of specifications, but what do people think?
>>>> 
>>>> My preference is also A, although the issue of advertising may still be relevant. ('may'. We may decide not to address this issue.)
>>>> 
>>>> If the database or API server you are building supports a graph-based "meta model" then supporting the transformation between Turtle and JSON-LD or LDPatch or anything else triple-based is "just some more code." :)
>>>> 
>>>> However, if your database does not (most databases don't...even if they "speak" JSON), then handling LDPatch is an even farther reach than supporting Turtle.
>>>> 
>>>> Here's an LDPatch example for those who are curious:
>>>> http://www.w3.org/TR/ldpatch/#full-example
>>>> 
>>>> 
>>>>> Benjamin suggested at the F2F a preference for JSON Patch, for example.
>>>> 
>>>> Good memory, Rob! :)
>>>> 
>>>> The preference is completely along these lines:
>>>> - most available tooling supports JSON
>>>> - "understanding" the `-LD` bit of JSON-LD is at some level "optional" (at least for storage, basic parsing, and transportation)
>>>> - if a PATCH format is chosen, it should be equally "dumb" (in the best possible way) ;)
>>>> 
>>>> Because:
>>>> - if developers can deal with annotations as JSON (+/- the `-LD` knowhow), they can start using annotation data now with very little additional effort added to their stack
>>>> - if developers *want* to use PATCH, having the option of a patch format equally as "dumb" (just dealing with keys and values, not triples), means (again) that they can start with (nearly) what they already know and have.
>>>> 
>>>> Here's the list of examples from the JSON Patch (RFC 6902) spec:
>>>> http://tools.ietf.org/html/rfc6902#appendix-A
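
To show what "just dealing with keys and values" looks like in practice, here is a minimal Python sketch that applies a subset of JSON Patch to an annotation held as plain JSON. It is an illustration under assumptions, not a complete RFC 6902 implementation: only "add", "replace" and "remove" are handled ("move", "copy" and "test" are omitted), patching the document root is rejected, and the function name `apply_patch` and the sample annotation are invented for the example.

```python
import copy


def _decode(token: str) -> str:
    """Undo JSON Pointer escaping (RFC 6901): ~1 is '/', ~0 is '~'."""
    return token.replace("~1", "/").replace("~0", "~")


def apply_patch(document, operations):
    """Apply "add", "replace" and "remove" operations to a copy of document."""
    result = copy.deepcopy(document)
    for op in operations:
        tokens = [_decode(t) for t in op["path"].split("/")[1:]]
        if not tokens:
            raise ValueError("patching the document root is not supported here")
        # Walk down to the container that holds the target member.
        parent = result
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        key = tokens[-1]
        kind = op["op"]
        if isinstance(parent, list):
            if kind == "add":
                index = len(parent) if key == "-" else int(key)
                parent.insert(index, op["value"])
            elif kind == "replace":
                parent[int(key)] = op["value"]
            elif kind == "remove":
                del parent[int(key)]
            else:
                raise ValueError("unsupported op: " + kind)
        else:
            if kind == "add":
                parent[key] = op["value"]
            elif kind == "replace":
                if key not in parent:
                    raise KeyError(key)
                parent[key] = op["value"]
            elif kind == "remove":
                del parent[key]
            else:
                raise ValueError("unsupported op: " + kind)
    return result


# Hypothetical annotation, treated as plain JSON with no "-LD" processing.
annotation = {"@type": "oa:Annotation", "body": "old comment", "tags": ["draft"]}
patch = [
    {"op": "replace", "path": "/body", "value": "new comment"},
    {"op": "add", "path": "/tags/-", "value": "reviewed"},
    {"op": "remove", "path": "/tags/0"},
]
patched = apply_patch(annotation, patch)
```

Nothing here needs to know about triples or contexts, which is the appeal. The flip side is that such a patch depends on the exact JSON shape of the document, so two JSON-LD serializations of the same graph would need different patches.
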
>>>> 
>>>> 
>>>> I do not have a strong feeling on whether it is JSON Patch or LDPatch; I am not really familiar with the details and certainly have no experience. I have a slight preference for JSON, however; as far as I can see, LDPatch is based on a Turtle syntax, and we did make a decision to put JSON-LD forward as our primary syntax in the model (in view of our constituency). In this respect JSON Patch seems to be more in line with the rest.
>>>> 
>>>> I suppose it comes down to "cutting with the grain" of what's already in place--with the option to "be smarter" if you know how to be. :)
>>>> 
>>>> I'm all for having Turtle as an *option* and (if I have that) also having LDPatch as an *option.*
>>>> 
>>>> However, if these become mandatory, I fear we're cutting off a large part of the potential integration, implementation, and consuming developers.
>>>> 
>>>> I'd love to see annotation data as widely used as feeds were "back in the day."
>>>> 
>>>> I think it's possible, but (at least right now) I think that means keeping the "smarter" graph stuff as optional bits and not required defaults--as they are in LDP.
>>>> 
>>>> Ideally, we find a way to spec our Annotation API that is at once "simple" and also LDP compatible.
>>>> 
>>>> Is that feasible?
>>>> 
>>>> Did any of this make sense? :)
>>>> 
>>>> Thanks for listening regardless. ;)
>>>> 
>>>> Cheers,
>>>> Benjamin
>>>> --
>>>> Developer Advocate
>>>> http://hypothes.is/
>>>> 
>>>> 
>>>> (Maybe there is an "A-bis" possibility: require JSON and LDPatch? Or is that too much?)
>>>> 
>>>> Ivan
>>>> 
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> Rob
>>>>> 
>>>>> --
>>>>> Rob Sanderson
>>>>> Information Standards Advocate
>>>>> Digital Library Systems and Services
>>>>> Stanford, CA 94305
>>>> 
>>>> 
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> --
> Stian Soiland-Reyes, eScience Lab
> School of Computer Science
> The University of Manchester
> http://soiland-reyes.com/stian/work/    http://orcid.org/0000-0001-9842-9718


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Wednesday, 10 June 2015 13:12:10 UTC