Re: Turtle Trouble from Robert Casties on 2015-06-10 (public-annotation@w3.org from June 2015)

From: Robert Casties <casties@mpiwg-berlin.mpg.de>
Date: Wed, 10 Jun 2015 17:49:59 +0200
To: public-annotation@w3.org
Message-ID: <55785CA7.90209@mpiwg-berlin.mpg.de>
On 10.06.15 16:47, Stian Soiland-Reyes wrote:
> Thanks, I understand your reasoning now.. having worked on a
> standalone annotation server my mind was tainted.
> 
> So your idea is that any odd web developer should be able to tack on
> the Annotation API and just store and provide some JSON, quite simply,
> without needing to know much about RDF. We should certainly support
> that. Perhaps a middle ground, given that LDP is now a Specification,
> would be to HTTP redirect to an online JSON-LD/Turtle translator web
> service on Accept: text/turtle etc. We can hint at that in the
> Annotation Protocol spec or tutorial, without linking to any service
> in particular.

Yes, that is a nice summary from my point of view as well. Such a
translation library or service into Turtle/LDP would also be very welcome.
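
A minimal sketch of the redirect idea quoted above, assuming a server that
only stores JSON-LD. The translator URL, its query parameters, and the choice
of a 303 status are all hypothetical; the thread names no service and no
status code.

```python
from urllib.parse import quote

# Hypothetical translator endpoint; no particular service is implied.
TRANSLATOR = "https://translator.example/convert"


def negotiate(accept_header, resource_url):
    """Return (status, headers) for a GET on a JSON-LD-only annotation store."""
    # Collect the requested media types; q-values are ignored for simplicity.
    accepted = [part.split(";")[0].strip().lower()
                for part in (accept_header or "").split(",") if part.strip()]
    if "text/turtle" in accepted and "application/ld+json" not in accepted:
        # Hand the Turtle request over to the external translator.
        target = f"{TRANSLATOR}?to=turtle&source={quote(resource_url, safe='')}"
        return 303, {"Location": target}
    # Everything else gets the stored JSON-LD as-is.
    return 200, {"Content-Type": "application/ld+json"}
```

Note that this sketch serves JSON-LD when no Accept header is sent, whereas
LDP says a server SHOULD answer with text/turtle in that case.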

Thanks
 Robert

> On 10 June 2015 at 15:13, Benjamin Young <bigbluehat@hypothes.is> wrote:
>> On Wed, Jun 10, 2015 at 9:11 AM, Ivan Herman <ivan@w3.org> wrote:
>>>
>>>
>>>> On 10 Jun 2015, at 15:03 , Stian Soiland-Reyes
>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>
>>>> While I understand the desire to simplify server development, I don't
>>>> see how it can be a big hurdle for a server implementation to generate
>>>> Turtle from JSON-LD or vice versa as there is a plethora of RDF
>>>> support for almost any practical programming language (which one have
>>>> you got in mind?),
>>>
>>> Stian,
>>>
>>> I believe the issue is that this statement may not be true. I am happy to
>>> see Java is covered (I suspected that would be the case) and so is Ruby or
>>> Python. But my experience with support in Javascript is not that good
>>> (although it would be feasible to write a server in node.js). I am happy if
>>> I am proven wrong, though.
>>
>>
>>
>> The core point is less about "are there tools / libraries available" and
>> more about "how hard is it for developers to build a server or client."
>>
>> Right now, it looks like if they're building a server they'll need to (at
>> least):
>> a) know what Turtle is
>> b) find a tool for their language to transform it into something they can
>> store
>> c) know how to transform it back to Turtle (for those who ask)
>> d) know if they've done any of that correctly (which assumes they understand
>> graphs, transformations, etc)
>>
>> For folks building a client, it's less complex:
>> a) know how to send a proper Accept header
>> b) know that JSON-LD can be treated as "just JSON"
>> c) know how to follow links found in HTTP headers
>>
>> The client ones are hopefully pretty painless for anyone who's done "AJAX"
>> in the last half decade. ;)
>>
>> The server ones, though, likely don't map to "most" (...I've not got a ruler
>> handy...) developers--especially those who are not working (and/or have not
>> worked) with anything that thinks in graphs.
>>
>> In the NoSQL world (for one place), there are *loads* of databases that
>> speak JSON on the wire and can store JSON-LD without any additional setup or
>> work (Apache CouchDB, Basho's Riak, and MongoDB among them). CouchDB (at
>> least) also nearly has matching semantics to LDP, and were it not for the
>> Turtle requirement could be very simply wrapped to accept JSON-LD from an
>> LDP client, store it, send it back when asked, and generate a container
>> listing.
>>
>> I started down such a road--building a CouchApp that lives inside CouchDB
>> with its in-database JS engine (based on SpiderMonkey).
>> https://github.com/BigBlueHat/ldp-on-couchdb
>>
>> All was well, until I hit the Turtle requirement, and then I got sucked in
>> the undertow of data transformations. :-/
>>
>> I intend to revisit that project soon--ignoring the Turtle requirement for
>> now--and see how far I get. It won't be an LDP server (...so its name will
>> eventually change...), but it will likely be a nearly matching server that
>> would work for an Annotation API, "cost" developers little in terms of
>> know-how to see what it's doing ("JSON goes in; JSON comes out"), and still
>> be kind-a-sort-a close to the LDP spec (or at least as close as I can get
>> it). :)
>>
>> Building LDP-based Annotation API servers on any of these other JSON stores
>> will be similar. If there's a Turtle requirement, the implementer will have
>> to a) care and b) know how to Do It Right (...both directions). I'm not sure
>> that's most developers...
>>
>> Making Turtle optional would solve that problem (afaik). Perhaps the
>> Annotation API looks like a limited sub-set of LDP. Perhaps it looks like an
>> API that copied all the easy answers out of LDP's textbook.
>>
>> Regardless, I do feel there's a good way forward, and that this group will
>> find it. :)
>>
>> Thanks!
>> Benjamin
>> --
>> Developer Advocate
>> http://hypothes.is/
>>
>>
>>>
>>>
>>> Ivan
>>>
>>>
>>>> with as you are mentioning here, the option to call
>>>> out to other binaries.
>>>>
>>>> As for generating, remember that N-Triples is valid Turtle and pretty
>>>> easy to make.  The clients don't need to deal with both formats as
>>>> they can just pick and stick with one of them.
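
To illustrate the point above that N-Triples is easy to generate, here is a
rough sketch. It assumes a flat mapping whose keys are already full predicate
IRIs, treats URL-looking strings as IRIs, and ignores blank nodes, language
tags and datatypes; the function name is made up for this example.

```python
import json


def to_ntriples(subject, props):
    """Serialize a flat {predicate IRI: value} mapping as N-Triples lines."""
    lines = []
    for predicate, value in props.items():
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            # Simplification: any URL-looking string becomes an IRI object.
            obj = f"<{value}>"
        else:
            # JSON string escaping (\" \\ \n \t ...) is close enough to
            # N-Triples literal escaping for this sketch.
            obj = json.dumps(str(value), ensure_ascii=False)
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)
```

Since every N-Triples document is also valid Turtle, output like this could
be served as text/turtle without a full Turtle serializer.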
>>>>
>>>>
>>>>
>>>>
>>>> Jena includes the "riot" command line tool which can be used
>>>> independently for say Turtle to JSON-LD or JSON-LD to Turtle, e.g.:
>>>>
>>>> stain@biggie-utopic:~/Downloads$ riot --output=jsonld void.ttl.gz
>>>> {
>>>>  "@graph" : [ {
>>>>    "@id" : "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#",
>>>>    "@type" : "void:DatasetDescription",
>>>>    "description" : "This is the VoID description for a ChEMBL-RDF
>>>> dataset",
>>>>    "issued" : "2015-01-14T00:00:00.000Z",
>>>>    "title" : "ChEMBL-RDF VoID Description",
>>>>    "createdBy" : "http://orcid.org/0000-0002-8011-0300",
>>>>    "createdOn" : "2009-10-28T00:00:00.000Z",
>>>>    "lastUpdateOn" : "2015-01-14T00:00:00.000Z",
>>>>    "previousVersion" :
>>>> "http://rdf.ebi.ac.uk/dataset/chembl/19.0/void.ttl#",
>>>>    "primaryTopic" : ":chembl_rdf_dataset"
>>>>  }, {
>>>> ...
>>>>
>>>> The auto-generated @context and @prefix are usually quite nice and
>>>> make for readable JSON-LD, except that there is a bug in that it permits
>>>> "" as a namespace prefix.  (fixed for the next release -
>>>> https://issues.apache.org/jira/browse/JENA-934)
>>>>
>>>>
>>>> You can combine this with the JSON-LD Java tool "jsonldplayground" to
>>>> force a custom context or frame.
>>>> https://github.com/jsonld-java/jsonld-java
>>>>
>>>> It can also read/write Turtle directly - but you would have to
>>>> manually provide a --context to style it.
>>>>
>>>> stain@biggie-utopic:~/src/jsonld-java$ ./jsonldplayground
>>>> Missing required option(s) [inputFile, process]
>>>> Option (* = required)                  Description
>>>> ---------------------                  -----------
>>>> --base <base URI>                      (default: )
>>>> --context <File: The context>
>>>> --format [RDFFormat: The output file
>>>>  format to use. Defaults to nquads.
>>>>  Valid values are: [turtle, rdfjson,
>>>>  rdfxml, trig, nquads, jsonld, trix,
>>>>  ntriples]]
>>>> --help
>>>> * --inputFile <File: The input file>
>>>> --outputForm [The way to output the    (default: expanded)
>>>>  results from fromRDF. Defaults to
>>>>  expanded. Valid values are:
>>>>  [compacted, expanded, flattened]]
>>>> * --process <The processing to
>>>>  perform. Valid values are: [expand,
>>>>  compact, frame, normalize, flatten,
>>>>  fromrdf, tordf]>
>>>>
>>>> On 10 June 2015 at 06:01, Ivan Herman <ivan@w3.org> wrote:
>>>>> The answer I have got is less upbeat than what I hoped for. Although
>>>>> tools either exist or can be built, their quality, mainly in
>>>>> terms of human readability, is uneven. It seems that Gregg's Ruby library is the
>>>>> only one that tries to create a reasonable default context using prefixes
>>>>> defined in Turtle (or other serialization).
>>>>>
>>>>> In general, there aren't any purpose-built Turtle-to-JSON-LD libraries,
>>>>> just as you don’t find RDFa-to-Turtle libraries, it’s usually done as part
>>>>> of a system which includes multiple components. This is the case of the
>>>>> solution I have outlined: RDFLib is a large (Python) RDF library; it has a
>>>>> Turtle and a JSON-LD parser and serializer, ie, it is possible to use it as
>>>>> a transformer. I would suspect (but I am not sure) that Jena has something
>>>>> like that for Java, for example.
>>>>>
>>>>> However… coming back to the original issue, and also reflecting on
>>>>> Robert Casties' comments, we may have to think (maybe together with the LDP group)
>>>>> whether it is acceptable to relax the requirements somehow, and turn the
>>>>> Turtle version into an optional feature.
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>> [[
>>>>> Mine is the only one I’m aware of that tries to create a reasonable
>>>>> default context using prefixes defined in Turtle (or other serialization).
>>>>> As you know, the algorithm doesn’t describe a way to construct a context
>>>>> automatically, but neither do any other RDF serializations, which focus on
>>>>> parsing rather than generating. Jena may do something.
>>>>>
>>>>> Also note that the Linked Open Vocabularies [1] group maintain a
>>>>> JSON-LD context with prefix definitions for all the vocabularies they
>>>>> maintain, which can be used with any JSON-LD toolchain by compacting the
>>>>> result using one or more context URLs, but won’t get vocabulary-specific
>>>>> term definitions. Schema’s can be used for schema.org, which is probably the
>>>>> best recommendation there. There’s a list of other supported context here
>>>>> [2].
>>>>>
>>>>> Note that you won’t find many purpose-built Turtle-to-JSON-LD
>>>>> libraries, just as you don’t find RDFa-to-Turtle libraries, it’s usually
>>>>> done as part of a system which includes multiple components. In my case,
>>>>> it’s possible to gather prefix definitions when parsing and forward them to
>>>>> the serializer, which is what the distiller does. Other libraries may
>>>>> provide a similar facility.
>>>>>
>>>>> In the case of an Annotation API server, I suspect they’re using a
>>>>> particular vocabulary, or at least control what they use, so they’re
>>>>> probably in the best position to create a context to apply to the JSON-LD
>>>>> serializer, just as they likely manage the prefixes used in the Turtle
>>>>> serialization; this will allow more control of the way the JSON-LD is
>>>>> shaped, as would using it as a JSON-LD Frame, and the Framing algorithm. I
>>>>> also have a tool to construct a context given an RDFS/OWL vocabulary [3].
>>>>>
>>>>> Probably best to suggest he ask on #jsonld, StackOverflow or
>>>>> public-linked-json@w3.org for other input.
>>>>>
>>>>>
>>>>>
>>>>>> On 08 Jun 2015, at 20:21 , Benjamin Young <bigbluehat@hypothes.is>
>>>>>> wrote:
>>>>>>
>>>>>> On Mon, Jun 8, 2015 at 1:01 PM, Ivan Herman <ivan@w3.org> wrote:
>>>>>>
>>>>>>> On 08 Jun 2015, at 16:21 , Benjamin Young <bigbluehat@hypothes.is>
>>>>>>> wrote:
>>>>>>>
>>>>>>> This email's been in my head for a while (too long probably) and this
>>>>>>> patch thread tipped it to the point of my fingers typing what's in my brain.
>>>>>>> :) So here goes...
>>>>>>>
>>>>>>> First (and foremost, maybe), I actually really like Turtle. :)
>>>>>>>
>>>>>>> However....it requires a way of thinking that the prevailing systems
>>>>>>> (non-graph databases, browsers, JS runtimes, etc) don't currently think in.
>>>>>>>
>>>>>>> We've addressed that in the data model by preferring / promoting the
>>>>>>> JSON-LD representation in examples--while still providing the Turtle
>>>>>>> representation for those that support it.
>>>>>>>
>>>>>>> Where things fall down for me (at least) are with the protocol
>>>>>>> specification--which, being based on LDP (which is otherwise quite fabulous)
>>>>>>> comes with the requirement that:
>>>>>>> http://www.w3.org/TR/ldp/#h4_ldprs-HTTP_GET
>>>>>>>> ...MUST respond with a Turtle representation...when the request
>>>>>>>> includes an Accept header specifying text/turtle
>>>>>>>> ...SHOULD respond with a text/turtle...whenever the Accept request
>>>>>>>> header is absent.
>>>>>>>> ...MUST respond with a application/ld+json representation...when the
>>>>>>>> request includes an Accept header specifying application/ld+json
>>>>>>>
>>>>>>> What this means practically (afaik) is that an Annotation API server
>>>>>>> MUST be able to transform their stored info into both Turtle and JSON-LD
>>>>>>> (regardless of which was sent in).
>>>>>>>
>>>>>>> There aren't (that I've found) terribly many Turtle-to-JSON-LD
>>>>>>> transformation libraries. I've used this one (recently relicensed to Apache
>>>>>>> License 2.0) with varied success:
>>>>>>> https://github.com/warpr/turtle-to-jsonld
>>>>>>
>>>>>> I will ask around. I think if we allow for non-Javascript converters,
>>>>>> too, then there are more. I know there is a JSON-LD module to RDFLib, so it
>>>>>> is fairly easy to write a Python program to convert from one format to the
>>>>>> other. Gregg Kellogg has a similar tool for Ruby. I think both are fairly
>>>>>> good; the JSON-LD part was written by people from the JSON-LD group itself.
>>>>>> I may get more info (from Gregg)
>>>>>>
>>>>>> (Note that I am mostly offline tomorrow, so the info may come on
>>>>>> Wednesday only)
>>>>>>
>>>>>> Thanks, Ivan!
>>>>>>
>>>>>> Sorry if I misrepresented the "coverage area" of the tooling. The
>>>>>> focus on JS was mostly browser-driven--which is where I currently expect
>>>>>> most annotation clients to live. Server-side stuff is more amenable of course.
>>>>>> :)
>>>>>>
>>>>>> Thanks in advance for the links. It will be a useful list to reference
>>>>>> whatever else we decide.
>>>>>>
>>>>>> Cheers!
>>>>>> Benjamin
>>>>>>
>>>>>>
>>>>>> ivan
>>>>>>
>>>>>>>
>>>>>>> However, that (plus its dependencies) provides a transformation to a
>>>>>>> JSON-LD format that may actually not be what one wants in the end, and then
>>>>>>> requires yet-more transformation and more understanding of the "meta model"
>>>>>>> by the API server (and/or database) to move between the formats.
>>>>>>>
>>>>>>> Here's where this hits the PATCH format options....
>>>>>>>
>>>>>>> On Sat, Jun 6, 2015 at 12:57 AM, Ivan Herman <ivan@w3.org> wrote:
>>>>>>>
>>>>>>>> On 05 Jun 2015, at 20:51 , Robert Sanderson <azaroth42@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> In reading back through the discussion at the face to face about the
>>>>>>>> protocol draft, it was noted that there are many possible patch formats,
>>>>>>>> including LDPatch, JSON Patch, Sparql Update, diff and so on.  All would be
>>>>>>>> possible to use, and some easier in different circumstances.
>>>>>>>>
>>>>>>>> Do we want to:
>>>>>>>>
>>>>>>>> a)  Specify one as a requirement (MUST) and let the others be usable
>>>>>>>> (MAY)
>>>>>>>> b)  Not specify any as a requirement and just remain silent on which
>>>>>>>> one to use.
>>>>>>>>
>>>>>>>> If B is the preference, then we would need to decide how the server
>>>>>>>> advertises which of the PATCH formats it implements so that clients can
>>>>>>>> determine how (if at all) they can interact.
>>>>>>>>
>>>>>>>> My preference is A, and to pick LDPatch (by reference) as part of
>>>>>>>> the LDP stable of specifications, but what do people think?
>>>>>>>
>>>>>>> My preference is also A, although the issue of advertising may still
>>>>>>> be relevant. ('may'. We may decide not to address this issue.)
>>>>>>>
>>>>>>> If the database or API server you are building supports a graph-based
>>>>>>> "meta model" then supporting the transformation between Turtle and JSON-LD
>>>>>>> or LDPatch or anything else triple-based is "just some more code." :)
>>>>>>>
>>>>>>> However, if your database does not (most databases don't...even if
>>>>>>> they "speak" JSON), then handling LDPatch is an even farther reach than
>>>>>>> supporting Turtle.
>>>>>>>
>>>>>>> Here's an LDPatch example for those who are curious:
>>>>>>> http://www.w3.org/TR/ldpatch/#full-example
>>>>>>>
>>>>>>>
>>>>>>>> Benjamin suggested at the F2F a preference for JSON Patch, for
>>>>>>>> example.
>>>>>>>
>>>>>>> Good memory, Rob! :)
>>>>>>>
>>>>>>> The preference is completely along these lines:
>>>>>>> - most available tooling supports JSON
>>>>>>> - "understanding" the `-LD` bit of JSON-LD is at some level
>>>>>>> "optional" (at least for storage, basic parsing, and transportation)
>>>>>>> - if a PATCH format is chosen, it should be equally "dumb" (in the
>>>>>>> best possible way) ;)
>>>>>>>
>>>>>>> Because:
>>>>>>> - if developers can deal with annotations as JSON (+/- the `-LD`
>>>>>>> knowhow), they can start using annotation data now with very little
>>>>>>> additional effort added to their stack
>>>>>>> - if developers *want* to use PATCH, having the option of a patch
>>>>>>> format equally as "dumb" (just dealing with keys and values, not triples),
>>>>>>> means (again) that they can start with (nearly) what they already know and
>>>>>>> have.
>>>>>>>
>>>>>>> Here's the list of examples from the JSON Patch (RFC 6902) spec:
>>>>>>> http://tools.ietf.org/html/rfc6902#appendix-A
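
As a rough illustration of the "keys and values, not triples" point, the
sketch below applies a small subset of JSON Patch (only add, replace and
remove on object members) to a plain dict. It is not a complete RFC 6902
implementation: arrays, the whole-document path "", and the move, copy and
test operations are left out, and the function name is invented here.

```python
import copy


def apply_patch(doc, patch):
    """Apply add/replace/remove operations to nested dicts; returns a new dict."""
    result = copy.deepcopy(doc)
    for op in patch:
        # JSON Pointer: split on "/", then unescape ~1 -> "/" and ~0 -> "~".
        keys = [k.replace("~1", "/").replace("~0", "~")
                for k in op["path"].split("/")[1:]]
        parent = result
        for key in keys[:-1]:
            parent = parent[key]
        last = keys[-1]
        if op["op"] == "remove":
            del parent[last]
        elif op["op"] == "replace":
            if last not in parent:
                raise KeyError(last)  # replace requires an existing member
            parent[last] = op["value"]
        elif op["op"] == "add":
            parent[last] = op["value"]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return result
```

Nothing here needs to know that the document is JSON-LD; the patch only
addresses members by path.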
>>>>>>>
>>>>>>>
>>>>>>> I do not have a strong feeling on whether it is JSON Patch or
>>>>>>> LDPatch, not really familiar with the details and certainly no experience.
>>>>>>> I have a slight preference for JSON, however; as far as I can see, LDPatch
>>>>>>> is based on a turtle syntax, and we did make a decision to put JSON-LD
>>>>>>> forward as our primary syntax in the model (in view of our constituency). In
>>>>>>> this respect JSON patch seems to be more in line with the rest.
>>>>>>>
>>>>>>> I suppose it comes down to "cutting with the grain" of what's already
>>>>>>> in place--with the option to "be smarter" if you know how to be. :)
>>>>>>>
>>>>>>> I'm all for having Turtle as an *option* and (if I have that) also
>>>>>>> having LDPatch as an *option.*
>>>>>>>
>>>>>>> However, if these become mandatory, I fear we're cutting off a large
>>>>>>> part of the potential integration, implementation, and consuming developers.
>>>>>>>
>>>>>>> I'd love to see annotation data as widely used as feeds were "back in
>>>>>>> the day."
>>>>>>>
>>>>>>> I think it's possible, but (at least right now) I think that means
>>>>>>> keeping the "smarter" graph stuff as optional bits and not required
>>>>>>> defaults--as they are in LDP.
>>>>>>>
>>>>>>> Ideally, we find a way to spec our Annotation API that is at once
>>>>>>> "simple" and also LDP compatible.
>>>>>>>
>>>>>>> Is that feasible?
>>>>>>>
>>>>>>> Did any of this make sense? :)
>>>>>>>
>>>>>>> Thanks for listening regardless. ;)
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Benjamin
>>>>>>> --
>>>>>>> Developer Advocate
>>>>>>> http://hypothes.is/
>>>>>>>
>>>>>>>
>>>>>>> (Maybe there is an "A-bis" possibility: require JSON and LDPatch? Or is
>>>>>>> that too much?)
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Rob
>>>>>>>>
>>>>>>>> --
>>>>>>>> Rob Sanderson
>>>>>>>> Information Standards Advocate
>>>>>>>> Digital Library Systems and Services
>>>>>>>> Stanford, CA 94305
>>>>>>>
>>>>>>>
>>>>>>> ----
>>>>>>> Ivan Herman, W3C
>>>>>>> Digital Publishing Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Stian Soiland-Reyes, eScience Lab
>>>> School of Computer Science
>>>> The University of Manchester
>>>> http://soiland-reyes.com/stian/work/
>>>> http://orcid.org/0000-0001-9842-9718
>>>
>>>
>>
> 
> 
> 


-- 
Dr. Robert Casties -- Information Technology Group
Max Planck Institute for the History of Science
Boltzmannstr. 22, D-14195 Berlin
Tel: +49/30/22667-342 Fax: -299
Received on Wednesday, 10 June 2015 15:50:36 UTC