Re: Turtle Trouble from Stian Soiland-Reyes on 2015-06-10 (public-annotation@w3.org from June 2015)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Wed, 10 Jun 2015 17:10:10 +0100
To: Robert Casties <casties@mpiwg-berlin.mpg.de>
Cc: Annotation WG <public-annotation@w3.org>
Message-ID: <CAPRnXtn3cUo5JYUGOOB+7pUPNaEyKVUdZQrchbtYBZn-eCCO1Q@mail.gmail.com>
Gregg's http://rdf.greggkellogg.net/distiller is quite good - uses rdf.rb

So a quick redirect there (or an internal POST) should work, as long as
you are OK with depending on external services and exposing your data. :-)
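
For what it's worth, here is a minimal sketch of that redirect idea, in
Python and using only the standard library. The translator URL and its
"url" and "to" query parameters are made up for illustration (they are
not the distiller's actual interface), and negotiate() is a hypothetical
helper, not something from the LDP or Annotation Protocol drafts:

```python
from urllib.parse import urlencode

# Hypothetical translator endpoint; an assumption, not a real service.
TRANSLATOR = "https://translator.example.org/convert"


def negotiate(accept_header, annotation_url):
    """Return (status, headers) for a GET on an annotation stored as JSON-LD."""
    # Media types the client listed; q-values are ignored for simplicity.
    wanted = [part.split(";")[0].strip().lower()
              for part in (accept_header or "").split(",") if part.strip()]
    if "text/turtle" in wanted and "application/ld+json" not in wanted:
        # Client only understands Turtle: hand it over to the translator.
        query = urlencode({"url": annotation_url, "to": "turtle"})
        return 303, {"Location": TRANSLATOR + "?" + query, "Vary": "Accept"}
    # Otherwise serve the stored JSON untouched. Defaulting to JSON-LD
    # when Accept is absent is this sketch's choice, not LDP's.
    return 200, {"Content-Type": "application/ld+json", "Vary": "Accept"}
```

A real server would also need to honour q-values in the Accept header;
this only shows the shape of the dispatch.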



On 10 June 2015 at 16:49, Robert Casties <casties@mpiwg-berlin.mpg.de> wrote:
> On 10.06.15 16:47, Stian Soiland-Reyes wrote:
>> Thanks, I understand your reasoning now.. having worked on a
>> standalone annotation server my mind was tainted.
>>
>> So your idea is that any odd web developer can just tack on the
>> Annotation API and simply store and provide some JSON, without
>> needing to know much about RDF. We should certainly support
>> that. Perhaps a middle ground, given that LDP is now a Specification,
>> would be to HTTP redirect to an online JSON-LD/Turtle translator web
>> service on Accept: text/turtle etc. We can hint at that in the
>> Annotation Protocol spec or tutorial, without linking to any service
>> in particular.
>
> Yes, that is a nice summary from my point of view as well. Such a
> translation library or service into Turtle/LDP would also be very welcome.
>
> Thanks
>         Robert
>
>> On 10 June 2015 at 15:13, Benjamin Young <bigbluehat@hypothes.is> wrote:
>>> On Wed, Jun 10, 2015 at 9:11 AM, Ivan Herman <ivan@w3.org> wrote:
>>>>
>>>>
>>>>> On 10 Jun 2015, at 15:03 , Stian Soiland-Reyes
>>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>>
>>>>> While I understand the desire to simplify server development, I don't
>>>>> see how it can be a big hurdle for a server implementation to generate
>>>>> Turtle from JSON-LD or vice versa as there is a plethora of RDF
>>>>> support for almost any practical programming language (which one have
>>>>> you got in mind?),
>>>>
>>>> Stian,
>>>>
>>>> I believe the issue is that this statement may not be true. I am happy to
>>>> see Java is covered (I suspected that would be the case) and so is Ruby or
>>>> Python. But my experience with support in Javascript is not that good
>>>> (although it would be feasible to write a server in node.js). I am happy if
>>>> I am proven wrong, though.
>>>
>>>
>>>
>>> The core point is less about "are there tools / libraries available" and
>>> more about "how hard is it for developers to build a server or client."
>>>
>>> Right now, it looks like if they're building a server they'll need to (at
>>> least):
>>> a) know what Turtle is
>>> b) find a tool for their language to transform it into something they can
>>> store
>>> c) know how to transform it back to Turtle (for those who ask)
>>> d) know if they've done any of that correctly (which assumes they understand
>>> graphs, transformations, etc)
>>>
>>> For folks building a client, it's less complex:
>>> a) know how to send a proper Accept header
>>> b) know that JSON-LD can be treated as "just JSON"
>>> c) know how to follow links found in HTTP headers
>>>
>>> The client ones are hopefully pretty painless for anyone who's done "AJAX"
>>> in the last half decade. ;)
>>>
>>> The server ones, though, likely don't map to "most" (...I've not got a ruler
>>> handy...) developers--especially those who have not worked with anything
>>> that thinks in graphs.
>>>
>>> In the NoSQL world (for one place), there are *loads* of databases that
>>> speak JSON on the wire and can store JSON-LD without any additional setup or
>>> work (Apache CouchDB, Basho's Riak, and MongoDB among them). CouchDB (at
>>> least) also has semantics nearly matching LDP's, and were it not for the
>>> Turtle requirement could be very simply wrapped to accept JSON-LD from an
>>> LDP client, store it, send it back when asked, and generate a container
>>> listing.
>>>
>>> I started down such a road--building a CouchApp that lives inside CouchDB
>>> with its in-database JS engine (based on SpiderMonkey).
>>> https://github.com/BigBlueHat/ldp-on-couchdb
>>>
>>> All was well, until I hit the Turtle requirement, and then I got sucked in
>>> the undertow of data transformations. :-/
>>>
>>> I intend to revisit that project soon--ignoring the Turtle requirement for
>>> now--and see how far I get. It won't be an LDP server (...so its name will
>>> eventually change...), but it will likely be a nearly matching server that
>>> would work for an Annotation API, "cost" developers little in terms of
>>> know-how to see what it's doing ("JSON goes in; JSON comes out"), and still
>>> be kind-a-sort-a close to the LDP spec (or at least as close as I can get
>>> it). :)
>>>
>>> Building LDP-based Annotation API servers on any of these other JSON stores
>>> will be similar. If there's a Turtle requirement, the implementer will have
>>> to a) care and b) know how to Do It Right (...both directions). I'm not sure
>>> that's most developers...
>>>
>>> Making Turtle optional would solve that problem (afaik). Perhaps the
>>> Annotation API looks like a limited sub-set of LDP. Perhaps it looks like an
>>> API that copied all the easy answers out of LDP's text book.
>>>
>>> Regardless, I do feel there's a good way forward, and that this group will
>>> find it. :)
>>>
>>> Thanks!
>>> Benjamin
>>> --
>>> Developer Advocate
>>> http://hypothes.is/
>>>
>>>
>>>>
>>>>
>>>> Ivan
>>>>
>>>>
>>>>> with as you are mentioning here, the option to call
>>>>> out to other binaries.
>>>>>
>>>>> As for generating, remember that N-Triples is valid Turtle and pretty
>>>>> easy to make.  The clients don't need to deal with both formats as
>>>>> they can just pick and stick with one of them.
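
To illustrate the N-Triples point above: a rough Python sketch of how
little code a serializer needs, assuming the server already holds its
data as simple (subject, predicate, object) tuples. The function names
and the tuple layout are invented for this example, and it deliberately
ignores blank nodes, datatypes, and language tags, and escapes only the
most common special characters:

```python
def escape_literal(text):
    """Escape backslash, double quote, newline and carriage return."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize (subject IRI, predicate IRI, object) tuples as N-Triples.

    Each object is a pair: ("iri", value) or ("literal", value).
    IRIs are assumed to be absolute and already valid.
    """
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "iri":
            obj = "<%s>" % value
        else:
            obj = '"%s"' % escape_literal(value)
        # One triple per line, terminated by " ." as N-Triples requires.
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj))
    return "\n".join(lines) + "\n"


print(to_ntriples([
    ("http://example.org/anno/1",
     "http://www.w3.org/ns/oa#hasTarget",
     ("iri", "http://example.org/page")),
]), end="")
```

Since the output is also valid Turtle, it could be sent as text/turtle
without any prefix handling at all.
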
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Jena includes the "riot" command line tool which can be used
>>>>> independently for say Turtle to JSON-LD or JSON-LD to Turtle, e.g.:
>>>>>
>>>>> stain@biggie-utopic:~/Downloads$ riot --output=jsonld void.ttl.gz
>>>>> {
>>>>>  "@graph" : [ {
>>>>>    "@id" : "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#",
>>>>>    "@type" : "void:DatasetDescription",
>>>>>    "description" : "This is the VoID description for a ChEMBL-RDF dataset",
>>>>>    "issued" : "2015-01-14T00:00:00.000Z",
>>>>>    "title" : "ChEMBL-RDF VoID Description",
>>>>>    "createdBy" : "http://orcid.org/0000-0002-8011-0300",
>>>>>    "createdOn" : "2009-10-28T00:00:00.000Z",
>>>>>    "lastUpdateOn" : "2015-01-14T00:00:00.000Z",
>>>>>    "previousVersion" : "http://rdf.ebi.ac.uk/dataset/chembl/19.0/void.ttl#",
>>>>>    "primaryTopic" : ":chembl_rdf_dataset"
>>>>>  }, {
>>>>> ...
>>>>>
>>>>> The auto-generated @context and @prefix are usually quite nice and
>>>>> make readable JSON-LD, except that there is a bug in that it permits
>>>>> "" as a namespace prefix.  (fixed for the next release -
>>>>> https://issues.apache.org/jira/browse/JENA-934)
>>>>>
>>>>>
>>>>> You can combine this with the JSON-LD Java tool "jsonldplayground" to
>>>>> force a custom context or frame.
>>>>> https://github.com/jsonld-java/jsonld-java
>>>>>
>>>>> It can also read/write Turtle directly - but you would have to
>>>>> manually provide a --context to style it.
>>>>>
>>>>> stain@biggie-utopic:~/src/jsonld-java$ ./jsonldplayground
>>>>> Missing required option(s) [inputFile, process]
>>>>> Option (* = required)                  Description
>>>>> ---------------------                  -----------
>>>>> --base <base URI>                      (default: )
>>>>> --context <File: The context>
>>>>> --format [RDFFormat: The output file
>>>>>  format to use. Defaults to nquads.
>>>>>  Valid values are: [turtle, rdfjson,
>>>>>  rdfxml, trig, nquads, jsonld, trix,
>>>>>  ntriples]]
>>>>> --help
>>>>> * --inputFile <File: The input file>
>>>>> --outputForm [The way to output the    (default: expanded)
>>>>>  results from fromRDF. Defaults to
>>>>>  expanded. Valid values are:
>>>>>  [compacted, expanded, flattened]]
>>>>> * --process <The processing to
>>>>>  perform. Valid values are: [expand,
>>>>>  compact, frame, normalize, flatten,
>>>>>  fromrdf, tordf]>
>>>>>
>>>>> On 10 June 2015 at 06:01, Ivan Herman <ivan@w3.org> wrote:
>>>>>> The answer I have got is less upbeat than what I hoped for. Although
>>>>>> there are tools that either exist or can be done, the quality, mainly in
>>>>>> terms of human readability, is not equal. It seems that Gregg's ruby is the
>>>>>> only one that tries to create a reasonable default context using prefixes
>>>>>> defined in Turtle (or other serialization).
>>>>>>
>>>>>> In general, there aren't any purpose-built Turtle-to-JSON-LD libraries,
>>>>>> just as you don’t find RDFa-to-Turtle libraries, it’s usually done as part
>>>>>> of a system which includes multiple components. This is the case of the
>>>>>> solution I have outlined: RDFLib is a large (Python) RDF library; it has a
>>>>>> Turtle and a JSON-LD parser and serializer, ie, it is possible to use it as
>>>>>> a transformer. I would suspect (but I am not sure) that Jena has something
>>>>>> like that for Java, for example.
>>>>>>
>>>>>> However… coming back to the original issue, and also reflecting on
>>>>>> Robert Casties, we may have to think (maybe together with the LDP group)
>>>>>> whether it is acceptable to relax the requirements somehow, and turn the
>>>>>> Turtle version into an optional feature.
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>>
>>>>>> [[
>>>>>> Mine is the only one I’m aware of that tries to create a reasonable
>>>>>> default context using prefixes defined in Turtle (or other serialization).
>>>>>> As you know, the algorithm doesn’t describe a way to construct a context
>>>>>> automatically, but neither do any other RDF serializations, which focus on
>>>>>> parsing rather than generating. Jena may do something.
>>>>>>
>>>>>> Also note that the Linked Open Vocabularies [1] group maintain a
>>>>>> JSON-LD context with prefix definitions for all the vocabularies they
>>>>>> maintain, which can be used with any JSON-LD toolchain by compacting the
>>>>>> result using one or more context URLs, but won’t get vocabulary-specific
>>>>>> term definitions. Schema’s can be used for schema.org, which is probably the
>>>>>> best recommendation there. There’s a list of other supported contexts here
>>>>>> [2].
>>>>>>
>>>>>> Note that you won’t find many purpose-built Turtle-to-JSON-LD
>>>>>> libraries, just as you don’t find RDFa-to-Turtle libraries, it’s usually
>>>>>> done as part of a system which includes multiple components. In my case,
>>>>>> it’s possible to gather prefix definitions when parsing and forward them to
>>>>>> the serializer, which is what the distiller does. Other libraries may
>>>>>> provide a similar facility.
>>>>>>
>>>>>> In the case of an Annotation API server, I suspect they’re using a
>>>>>> particular vocabulary, or at least control what they use, so they’re
>>>>>> probably in the best position to create a context to apply to the JSON-LD
>>>>>> serializer, just as they likely manage the prefixes used in the Turtle
>>>>>> serialization; this will allow more control of the way the JSON-LD is
>>>>>> shaped, as would using it as a JSON-LD Frame, and the Framing algorithm. I
>>>>>> also have a tool to construct a context given an RDFS/OWL vocabulary [3].
>>>>>>
>>>>>> Probably best to suggest he ask on #jsonld, StackOverflow or
>>>>>> public-linked-json@w3.org for other input.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 08 Jun 2015, at 20:21 , Benjamin Young <bigbluehat@hypothes.is>
>>>>>>> wrote:
>>>>>>>
>>>>>>> On Mon, Jun 8, 2015 at 1:01 PM, Ivan Herman <ivan@w3.org> wrote:
>>>>>>>
>>>>>>>> On 08 Jun 2015, at 16:21 , Benjamin Young <bigbluehat@hypothes.is>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> This email's been in my head for a while (too long probably) and this
>>>>>>>> patch thread tipped it to the point of my fingers typing what's in my brain.
>>>>>>>> :) So here goes...
>>>>>>>>
>>>>>>>> First (and foremost, maybe), I actually really like Turtle. :)
>>>>>>>>
>>>>>>>> However....it requires a way of thinking that the prevailing systems
>>>>>>>> (non-graph databases, browsers, JS runtimes, etc) don't currently think in.
>>>>>>>>
>>>>>>>> We've addressed that in the data model by preferring / promoting the
>>>>>>>> JSON-LD representation in examples--while still providing the Turtle
>>>>>>>> representation for those that support it.
>>>>>>>>
>>>>>>>> Where things fall down for me (at least) are with the protocol
>>>>>>>> specification--which, being based on LDP (which is otherwise quite fabulous)
>>>>>>>> comes with the requirement that:
>>>>>>>> http://www.w3.org/TR/ldp/#h4_ldprs-HTTP_GET
>>>>>>>>> ...MUST respond with a Turtle representation...when the request
>>>>>>>>> includes an Accept header specifying text/turtle
>>>>>>>>> ...SHOULD respond with a text/turtle...whenever the Accept request
>>>>>>>>> header is absent.
>>>>>>>>> ...MUST respond with a application/ld+json representation...when the
>>>>>>>>> request includes an Accept header specifying application/ld+json
>>>>>>>>
>>>>>>>> What this means practically (afaik) is that an Annotation API server
>>>>>>>> MUST be able to transform their stored info into both Turtle and JSON-LD
>>>>>>>> (regardless of which was sent in).
>>>>>>>>
>>>>>>>> There aren't (that I've found) terribly many Turtle-to-JSON-LD
>>>>>>>> transformation libraries. I've used this one (recently relicensed to Apache
>>>>>>>> License 2.0) with varied success:
>>>>>>>> https://github.com/warpr/turtle-to-jsonld
>>>>>>>
>>>>>>> I will ask around. I think if we allow for non-Javascript converters,
>>>>>>> too, then there are more. I know there is a JSON-LD module to RDFLib, so it
>>>>>>> is fairly easy to write a Python program to convert from one format to the
>>>>>>> other. Gregg Kellogg has a similar tool for Ruby. I think both are fairly
>>>>>>> good; the JSON-LD part was written by people from the JSON-LD group itself.
>>>>>>> I may get more info (from Gregg)
>>>>>>>
>>>>>>> (Note that I am mostly offline tomorrow, so the info may come on
>>>>>>> Wednesday only)
>>>>>>>
>>>>>>> Thanks, Ivan!
>>>>>>>
>>>>>>> Sorry if I misrepresented the "coverage area" of the tooling. The
>>>>>>> focus on JS was mostly browser-driven--which is where I currently expect
>>>>>>> most annotation clients live. Server-side stuff is more amenable of course.
>>>>>>> :)
>>>>>>>
>>>>>>> Thanks in advance for the links. It will be a useful list to reference
>>>>>>> whatever else we decide.
>>>>>>>
>>>>>>> Cheers!
>>>>>>> Benjamin
>>>>>>>
>>>>>>>
>>>>>>> ivan
>>>>>>>
>>>>>>>>
>>>>>>>> However, that (plus its dependencies) provides a transformation to a
>>>>>>>> JSON-LD format that may actually not be what one wants in the end, and then
>>>>>>>> requires yet-more transformation and more understanding of the "meta model"
>>>>>>>> by the API server (and/or database) to move between the formats.
>>>>>>>>
>>>>>>>> Here's where this hits the PATCH format options....
>>>>>>>>
>>>>>>>> On Sat, Jun 6, 2015 at 12:57 AM, Ivan Herman <ivan@w3.org> wrote:
>>>>>>>>
>>>>>>>>> On 05 Jun 2015, at 20:51 , Robert Sanderson <azaroth42@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In reading back through the discussion at the face to face about the
>>>>>>>>> protocol draft, it was noted that there are many possible patch formats,
>>>>>>>>> including LDPatch, JSON Patch, Sparql Update, diff and so on.  All would be
>>>>>>>>> possible to use, and some easier in different circumstances.
>>>>>>>>>
>>>>>>>>> Do we want to:
>>>>>>>>>
>>>>>>>>> a)  Specify one as a requirement (MUST) and let the others be usable
>>>>>>>>> (MAY)
>>>>>>>>> b)  Not specify any as a requirement and just remain silent on which
>>>>>>>>> one to use.
>>>>>>>>>
>>>>>>>>> If B is the preference, then we would need to decide how the server
>>>>>>>>> advertises which of the PATCH formats it implements so that clients can
>>>>>>>>> determine how (if at all) they can interact.
>>>>>>>>>
>>>>>>>>> My preference is A, and to pick LDPatch (by reference) as part of
>>>>>>>>> the LDP stable of specifications, but what do people think?
>>>>>>>>
>>>>>>>> My preference is also A, although the issue of advertising may still
>>>>>>>> be relevant. ('may'. We may decide not to address this issue.)
>>>>>>>>
>>>>>>>> If the database or API server you are building supports a graph-based
>>>>>>>> "meta model" then supporting the transformation between Turtle and JSON-LD
>>>>>>>> or LDPatch or anything else triple-based is "just some more code." :)
>>>>>>>>
>>>>>>>> However, if your database does not (most databases don't...even if
>>>>>>>> they "speak" JSON), then handling LDPatch is an even farther reach than
>>>>>>>> supporting Turtle.
>>>>>>>>
>>>>>>>> Here's an LDPatch example for those who are curious:
>>>>>>>> http://www.w3.org/TR/ldpatch/#full-example
>>>>>>>>
>>>>>>>>
>>>>>>>>> Benjamin suggested at the F2F a preference for JSON Patch, for
>>>>>>>>> example.
>>>>>>>>
>>>>>>>> Good memory, Rob! :)
>>>>>>>>
>>>>>>>> The preference is completely along these lines:
>>>>>>>> - most available tooling supports JSON
>>>>>>>> - "understanding" the `-LD` bit of JSON-LD is at some level
>>>>>>>> "optional" (at least for storage, basic parsing, and transportation)
>>>>>>>> - if a PATCH format is chosen, it should be equally "dumb" (in the
>>>>>>>> best possible way) ;)
>>>>>>>>
>>>>>>>> Because:
>>>>>>>> - if developers can deal with annotations as JSON (+/- the `-LD`
>>>>>>>> knowhow), they can start using annotation data now with very little
>>>>>>>> additional effort added to their stack
>>>>>>>> - if developers *want* to use PATCH, having the option of a patch
>>>>>>>> format equally as "dumb" (just dealing with keys and values, not triples),
>>>>>>>> means (again) that they can start with (nearly) what they already know and
>>>>>>>> have.
>>>>>>>>
>>>>>>>> Here's the list of examples from the JSON Patch (RFC 6902) spec:
>>>>>>>> http://tools.ietf.org/html/rfc6902#appendix-A
>>>>>>>>
>>>>>>>>
>>>>>>>> I do not have a strong feeling on whether it is json patch or
>>>>>>>> ldpatch, not really familiar with the details and certainly no experience.
>>>>>>>> I have a slight preference for JSON, however; as far as I can see, LDPatch
>>>>>>>> is based on a turtle syntax, and we did make a decision to put JSON-LD
>>>>>>>> forward as our primary syntax in the model (in view of our constituency). In
>>>>>>>> this respect JSON patch seems to be more in line with the rest.
>>>>>>>>
>>>>>>>> I suppose it comes down to "cutting with the grain" of what's already
>>>>>>>> in place--with the option to "be smarter" if you know how to be. :)
>>>>>>>>
>>>>>>>> I'm all for having Turtle as an *option* and (if I have that) also
>>>>>>>> having LDPatch as an *option.*
>>>>>>>>
>>>>>>>> However, if these become mandatory, I fear we're cutting off a large
>>>>>>>> part of the potential integration, implementation, and consuming developers.
>>>>>>>>
>>>>>>>> I'd love to see annotation data as widely used as feeds were "back in
>>>>>>>> the day."
>>>>>>>>
>>>>>>>> I think it's possible, but (at least right now) I think that means
>>>>>>>> keeping the "smarter" graph stuff as optional bits and not required
>>>>>>>> defaults--as they are in LDP.
>>>>>>>>
>>>>>>>> Ideally, we find a way to spec our Annotation API that is at once
>>>>>>>> "simple" and also LDP compatible.
>>>>>>>>
>>>>>>>> Is that feasible?
>>>>>>>>
>>>>>>>> Did any of this make sense? :)
>>>>>>>>
>>>>>>>> Thanks for listening regardless. ;)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Benjamin
>>>>>>>> --
>>>>>>>> Developer Advocate
>>>>>>>> http://hypothes.is/
>>>>>>>>
>>>>>>>>
>>>>>>>> (Maybe there is an "A-bis" possibility: require JSON and LDPatch? Or is
>>>>>>>> that too much?)
>>>>>>>>
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Rob
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Rob Sanderson
>>>>>>>>> Information Standards Advocate
>>>>>>>>> Digital Library Systems and Services
>>>>>>>>> Stanford, CA 94305
>>>>>>>>
>>>>>>>>
>>>>>>>> ----
>>>>>>>> Ivan Herman, W3C
>>>>>>>> Digital Publishing Activity Lead
>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>>> mobile: +31-641044153
>>>>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----
>>>>>>> Ivan Herman, W3C
>>>>>>> Digital Publishing Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> ----
>>>>>> Ivan Herman, W3C
>>>>>> Digital Publishing Activity Lead
>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>> mobile: +31-641044153
>>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Stian Soiland-Reyes, eScience Lab
>>>>> School of Computer Science
>>>>> The University of Manchester
>>>>> http://soiland-reyes.com/stian/work/
>>>>> http://orcid.org/0000-0001-9842-9718
>>>>
>>>>
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
>
> --
> Dr. Robert Casties -- Information Technology Group
> Max Planck Institute for the History of Science
> Boltzmannstr. 22, D-14195 Berlin
> Tel: +49/30/22667-342 Fax: -299
>



-- 
Stian Soiland-Reyes, eScience Lab
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/    http://orcid.org/0000-0001-9842-9718
Received on Wednesday, 10 June 2015 16:11:07 UTC