Re: Framing and Query from Gregg Kellogg on 2016-10-13 (public-linked-json@w3.org from October 2016)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Wed, 12 Oct 2016 17:35:41 -0700
To: George Svarovsky <gsvarovsky@idbs.com>
Cc: Linked JSON <public-linked-json@w3.org>
Message-Id: <A058C0D2-DB1C-4BFC-84EE-203F7426D61B@greggkellogg.net>
> On Oct 12, 2016, at 3:39 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> 
>>> On Oct 11, 2016, at 3:02 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
>>> 
>>> Hi Gregg, I'm glad to be here and I hope I can be of help.
>>> 
>>> I've taken the liberty of renaming this thread, and capturing the main recent salient points on this topic from the previous thread:
>>> 
>>> Gregg >>> Additionally, the Framing algorithm [2] has proven to be important, but work on the specification was never complete, and implementations have moved beyond what was documented in any case.
>>> Markus >> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the serialization of the result. Maybe we should start with a discussion on the role of framing!?
>>> George >> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query.
>>> Gregg > I know there has been some discussion on more sophisticated querying, but I’m not aware of any specific proposals. And, for my part, it seems to me that SPARQL Construct pretty much handles these use cases, other than for named graphs. It seems to me that trying to do something very significant could easily be a rat-hole, but it’s worth a discussion.
>>>> 
>>>> Another possibility I considered at one point was a JSON-LD based query specification language that would parse to the SPARQL Abstract Algebra (or simply generate SPARQL syntax), with triples derived from the JSON-LD used as the implicit dataset. This is probably more constrained, and leaves the messy query bits to a mature specification. This is significant enough, that it probably requires a specification separate from framing, and presumes that it’s the SPARQL syntax that is the issue being addressed.
>>> 
>>> The first internal POC I did with JSON-LD included a JSON query specification language, very closely related to a number of JSON query syntaxes such as MongoDB, FreeBase, Backbone-ORM and TaffyDB. In common with these it was deliberately limited in its capabilities, particularly for joins (ironically); but it was heavily invested in JSON-LD, effectively being a super-set with query operators. It was intended to be backed by our native Oracle schema, but it actually found more traction as an API to JSON-LD in elasticsearch.
>>> 
>>> I can go into more detail on that if there's interest. But in the meantime, earlier this year another POC led me to using an actual Triplestore for the first time, and I spent some happy hours fighting with constructing SPARQL in Node.js. Long story short, I ended up doing precisely what you (Gregg) just suggested :) I've shared it on GitHub and NPM [1].
>> 
>> The fact that the data model for JSON-LD is, in fact, RDF, makes SPARQL a natural choice for doing queries. Of course, other graph query
>> algorithms could be adapted, but I suspect we’ll run into impedance issues, given that many of these are Property Graph based, not RDF
>> graph. Also, SPARQL gives the opportunity to include Entailment Regimes as part of the solution space. I would probably tend to start with a
>> more limited mapping to SPARQL Query, though.
>> 
>> Your JSON-RQL looks similar to what I was thinking, but I think we probably need separate @construct and @where sections, similar to how
>> SPARQL CONSTRUCT works.
> 
> Just to be clear (and I'm not remotely trying to sell it), json-rql just replaces the triple representation of SPARQL.js, which is just a JSON representation of the SPARQL 1.1 AST. So it's got WHERE and CONSTRUCT at the top level. So yes, I agree, and I think a fully baked json-rql would still basically be what you're suggesting. See below.
> 
>> GraphQL also looks interesting, and could be a natural for JSON-LD based on its syntax. However, I’m concerned that as we go through it,
>> we’ll find things that don’t match up as well given the RDF data model. But, there’s no reason that we would need to choose a single query
>> mechanism, and perhaps there’s room for both GraphQL- and SPARQL-based approaches.
>>>> I think there are several ways we could go:
>>>> 
>>>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on existing relationships.
>>>> 2) Consider a way to include a variable syntax, and how this might be
>>>> used for both matching and constructing data
>>> 
>>> While I'm a fan of query-by-example, I think in the general case there's too much complexity in interlacing the Query (pattern-matching existing relationships), with the Frame (the structure I want to return). Personally, I've always ended up separating these concerns in the syntax. However, I think it does come down to how powerful you want your query language to be. GraphQL [2] happily combines the two into one tree, because its query syntax is very limited, deliberately. Trying to do the full power of SPARQL in this way would surely be messy. But these languages have different, almost non-overlapping, sweet-spots--one is for building application APIs, the other for database APIs.
>> 
>> Indeed.
>> 
>>>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
>>> 
>>> I'm not sure what you mean here. Can you elaborate?
>> 
>> My thought was to use SPARQL bouncing through RDF. Basically the following steps:
>> 
>> 1) Specify query in SPARQL, perhaps using a JSON-LD inspired syntactic variation mapping to the SPARQL Algebra.
>> 2) Turn the JSON-LD to be “framed” into RDF, and use as the dataset against which the SPARQL query (construct) is run.
>> 3) Serialize the constructed RDF using the format of the @construct clause hinted at above, to frame the results.
>> 
>> Just a wild shot at what this might look like:
>> {
>>  "@context": {
>>    "dc": "http://purl.org/dc/elements/1.1/",
>>    "ex": "http://example.org/vocab#"
>>  },
>>  "@construct": {
>>    "@id": "?lib",
>>    "@type": "ex:Library",
>>    "ex:contains": {
>>      "@id": "?book",
>>      "@type": "ex:Book",
>>      "dc:creator": "?creator",
>>      "?bp": "?bo",
>>      "ex:contains": {
>>        "@id": "?chapter",
>>        "@type": "ex:Chapter",
>>        "?cp": "?co"
>>      }
>>    }
>>  },
>>  "@where": {
>>    "@id": "?lib",
>>    "@type": "ex:Library",
>>    "ex:contains": {
>>      "@id": "?book",
>>      "@type": "ex:Book",
>>      "dc:creator": "?creator",
>>      "?bp": "?bo",
>>      "ex:contains": {
>>        "@id": "?chapter",
>>        "@type": "ex:Chapter",
>>        "?cp": "?co"
>>      }
>>    }
>>  }
>> }
>> 
>> 
>> The @construct part forms a frame, where objects are repeated as necessary based on subject matches. This roughly would translate to the
>> following SPARQL Query:
>> 
>> PREFIX dc: <http://purl.org/dc/elements/1.1/>
>> PREFIX ex: <http://example.org/vocab#>
>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
>> 
>> CONSTRUCT {
>>  ?lib a ex:Library; ex:contains ?book .
>>  ?book a ex:Book; dc:creator ?creator; ex:contains ?chapter; ?bp ?bo .
>>  ?chapter a ex:Chapter; ?cp ?co .
>> }
>> WHERE {
>>  ?lib a ex:Library; ex:contains ?book .
>>  ?book a ex:Book; dc:creator ?creator; ex:contains ?chapter; ?bp ?bo .
>>  ?chapter a ex:Chapter; ?cp ?co .
>> }
>> 
>> Or, directly to the Algebra:
>> 
>> (prefix
>> (
>>  (dc: <http://purl.org/dc/elements/1.1/>)
>>  (ex: <http://example.org/vocab#>)
>>  (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>>  (xsd: <http://www.w3.org/2001/XMLSchema#>))
>> (construct
>>  (
>>   (triple ?lib a ex:Library)
>>   (triple ?lib ex:contains ?book)
>>   (triple ?book a ex:Book)
>>   (triple ?book dc:creator ?creator)
>>   (triple ?book ex:contains ?chapter)
>>   (triple ?book ?bp ?bo)
>>   (triple ?chapter a ex:Chapter)
>>   (triple ?chapter ?cp ?co))
>>  (bgp
>>   (triple ?lib a ex:Library)
>>   (triple ?lib ex:contains ?book)
>>   (triple ?book a ex:Book)
>>   (triple ?book dc:creator ?creator)
>>   (triple ?book ex:contains ?chapter)
>>   (triple ?book ?bp ?bo)
>>   (triple ?chapter a ex:Chapter)
>>   (triple ?chapter ?cp ?co)) ))
>> 
>> Of course, in this case, the @construct and @where bits are symmetrical, and perhaps there’s a shortcut for this case, but in general, the
>> @construct and @where are only related via variable bindings.
>> 
>> Gregg
> 
> Okay thanks, I get it now and I agree. I've updated json-rql to cope with CONSTRUCT (which was trivial), and my syntax looks like this:
> 
> {
>            '@context' : {
>                dc : 'http://purl.org/dc/elements/1.1/',
>                rdf : 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
>                xsd : 'http://www.w3.org/2001/XMLSchema#',
>                ex : 'http://example.com/'
>            },
>            queryType : 'CONSTRUCT',
>            template : {
>                '@id' : '?lib',
>                '@type' : 'ex:Library',
>                'ex:contains' : {
>                    '@id' : '?book',
>                    '@type' : 'ex:Book',
>                    'dc:creator' : '?creator',
>                    '?bp' : '?bo',
>                    'ex:contains' : {
>                        '@id' : '?chapter',
>                        '@type' : 'ex:Chapter',
>                        '?cp' : '?co'
>                    }
>                }
>            },
>            where : {
>                '@id': '?lib',
>                '@type': 'ex:Library',
>                'ex:contains': {
>                    '@id': '?book',
>                    '@type': 'ex:Book',
>                    'dc:creator': '?creator',
>                    '?bp': '?bo',
>                    'ex:contains': {
>                        '@id': '?chapter',
>                        '@type': 'ex:Chapter',
>                        '?cp': '?co'
>                    }
>                }
>            }
> }
> 
> See https://github.com/gsvarovsky/json-rql/blob/master/test/constructExampleTest.js
> 
> Of course, the 'template' clause is not really a JSON-LD Frame, it just translates to SPARQL in the way you suggest. However, an end-point that accepts json-rql could interpret the 'template' clause as a Frame and return the JSON-LD suitably framed.

We’re pretty much on the same page. I prefer that we use “@”-prefixed keywords for such internals as `template` and `where`, but they can always be aliased in the @context. In this case, the `queryType` might be inferred from the use of `@template` (or `@construct`). I don’t know if there are use cases here for ASK, DESCRIBE, or SELECT, given that the results are a dataset and not a result set (maybe DESCRIBE).
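
To make the aliasing and inference idea concrete, here is a rough sketch in JavaScript. It assumes the keywords floated in this thread (`@construct`, `@template`, `@where`), none of which exist in any JSON-LD specification, and the helper names `resolveAliases` and `inferQueryType` are made up for illustration; they are not part of json-rql.

```javascript
// Keywords proposed in this thread; hypothetical, not part of any JSON-LD spec.
const QUERY_KEYWORDS = new Set(['@construct', '@template', '@where']);

// Rename top-level keys that the @context aliases to a query keyword,
// e.g. { "template": "@construct" } turns `template` into `@construct`.
function resolveAliases(query) {
  const context = query['@context'] || {};
  const resolved = {};
  for (const [key, value] of Object.entries(query)) {
    const target = context[key];
    const isAlias = typeof target === 'string' && QUERY_KEYWORDS.has(target);
    resolved[isAlias ? target : key] = value;
  }
  return resolved;
}

// Infer the query type from the keywords present instead of requiring `queryType`.
function inferQueryType(query) {
  if (query.queryType) return query.queryType; // an explicit value wins
  if ('@construct' in query || '@template' in query) return 'CONSTRUCT';
  return null; // not decidable from these keywords alone
}

const aliased = {
  '@context': { template: '@construct', where: '@where' },
  template: { '@id': '?lib' },
  where: { '@id': '?lib', '@type': 'ex:Library' }
};
const query = resolveAliases(aliased);
console.log(Object.keys(query), inferQueryType(query));
```

With something like this, George's `template` and `where` keys would keep working as long as the context maps them onto the “@”-prefixed forms.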

I would say that if there’s a template/construct, it is used as the form of the output, to achieve framing. If there is no @construct, then the `@where` is superfluous, and its body can be used both for the query and for the result format/frame, which matches most current use.
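
A minimal sketch of that rule, again in JavaScript and again with invented names (`flatten`, `splitQuery`, `toSparql`): nested node patterns are flattened into triple patterns, and when there is no `@construct` the same pattern is used for both the CONSTRUCT template and the WHERE clause. It assumes every node object has an `@id`, that `@context` holds only prefix definitions, and that every value is a variable or a compact IRI; literals, arrays and the actual framing of the result are left out.

```javascript
// Flatten a nested node pattern into [subject, predicate, object] triple patterns.
// Assumes each node object carries an '@id' (a variable such as '?book' or an IRI).
function flatten(node, triples = []) {
  const subject = node['@id'];
  for (const [key, value] of Object.entries(node)) {
    if (key === '@id') continue;
    const predicate = key === '@type' ? 'a' : key;
    if (value !== null && typeof value === 'object') {
      triples.push([subject, predicate, value['@id']]);
      flatten(value, triples); // recurse into the nested node
    } else {
      triples.push([subject, predicate, value]);
    }
  }
  return triples;
}

// Without @construct, the pattern (the @where body, or the document body itself)
// serves as both the query and the output template.
function splitQuery(query) {
  const { '@context': context = {}, '@construct': construct, '@where': where, ...body } = query;
  const pattern = where || body;
  return { context, construct: construct || pattern, where: pattern };
}

// Emit SPARQL CONSTRUCT text from the hypothetical JSON form.
function toSparql(query) {
  const { context, construct, where } = splitQuery(query);
  const prefixes = Object.entries(context).map(([prefix, iri]) => `PREFIX ${prefix}: <${iri}>`);
  const block = (pattern) => flatten(pattern).map((t) => `  ${t.join(' ')} .`).join('\n');
  return [
    ...prefixes,
    `CONSTRUCT {\n${block(construct)}\n}`,
    `WHERE {\n${block(where)}\n}`
  ].join('\n');
}

const sparql = toSparql({
  '@context': { ex: 'http://example.org/vocab#' },
  '@id': '?lib',
  '@type': 'ex:Library',
  'ex:contains': { '@id': '?book', '@type': 'ex:Book' }
});
console.log(sparql);
```

Here the input has neither `@construct` nor `@where`, so the CONSTRUCT and WHERE blocks come out identical, which is the symmetrical case from the library example.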

We could also consider supporting the existing `[]` and `{}` selectors, mapping them to filter elements (@filter).

Other things to consider support for: FILTER, OPTIONAL, UNION

Perhaps not to be considered for a first pass: FROM, Property Paths, Negation/EXISTS, Group/Aggregation, Subqueries, Federation

Note that most of this is not required for GraphQL equivalence, so we might consider just the subset that allows support for GraphQL (which, note, is not itself JSON). Perhaps a simple transliteration from GraphQL to a JSON-LD framing/query form would provide this level of support at a minimum, with features added from SPARQL 1.1 as necessary.

Gregg

>>>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
>>> 
>>> [1] https://github.com/gsvarovsky/json-rql
>>> [2] http://graphql.org/
>>> 
>>> -----Original Message-----
>>> From: Gregg Kellogg [mailto:gregg@greggkellogg.net]
>>> Sent: 10 October 2016 23:51
>>> To: George Svarovsky <gsvarovsky@idbs.com>
>>> Cc: Markus Lanthaler <markus.lanthaler@gmx.net>; Linked JSON
>>> <public-linked-json@w3.org>
>>> Subject: Re: Reactivating the CG to work on updated versions of the
>>> specs
>>> 
>>>> On Oct 10, 2016, at 2:32 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
>>>> 
>>>> Hi Markus & Gregg & everyone
>>> 
>>> Hi George, glad to have you! Please consider joining the Community Group [1], which simplifies IP issues.
>>> 
>>>> I've worked with JSON-LD since 2013, for IDBS internal POC work, including prototype APIs and indexing in elasticsearch. I'd like to make it the lingua franca of our foundational APIs going forward. So although I'm not currently a 'heavy user', I'd like to become one! and I'd be very happy to be involved in the new wave of progress.
>>>> 
>>>> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query. I have some thoughts, which I'll write out in a new thread.
>>> 
>>> I think there are several ways we could go:
>>> 
>>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on existing relationships.
>>> 2) Consider a way to include a variable syntax, and how this might be
>>> used for both matching and constructing data
>>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
>>> 
>>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
>>> 
>>>> Otherwise do let me know the best way I can help…
>>> 
>>> Excellent.
>>> 
>>>> George
>>>> 
>>>> George Svarovsky | Technical Director | IDBS gsvarovsky@idbs.com |
>>>> www.idbs.com | @gsvarovsky
>>> 
>>> Gregg
>>> 
>>> [1] https://www.w3.org/community/json-ld/participants
>>> 
>>>> -----Original Message-----
>>>> From: Markus Lanthaler [mailto:markus.lanthaler@gmx.net]
>>>> Sent: 10 October 2016 09:55
>>>> To: 'Linked JSON' <public-linked-json@w3.org>
>>>> Subject: RE: Reactivating the CG to work on updated versions of the
>>>> specs
>>>> 
>>>> It is great to see you taking the initiative on this Gregg!
>>>> 
>>>> On 30 Sep 2016 at 11:31, Gregg Kellogg wrote:
>>>>> JSON-LD 1.0 and JSON-LD API 1.0 have been out and successful for many years now.
>>>>> JSON-LD has succeeded beyond the wildest dreams of the CG, thanks to broad adoption.
>>>> 
>>>> Indeed!
>>>> 
>>>> 
>>>>> Additionally, the Framing algorithm [2] has proven to be important,
>>>>> but work on the specification was never complete, and
>>>>> implementations have moved beyond what was documented in any case.
>>>> 
>>>> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the serialization of the result. Maybe we should start with a discussion on the role of framing!?
>>>> 
>>>> 
>>>>> I think it’s time to get back to these documents to create a future
>>>>> 1.1 Community Group release of the specifications;
>>>> 
>>>> 1.1 sounds like minor tweaks to the existing official W3C specifications but some of the discussions and proposals I just saw go way beyond that. What do you consider to be in scope for 1.1?
>>>> 
>>>> 
>>>>> At this point, I’d be happy to see active engagement on the mailing
>>>>> list to move these issues forward; I’m prepared to do the heavy
>>>>> lifting on the specification documents, and to maintain tests and my
>>>>> own Ruby implementation to match. Hopefully, other implementors and
>>>>> heavy users can actively engage in making this happen (perhaps an
>>>>> hour a week). It may be that we’ll want to start up the bi-weekly calls we used to discuss and resolve on these issues prior to moving into the RDF WG.
>>>> 
>>>> I'd definitely like to help with this but unfortunately my spare cycles are quite limited.
>>>> 
>>>> 
>>>> Cheers,
>>>> Markus
>>>> 
>>>> 
>>>> --
>>>> Markus Lanthaler
>>>> @markuslanthaler
>>>> 
>>>> 
>>>> The content of this e-mail, including any attachments, is confidential and may be commercially sensitive. If you are not, or believe you may not be, the intended recipient, please advise the sender immediately by return e-mail, delete this e-mail and destroy any copies.
>>> 
>>> 
> 
Received on Thursday, 13 October 2016 00:36:28 UTC