Re: Framing and Query from Gregg Kellogg on 2016-10-30 (public-linked-json@w3.org from October 2016)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Sun, 30 Oct 2016 08:45:19 -0700
To: George Svarovsky <gsvarovsky@idbs.com>
Cc: Linked JSON <public-linked-json@w3.org>
Message-Id: <185ABC6F-4DC5-48C6-AF31-5B473A609260@greggkellogg.net>
Hi George. It would be useful to start an issue on JSON-LD query including these ideas. This can be marked as “consider later”, but will be easier to find than as an entry on the mailing list.

Gregg Kellogg
gregg@greggkellogg.net

> On Oct 30, 2016, at 3:51 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> 
> Hi All
> 
> I've spent a few happy hours updating json-rql (https://www.npmjs.com/package/json-rql) to conform to the ideas Gregg and I briefly bounced around some weeks ago, and expanded the SPARQL conversion code to cover a number of examples in the test/sparql package. e.g.:
> 
> {
>   "@context": {
>     "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
>     "dbpedia": "http://dbpedia.org/resource/",
>     "dbpedia-owl": "http://dbpedia.org/ontology/"
>   },
>   "@construct": {
>     "@id": "?person",
>     "@type": "dbpedia-owl:Artist",
>     "dbpedia-owl:birthPlace": "?city",
>     "rdfs:label": "?name"
>   },
>   "@where": {
>     "@id": "?person",
>     "@type": "dbpedia-owl:Artist",
>     "dbpedia-owl:birthPlace": {
>       "@id": "?city",
>       "dbpedia-owl:country": {
>         "@id": "?country",
>         "rdfs:label": {"@language": "en", "@value": "Belgium"}
>       },
>       "rdfs:label": ["?cityName", {"@language": "en", "@value": "Ghent"}]
>     },
>     "rdfs:label": "?name"
>   }
> }
> 
> Note again that this idea is separate to Framing, and is intended for query use-cases where JSON might offer a more convenient syntax than direct SPARQL.
> 
> With the prevailing focus on Framing, I don't intend to move this any further forward (or elaborate more justification/examples), for now.
> 
> Best regards
> 
> George
> 
> From: Gregg Kellogg <gregg@greggkellogg.net>
> Sent: 13 October 2016 01:35
> To: George Svarovsky
> Cc: Linked JSON
> Subject: Re: Framing and Query
>  
> > On Oct 12, 2016, at 3:39 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> > 
> >>> On Oct 11, 2016, at 3:02 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> >>> 
> >>> Hi Gregg, I'm glad to be here and I hope I can be of help.
> >>> 
> >>> I've taken the liberty of renaming this thread, and capturing the main recent salient points on this topic from the previous thread:
> >>> 
> >>> Gregg >>> Additionally, the Framing algorithm [2] has proven to be important, but work on the specification was never complete, and
> >> implementations  have moved beyond what was documented in any case.
> >>> Markus >> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly)
> >> about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control
> >> the serialization of the result. Maybe we should start with a discussion on the role of framing!?
> >>> George >> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query.
> >>> Gregg > I know there has been some discussion on more sophisticated querying, but I’m not aware of any specific proposals. And, for my
> >> part, it seems to me that SPARQL Construct pretty much handles these use cases, other than for named graphs. It seems to me that trying
> >> to do something very significant could easily be a rat-hole, but it’s worth a discussion.
> >>>> 
> >>>> Another possibility I considered at one point was a JSON-LD based query specification language that would parse to the SPARQL Abstract
> >> Algebra (or simply generate SPARQL syntax), with triples derived from the JSON-LD used as the implicit dataset. This is probably more
> >> constrained, and leaves the messy query bits to a mature specification. This is significant enough, that it probably requires a specification
> >> separate from framing, and presumes that it’s the SPARQL syntax that is the issue being addressed.
> >>> 
> >>> The first internal POC I did with JSON-LD included a JSON query specification language, very closely related to a number of JSON query
> >> syntaxes such as MongoDB, FreeBase, Backbone-ORM and TaffyDB. In common with these it was deliberately limited in its capabilities,
> >> particularly for joins (ironically); but it was heavily invested in JSON-LD, effectively being a super-set with query operators. It was intended
> >> to be backed by our native Oracle schema, but it actually found more traction as an API to JSON-LD in elasticsearch.
> >>> 
> >>> I can go into more detail on that if there's interest. But in the meantime, earlier this year another POC led me to using an actual
> >> Triplestore for the first time, and I spent some happy hours fighting with constructing SPARQL in Node.js. Long story short, I ended up doing
> >> precisely what you (Gregg) just suggested :) I've shared it on GitHub and NPM [1].
> >> 
> >> The fact that the data model for JSON-LD is, in fact, RDF, makes SPARQL a natural choice for doing queries. Of course, other graph query
> >> algorithms could be adapted, but I suspect we’ll run into impedance issues, given that many of these are Property Graph based, not RDF
> >> graph. Also, SPARQL gives the opportunity to include Entailment Regimes as part of the solution space. I would probably tend to start with a
> >> more limited mapping to SPARQL Query, though.
> >> 
> >> Your JSON-RQL looks similar to what I was thinking, but I think we probably need separate @construct and @where sections, similar to how
> >> SPARQL CONSTRUCT works.
> > 
> > Just to be clear (and I'm not remotely trying to sell it), json-rql just replaces the triple representation of SPARQL.js, which is just a JSON representation of the SPARQL 1.1 AST. So it's got WHERE and CONSTRUCT at the top level. So yes, I agree, and I think a fully baked json-rql would still basically be what you're suggesting. See below.
> > 
> >> GraphQL also looks interesting, and could be a natural for JSON-LD based on its syntax. However, I’m concerned that as we go through it,
> >> we’ll find things that don’t match up as well given the RDF data model. But, there’s no reason that we would need to choose a single query
> >> mechanism, and perhaps there’s room for both GraphQL- and SPARQL-based approaches.
> >>>> I think there are several ways we could go:
> >>>> 
> >>>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on
> >> existing relationships.
> >>>> 2) Consider a way to include a variable syntax, and how this might be
> >>>> used for both matching and constructing data
> >>> 
> >>> While I'm a fan of query-by-example, I think in the general case there's too much complexity in interlacing the Query (pattern-matching
> >> existing relationships), with the Frame (the structure I want to return). Personally, I've always ended up separating these concerns in the
> >> syntax. However, I think it does come down to how powerful you want your query language to be. GraphQL [2] happily combines the two
> >> into one tree, because its query syntax is very limited, deliberately. Trying to do the full power of SPARQL in this way would surely be messy.
> >> But these languages have different, almost non-overlapping, sweet-spots--one is for building application APIs, the other for database APIs.
> >> 
> >> Indeed.
> >> 
> >>>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query
> >> operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
> >>> 
> >>> I'm not sure what you mean here. Can you elaborate?
> >> 
> >> My thought was to use SPARQL bouncing through RDF. Basically the following steps:
> >> 
> >> 1) Specify query in SPARQL, perhaps using a JSON-LD inspired syntactic variation mapping to the SPARQL Algebra.
> >> 2) Turn the JSON-LD to be “framed” into RDF, and use as the dataset against which the SPARQL query (construct) is run.
> >> 3) Serialize the constructed RDF using the format of the @construct clause hinted at above, to frame the results.
> >> 
> >> Just a wild shot at what this might look like:
> >> {
> >>  "@context": {
> >>    "dc": "http://purl.org/dc/elements/1.1/",
> >>    "ex": "http://example.org/vocab#"
> >>  },
> >>  "@construct": {
> >>    "@id": "?lib",
> >>    "@type": "ex:Library",
> >>    "ex:contains": {
> >>      "@id": "?book",
> >>      "@type": "ex:Book",
> >>      "dc:creator": "?creator",
> >>      "?bp": "?bo",
> >>      "ex:contains": {
> >>        "@id": "?chapter",
> >>        "@type": "ex:Chapter",
> >>        "?cp": "?co"
> >>      }
> >>    }
> >>  },
> >>  "@where": {
> >>    "@id": "?lib",
> >>    "@type": "ex:Library",
> >>    "ex:contains": {
> >>      "@id": "?book",
> >>      "@type": "ex:Book",
> >>      "dc:creator": "?creator",
> >>      "?bp": "?bo",
> >>      "ex:contains": {
> >>        "@id": "?chapter",
> >>        "@type": "ex:Chapter",
> >>        "?cp": "?co"
> >>      }
> >>    }
> >>  }
> >> }
> >> 
> >> 
> >> The @construct part forms a frame, where objects are repeated as necessary based on subject matches. This roughly would translate to the
> >> following SPARQL Query:
> >> 
> >> PREFIX dc: <http://purl.org/dc/elements/1.1/>
> >> PREFIX ex: <http://example.org/vocab#>
> >> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> >> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> >> 
> >> CONSTRUCT {
> >>  ?lib a ex:Library; ex:contains ?book .
> >>  ?book a ex:Book; dc:creator ?creator; ?bp ?bo; ex:contains ?chapter .
> >>  ?chapter a ex:Chapter; ?cp ?co .
> >> }
> >> WHERE {
> >>  ?lib a ex:Library; ex:contains ?book .
> >>  ?book a ex:Book; dc:creator ?creator; ?bp ?bo; ex:contains ?chapter .
> >>  ?chapter a ex:Chapter; ?cp ?co .
> >> }
> >> 
> >> Or, directly to the Algebra:
> >> 
> >> (prefix
> >> (
> >>  (dc: <http://purl.org/dc/elements/1.1/>)
> >>  (ex: <http://example.org/vocab#>)
> >>  (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
> >>  (xsd: <http://www.w3.org/2001/XMLSchema#>))
> >> (construct
> >>  (
> >>   (triple ?lib a ex:Library)
> >>   (triple ?lib ex:contains ?book)
> >>   (triple ?book a ex:Book)
> >>   (triple ?book dc:creator ?creator)
> >>   (triple ?book ?bp ?bo)
> >>   (triple ?book ex:contains ?chapter)
> >>   (triple ?chapter a ex:Chapter)
> >>   (triple ?chapter ?cp ?co))
> >>  (bgp
> >>   (triple ?lib a ex:Library)
> >>   (triple ?lib ex:contains ?book)
> >>   (triple ?book a ex:Book)
> >>   (triple ?book dc:creator ?creator)
> >>   (triple ?book ?bp ?bo)
> >>   (triple ?book ex:contains ?chapter)
> >>   (triple ?chapter a ex:Chapter)
> >>   (triple ?chapter ?cp ?co)) ))
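
The translation from a nested pattern to triple patterns can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not json-rql's actual code: the helper names toTriples and toSparql are hypothetical, every string value is assumed to be a variable or a prefixed name, @value objects become plain or language-tagged literals (datatypes are ignored), and a missing @construct falls back to the @where pattern.

```javascript
// Hypothetical helper: flatten a nested JSON-LD-style pattern into
// [subject, predicate, object] triple patterns.
function toTriples(root) {
  const triples = [];
  let blankCount = 0;
  // Nodes without an @id get a fresh blank node label.
  const idOf = (node) => node['@id'] || `_:b${blankCount++}`;

  // Render a value object as a plain or language-tagged literal.
  const literal = (v) =>
    JSON.stringify(String(v['@value'])) +
    (v['@language'] ? `@${v['@language']}` : '');

  function visit(node, subject) {
    for (const [key, value] of Object.entries(node)) {
      if (key === '@id') continue;
      const predicate = key === '@type' ? 'a' : key;
      for (const item of Array.isArray(value) ? value : [value]) {
        if (item === null || typeof item !== 'object') {
          // Assumption: strings are variables or prefixed names.
          triples.push([subject, predicate, String(item)]);
        } else if ('@value' in item) {
          triples.push([subject, predicate, literal(item)]);
        } else {
          // Nested node: link to it, then flatten it in turn.
          const child = idOf(item);
          triples.push([subject, predicate, child]);
          visit(item, child);
        }
      }
    }
  }

  visit(root, idOf(root));
  return triples;
}

// Hypothetical helper: render a query object as SPARQL CONSTRUCT text.
function toSparql(query) {
  const prefixes = Object.entries(query['@context'] || {}).map(
    ([prefix, iri]) => `PREFIX ${prefix}: <${iri}>`
  );
  const block = (pattern) =>
    toTriples(pattern).map((t) => `  ${t.join(' ')} .`).join('\n');
  const where = query['@where'];
  const construct = query['@construct'] || where;
  return [
    ...prefixes,
    `CONSTRUCT {\n${block(construct)}\n}`,
    `WHERE {\n${block(where)}\n}`,
  ].join('\n');
}

console.log(toSparql({
  '@context': { ex: 'http://example.org/vocab#' },
  '@where': {
    '@id': '?lib',
    '@type': 'ex:Library',
    'ex:contains': { '@id': '?book', '@type': 'ex:Book' },
  },
}));
```

Each nested object contributes one linking triple to its parent plus its own triples, so the nesting of the JSON pattern is what carries the joins.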
> >> 
> >> Of course, in this case, the @construct and @where bits are symmetrical, and perhaps there’s a shortcut for this case, but in general, the
> >> @construct and @where are only related via variable bindings.
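
To make the role of variable bindings concrete, here is a toy evaluator, offered as a sketch rather than a real SPARQL engine. It assumes triples are plain three-element string arrays, that any term starting with "?" is a variable, and that IRIs and literals are not distinguished; the names unify, matchBgp and construct are hypothetical.

```javascript
const isVar = (term) => typeof term === 'string' && term.startsWith('?');

// Try to extend `binding` so that `pattern` matches `triple`.
// Returns the extended binding, or null on mismatch.
function unify(pattern, triple, binding) {
  const next = { ...binding };
  for (let i = 0; i < 3; i++) {
    const p = pattern[i];
    if (isVar(p)) {
      if (p in next && next[p] !== triple[i]) return null;
      next[p] = triple[i];
    } else if (p !== triple[i]) {
      return null;
    }
  }
  return next;
}

// Evaluate a basic graph pattern by joining each triple pattern
// against the data in turn (a naive nested-loop join).
function matchBgp(patterns, data) {
  let solutions = [{}];
  for (const pattern of patterns) {
    const extended = [];
    for (const binding of solutions) {
      for (const triple of data) {
        const next = unify(pattern, triple, binding);
        if (next) extended.push(next);
      }
    }
    solutions = extended;
  }
  return solutions;
}

// Instantiate the template once per solution, dropping duplicate triples.
function construct(template, where, data) {
  const seen = new Set();
  const out = [];
  for (const binding of matchBgp(where, data)) {
    for (const pattern of template) {
      const triple = pattern.map((t) => (isVar(t) ? binding[t] : t));
      if (triple.includes(undefined)) continue; // unbound variable: skip
      const key = JSON.stringify(triple);
      if (!seen.has(key)) {
        seen.add(key);
        out.push(triple);
      }
    }
  }
  return out;
}

const data = [
  ['ex:lib1', 'a', 'ex:Library'],
  ['ex:lib1', 'ex:contains', 'ex:book1'],
  ['ex:book1', 'a', 'ex:Book'],
  ['ex:book1', 'dc:creator', 'Plato'],
];
const where = [
  ['?lib', 'a', 'ex:Library'],
  ['?lib', 'ex:contains', '?book'],
  ['?book', 'dc:creator', '?creator'],
];
const template = [['?book', 'dc:creator', '?creator']];
console.log(construct(template, where, data));
```

The template here is deliberately smaller than the where pattern: the two share nothing but the names ?book and ?creator.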
> >> 
> >> Gregg
> > 
> > Okay thanks, I get it now and I agree. I've updated json-rql to cope with CONSTRUCT (which was trivial), and my syntax looks like this:
> > 
> > {
> >            '@context' : {
> >                dc : 'http://purl.org/dc/elements/1.1/',
> >                rdf : 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
> >                xsd : 'http://www.w3.org/2001/XMLSchema#',
> >                ex : 'http://example.com/'
> >            },
> >            queryType : 'CONSTRUCT',
> >            template : {
> >                '@id' : '?lib',
> >                '@type' : 'ex:Library',
> >                'ex:contains' : {
> >                    '@id' : '?book',
> >                    '@type' : 'ex:Book',
> >                    'dc:creator' : '?creator',
> >                    '?bp' : '?bo',
> >                    'ex:contains' : {
> >                        '@id' : '?chapter',
> >                        '@type' : 'ex:Chapter',
> >                        '?cp' : '?co'
> >                    }
> >                }
> >            },
> >            where : {
> >                '@id': '?lib',
> >                '@type': 'ex:Library',
> >                'ex:contains': {
> >                    '@id': '?book',
> >                    '@type': 'ex:Book',
> >                    'dc:creator': '?creator',
> >                    '?bp': '?bo',
> >                    'ex:contains': {
> >                        '@id': '?chapter',
> >                        '@type': 'ex:Chapter',
> >                        '?cp': '?co'
> >                    }
> >                }
> >            }
> > }
> > 
> > See https://github.com/gsvarovsky/json-rql/blob/master/test/constructExampleTest.js
> > 
> > Of course, the 'template' clause is not really a JSON-LD Frame, it just translates to SPARQL in the way you suggest. However, an end-point that accepts json-rql could interpret the 'template' clause as a Frame and return the JSON-LD suitably framed.
> 
> We’re pretty much on the same page. I prefer that we use “@”-prefixed keywords for such internals as `template` and `where`, but they can always be aliased in the @context. In this case, the `queryType` might be inferred by the use of `@template` (or `@construct`). I don’t know if there are use cases here for ASK, DESCRIBE, or SELECT, given that the results are a dataset and not a result set (maybe DESCRIBE).
> 
> I would say that if there’s a template/construct, that it is used as the form of the output, to achieve framing. If there is no @construct, then the `@where` is superfluous, and the body of that can be used both for the query and for the result format/frame, which matches most current use.
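
The rule in the preceding paragraph could be sketched as a small normalization step. This is a minimal sketch with a hypothetical function name (normalizeQuery), assuming the @construct and @where keywords proposed in this thread; when there is no @construct, the @where body, or failing that the rest of the document, serves as both the pattern and the output form.

```javascript
// Hypothetical helper: split a query document into the pattern to match
// (where) and the shape of the output (construct).
function normalizeQuery(query) {
  const {
    '@context': context = {},
    '@construct': construct,
    '@where': where,
    ...rest
  } = query;

  if (construct !== undefined) {
    // Explicit template: a pattern is needed to bind its variables.
    if (where === undefined) {
      throw new Error('@construct requires an @where pattern');
    }
    return { context, construct, where };
  }

  // No template: the same body is both the query and the frame.
  const pattern = where !== undefined ? where : rest;
  return { context, construct: pattern, where: pattern };
}

// A frame-like document with no @construct or @where.
console.log(normalizeQuery({
  '@context': { ex: 'http://example.org/vocab#' },
  '@id': '?lib',
  '@type': 'ex:Library',
}));
```

With this shape, a separate queryType field becomes unnecessary for the CONSTRUCT case, since the presence of @construct already says which form is meant.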
> 
> We could also consider supporting existing `[]`, and `{}` selectors, as mapping to filter elements (@filter).
> 
> Other things to consider support for: FILTER, OPTIONAL, UNION
> 
> Perhaps not to be considered for a first pass: FROM, Property Paths, Negation/EXISTS, Group/Aggregation, Subqueries, Federation
> 
> Note that most of this is not required for GraphQL equivalence, so we might consider just the subset that allows support for GraphQL (which, note, is not itself JSON). Perhaps a simple transliteration from GraphQL to a JSON-LD framing/query form would provide this level of support at a minimum, with features added from SPARQL 1.1 as necessary.
> 
> Gregg
> 
> >>>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
> >>> 
> >>> [1] https://github.com/gsvarovsky/json-rql
> >>> [2] http://graphql.org/
> 
> 
> >>> 
> >>> -----Original Message-----
> >>> From: Gregg Kellogg [mailto:gregg@greggkellogg.net]
> >>> Sent: 10 October 2016 23:51
> >>> To: George Svarovsky <gsvarovsky@idbs.com>
> >>> Cc: Markus Lanthaler <markus.lanthaler@gmx.net>; Linked JSON
> >>> <public-linked-json@w3.org>
> >>> Subject: Re: Reactivating the CG to work on updated versions of the
> >>> specs
> >>> 
> >>>> On Oct 10, 2016, at 2:32 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> >>>> 
> >>>> Hi Markus & Gregg & everyone
> >>> 
> >>> Hi George, glad to have you! Please consider joining the Community Group [1], which simplifies IP issues.
> >>> 
> >>>> I've worked with JSON-LD since 2013, for IDBS internal POC work, including prototype APIs and indexing in elasticsearch. I'd like to make
> >> it the lingua franca of our foundational APIs going forward. So although I'm not currently a 'heavy user', I'd like to become one! and I'd be
> >> very happy to be involved in the new wave of progress.
> >>>> 
> >>>> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query. I have some
> >> thoughts, which I'll write out in a new thread.
> >>> 
> >>> I think there are several ways we could go:
> >>> 
> >>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on
> >> existing relationships.
> >>> 2) Consider a way to include a variable syntax, and how this might be
> >>> used for both matching and constructing data
> >>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query
> >> operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
> >>> 
> >>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
> >>> 
> >>>> Otherwise do let me know the best way I can help…
> >>> 
> >>> Excellent.
> >>> 
> >>>> George
> >>>> 
> >>>> George Svarovsky | Technical Director | IDBS gsvarovsky@idbs.com |
> >>>> www.idbs.com | @gsvarovsky
> >>> 
> >>> Gregg
> >>> 
> >>> [1] https://www.w3.org/community/json-ld/participants
> >>> 
> >>>> -----Original Message-----
> >>>> From: Markus Lanthaler [mailto:markus.lanthaler@gmx.net]
> >>>> Sent: 10 October 2016 09:55
> >>>> To: 'Linked JSON' <public-linked-json@w3.org>
> >>>> Subject: RE: Reactivating the CG to work on updated versions of the
> >>>> specs
> >>>> 
> >>>> It is great to see you taking the initiative on this Gregg!
> >>>> 
> >>>> On 30 Sep 2016 at 11:31, Gregg Kellogg wrote:
> >>>>> JSON-LD 1.0 and JSON-LD API 1.0 have been out and successful for many years now.
> >>>>> JSON-LD has succeeded beyond the wildest dreams of the CG, thanks to broad adoption.
> >>>> 
> >>>> Indeed!
> >>>> 
> >>>> 
> >>>>> Additionally, the Framing algorithm [2] has proven to be important,
> >>>>> but work on the specification was never complete, and
> >>>>> implementations have moved beyond what was documented in any case.
> >>>> 
> >>>> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-
> >> framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the
> >> serialization of the result. Maybe we should start with a discussion on the role of framing!?
> >>>> 
> >>>> 
> >>>>> I think it’s time to get back to these documents to create a future
> >>>>> 1.1 Community Group release of the specifications;
> >>>> 
> >>>> 1.1 sounds like minor tweaks to the existing official W3C specifications but some of the discussions and proposals I just saw go way
> >> beyond that. What do you consider to be in scope for 1.1?
> >>>> 
> >>>> 
> >>>>> At this point, I’d be happy to see active engagement on the mailing
> >>>>> list to move these issues forward; I’m prepared to do the heavy
> >>>>> lifting on the specification documents, and to maintain tests and my
> >>>>> own Ruby implementation to match. Hopefully, other implementors and
> >>>>> heavy users can actively engage in making this happen (perhaps an
> >>>>> hour a week). It may be that we’ll want to start up the bi-weekly calls we used to discuss and resolve on these issues prior to moving
> >> into the RDF WG.
> >>>> 
> >>>> I'd definitely like to help with this but unfortunately my spare cycles are quite limited.
> >>>> 
> >>>> 
> >>>> Cheers,
> >>>> Markus
> >>>> 
> >>>> 
> >>>> --
> >>>> Markus Lanthaler
> >>>> @markuslanthaler
> >>>> 
> >>>> 
> >>>> The content of this e-mail, including any attachments, is confidential and may be commercially sensitive. If you are not, or believe you
> >> may not be, the intended recipient, please advise the sender immediately by return e-mail, delete this e-mail and destroy any copies.
> >>> 
> >>> 
Received on Sunday, 30 October 2016 15:45:57 UTC