RE: Framing and Query from George Svarovsky on 2016-10-12 (public-linked-json@w3.org from October 2016)

From: George Svarovsky <gsvarovsky@idbs.com>
Date: Wed, 12 Oct 2016 10:39:55 +0000
To: Gregg Kellogg <gregg@greggkellogg.net>
CC: Linked JSON <public-linked-json@w3.org>
Message-ID: <AM3PR06MB0946307D57E70CE54C2D4750A0DD0@AM3PR06MB0946.eurprd06.prod.outlook.com>
> > On Oct 11, 2016, at 3:02 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> >
> > Hi Gregg, I'm glad to be here and I hope I can be of help.
> >
> > I've taken the liberty of renaming this thread, and capturing the main recent salient points on this topic from the previous thread:
> >
> > Gregg >>> Additionally, the Framing algorithm [2] has proven to be important, but work on the specification was never complete, and
> implementations  have moved beyond what was documented in any case.
> > Markus >> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly)
> about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control
> the serialization of the result. Maybe we should start with a discussion on the role of framing!?
> > George >> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query.
> > Gregg > I know there has been some discussion on more sophisticated querying, but I’m not aware of any specific proposals. And, for my
> part, it seems to me that SPARQL Construct pretty much handles these use cases, other than for named graphs. It seems to me that trying
> to do something very significant could easily be a rat-hole, but it’s worth a discussion.
> >>
> >> Another possibility I considered at one point was a JSON-LD based query specification language that would parse to the SPARQL Abstract
> Algebra (or simply generate SPARQL syntax), with triples derived from the JSON-LD used as the implicit dataset. This is probably more
> constrained, and leaves the messy query bits to a mature specification. This is significant enough, that it probably requires a specification
> separate from framing, and presumes that it’s the SPARQL syntax that is the issue being addressed.
> >
> > The first internal POC I did with JSON-LD included a JSON query specification language, very closely related to a number of JSON query
> syntaxes such as MongoDB, FreeBase, Backbone-ORM and TaffyDB. In common with these it was deliberately limited in its capabilities,
> particularly for joins (ironically); but it was heavily invested in JSON-LD, effectively being a super-set with query operators. It was intended
> to be backed by our native Oracle schema, but it actually found more traction as an API to JSON-LD in elasticsearch.
> >
> > I can go into more detail on that if there's interest. But in the meantime, earlier this year another POC led me to using an actual
> Triplestore for the first time, and I spent some happy hours fighting with constructing SPARQL in Node.js. Long story short, I ended up doing
> precisely what you (Gregg) just suggested :) I've shared it on GitHub and NPM [1].
>
> The fact that the data model for JSON-LD is, in fact, RDF, makes SPARQL a natural choice for doing queries. Of course, other graph query
> algorithms could be adapted, but I suspect we’ll run into impedance issues, given that many of these are Property Graph based, not RDF
> graph. Also, SPARQL gives the opportunity to include Entailment Regimes as part of the solution space. I would probably tend to start with a
> more limited mapping to SPARQL Query, though.
>
> Your JSON-RQL looks similar to what I was thinking, but I think we probably need separate @construct and @where sections, similar to how
> SPARQL CONSTRUCT works.

Just to be clear (and I'm not remotely trying to sell it), json-rql just replaces the triple representation of SPARQL.js, which is just a JSON representation of the SPARQL 1.1 AST. So it's got WHERE and CONSTRUCT at the top level. So yes, I agree, and I think a fully baked json-rql would still basically be what you're suggesting. See below.

> GraphQL also looks interesting, and could be a natural for JSON-LD based on its syntax. However, I’m concerned that as we go through it,
> we’ll find things that don’t match up as well given the RDF data model. But, there’s no reason that we would need to choose a single query
> mechanism, and perhaps there’s room for both GraphQL- and SPARQL-based approaches.
> >> I think there are several ways we could go:
> >>
> >> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on
> existing relationships.
> >> 2) Consider a way to include a variable syntax, and how this might be
> >> used for both matching and constructing data
> >
> > While I'm a fan of query-by-example, I think in the general case there's too much complexity in interlacing the Query (pattern-matching
> existing relationships), with the Frame (the structure I want to return). Personally, I've always ended up separating these concerns in the
> syntax. However, I think it does come down to how powerful you want your query language to be. GraphQL [2] happily combines the two
> into one tree, because its query syntax is very limited, deliberately. Trying to do the full power of SPARQL in this way would surely be messy.
> But these languages have different, almost non-overlapping, sweet-spots--one is for building application APIs, the other for database APIs.
>
> Indeed.
>
> >> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query
> operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
> >
> > I'm not sure what you mean here. Can you elaborate?
>
> My thought was to use SPARQL bouncing through RDF. Basically the following steps:
>
> 1) Specify query in SPARQL, perhaps using a JSON-LD inspired syntactic variation mapping to the SPARQL Algebra.
> 2) Turn the JSON-LD to be “framed” into RDF, and use it as the dataset against which the SPARQL query (construct) is run.
> 3) Serialize the constructed RDF using the format of the @construct clause hinted at above, to frame the results.
>
> Just a wild shot at what this might look like:
> {
>   "@context": {
>     "dc": "http://purl.org/dc/elements/1.1/",
>     "ex": "http://example.org/vocab#"
>   },
>   "@construct": {
>     "@id": "?lib",
>     "@type": "ex:Library",
>     "ex:contains": {
>       "@id": "?book",
>       "@type": "ex:Book",
>       "dc:creator": "?creator",
>       "?bp": "?bo",
>       "ex:contains": {
>         "@id": "?chapter",
>         "@type": "ex:Chapter",
>         "?cp": "?co"
>       }
>     }
>   },
>   "@where": {
>     "@id": "?lib",
>     "@type": "ex:Library",
>     "ex:contains": {
>       "@id": "?book",
>       "@type": "ex:Book",
>       "dc:creator": "?creator",
>       "?bp": "?bo",
>       "ex:contains": {
>         "@id": "?chapter",
>         "@type": "ex:Chapter",
>         "?cp": "?co"
>       }
>     }
>   }
> }
>
>
> The @construct part forms a frame, where objects are repeated as necessary based on subject matches. This would roughly translate to the
> following SPARQL Query:
>
> PREFIX dc: <http://purl.org/dc/elements/1.1/>
> PREFIX ex: <http://example.org/vocab#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
>
> CONSTRUCT {
>   ?lib a ex:Library; ex:contains ?book .
>   ?book a ex:Book; dc:creator ?creator; ?bp ?bo; ex:contains ?chapter .
>   ?chapter a ex:Chapter; ?cp ?co .
> }
> WHERE {
>   ?lib a ex:Library; ex:contains ?book .
>   ?book a ex:Book; dc:creator ?creator; ?bp ?bo; ex:contains ?chapter .
>   ?chapter a ex:Chapter; ?cp ?co .
> }
>
> Or, directly to the Algebra:
>
> (prefix
>  (
>   (dc: <http://purl.org/dc/elements/1.1/>)
>   (ex: <http://example.org/vocab#>)
>   (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>   (xsd: <http://www.w3.org/2001/XMLSchema#>))
>  (construct
>   (
>    (triple ?lib a ex:Library)
>    (triple ?lib ex:contains ?book)
>    (triple ?book a ex:Book)
>    (triple ?book dc:creator ?creator)
>    (triple ?book ?bp ?bo)
>    (triple ?book ex:contains ?chapter)
>    (triple ?chapter a ex:Chapter)
>    (triple ?chapter ?cp ?co))
>   (bgp
>    (triple ?lib a ex:Library)
>    (triple ?lib ex:contains ?book)
>    (triple ?book a ex:Book)
>    (triple ?book dc:creator ?creator)
>    (triple ?book ?bp ?bo)
>    (triple ?book ex:contains ?chapter)
>    (triple ?chapter a ex:Chapter)
>    (triple ?chapter ?cp ?co)) ))
>
> Of course, in this case, the @construct and @where bits are symmetrical, and perhaps there’s a shortcut for this case, but in general, the
> @construct and @where are only related via variable bindings.
>
> Gregg

Okay thanks, I get it now and I agree. I've updated json-rql to cope with CONSTRUCT (which was trivial), and my syntax looks like this:

{
  '@context': {
    dc: 'http://purl.org/dc/elements/1.1/',
    rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    xsd: 'http://www.w3.org/2001/XMLSchema#',
    ex: 'http://example.com/'
  },
  queryType: 'CONSTRUCT',
  template: {
    '@id': '?lib',
    '@type': 'ex:Library',
    'ex:contains': {
      '@id': '?book',
      '@type': 'ex:Book',
      'dc:creator': '?creator',
      '?bp': '?bo',
      'ex:contains': {
        '@id': '?chapter',
        '@type': 'ex:Chapter',
        '?cp': '?co'
      }
    }
  },
  where: {
    '@id': '?lib',
    '@type': 'ex:Library',
    'ex:contains': {
      '@id': '?book',
      '@type': 'ex:Book',
      'dc:creator': '?creator',
      '?bp': '?bo',
      'ex:contains': {
        '@id': '?chapter',
        '@type': 'ex:Chapter',
        '?cp': '?co'
      }
    }
  }
}

See https://github.com/gsvarovsky/json-rql/blob/master/test/constructExampleTest.js
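
To make the mapping concrete, here is a minimal sketch of how a nested pattern like the one above could be flattened into triple patterns and printed as a SPARQL CONSTRUCT. It is not the json-rql or SPARQL.js implementation; toTriples and toConstruct are hypothetical names used only for illustration. It assumes every node carries an '@id' and every value is a variable or a compact IRI, so literals and blank nodes are not handled. Falling back to the template when 'where' is missing is also an assumption, illustrating the kind of shortcut Gregg mentions for the symmetrical case.

```javascript
// Hypothetical helper: flatten a nested JSON-LD-style pattern into
// [subject, predicate, object] triple patterns.
// Assumes every node has an '@id'; '@type' is written as the SPARQL keyword 'a'.
function toTriples(pattern, triples = []) {
  const subject = pattern['@id'];
  for (const [key, value] of Object.entries(pattern)) {
    if (key === '@id') continue;
    const predicate = key === '@type' ? 'a' : key;
    for (const v of Array.isArray(value) ? value : [value]) {
      if (v !== null && typeof v === 'object') {
        // Nested node: link to it by its '@id', then recurse into it.
        triples.push([subject, predicate, v['@id']]);
        toTriples(v, triples);
      } else {
        // Variable or compact IRI (literals are not handled in this sketch).
        triples.push([subject, predicate, String(v)]);
      }
    }
  }
  return triples;
}

// Hypothetical helper: render the query object as SPARQL CONSTRUCT text.
function toConstruct(query) {
  const prefixes = Object.entries(query['@context'])
    .map(([prefix, iri]) => `PREFIX ${prefix}: <${iri}>`);
  const block = (pattern) =>
    toTriples(pattern).map((t) => `  ${t.join(' ')} .`).join('\n');
  // Assumed shortcut: a missing 'where' means "same pattern as the template".
  const where = query.where || query.template;
  return [
    ...prefixes,
    `CONSTRUCT {\n${block(query.template)}\n}`,
    `WHERE {\n${block(where)}\n}`
  ].join('\n');
}

const exampleQuery = {
  '@context': { ex: 'http://example.com/' },
  queryType: 'CONSTRUCT',
  template: {
    '@id': '?lib',
    '@type': 'ex:Library',
    'ex:contains': { '@id': '?book', '@type': 'ex:Book', '?bp': '?bo' }
  }
};

console.log(toConstruct(exampleQuery));
```

Note that the nesting itself produces a linking triple (here ?lib ex:contains ?book), which is easy to forget when writing the SPARQL by hand.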


Of course, the 'template' clause is not really a JSON-LD Frame, it just translates to SPARQL in the way you suggest. However, an end-point that accepts json-rql could interpret the 'template' clause as a Frame and return the JSON-LD suitably framed.
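
As a rough illustration of that last idea, the sketch below substitutes one solution's variable bindings into a 'template' to produce a JSON-LD-shaped object. This is only variable substitution, not the JSON-LD Framing algorithm, and instantiate is a hypothetical name, not something json-rql provides. A real end-point would also need to merge multiple solutions by subject and decide what to do with unbound variables, which this sketch simply leaves in place.

```javascript
// Hypothetical helper: replace '?variables' in a template (keys and values)
// with the values from a single solution binding.
function instantiate(template, binding) {
  // Look a term up in the binding; anything unbound is returned unchanged.
  const sub = (term) =>
    Object.prototype.hasOwnProperty.call(binding, term) ? binding[term] : term;
  if (Array.isArray(template)) {
    return template.map((item) => instantiate(item, binding));
  }
  if (template !== null && typeof template === 'object') {
    const node = {};
    for (const [key, value] of Object.entries(template)) {
      node[sub(key)] = instantiate(value, binding);
    }
    return node;
  }
  return sub(template);
}

// Made-up binding values, purely for illustration.
const framedBook = instantiate(
  { '@id': '?book', '@type': 'ex:Book', 'dc:creator': '?creator', '?bp': '?bo' },
  { '?book': 'ex:republic', '?creator': 'Plato', '?bp': 'dc:title', '?bo': 'The Republic' }
);

console.log(JSON.stringify(framedBook, null, 2));
```

With the binding shown, the result has '@id' set to 'ex:republic' and the variable predicate '?bp' replaced by 'dc:title'.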

>
> >> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
> >
> > [1] https://github.com/gsvarovsky/json-rql

> > [2] http://graphql.org/

> >
> > -----Original Message-----
> > From: Gregg Kellogg [mailto:gregg@greggkellogg.net]
> > Sent: 10 October 2016 23:51
> > To: George Svarovsky <gsvarovsky@idbs.com>
> > Cc: Markus Lanthaler <markus.lanthaler@gmx.net>; Linked JSON
> > <public-linked-json@w3.org>
> > Subject: Re: Reactivating the CG to work on updated versions of the
> > specs
> >
> >> On Oct 10, 2016, at 2:32 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
> >>
> >> Hi Markus & Gregg & everyone
> >
> > Hi George, glad to have you! Please consider joining the Community Group [1], which simplifies IP issues.
> >
> >> I've worked with JSON-LD since 2013, for IDBS internal POC work, including prototype APIs and indexing in elasticsearch. I'd like to make
> it the lingua franca of our foundational APIs going forward. So although I'm not currently a 'heavy user', I'd like to become one! and I'd be
> very happy to be involved in the new wave of progress.
> >>
> >> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query. I have some
> thoughts, which I'll write out in a new thread.
> >
> > I think there are several ways we could go:
> >
> > 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on
> existing relationships.
> > 2) Consider a way to include a variable syntax, and how this might be
> > used for both matching and constructing data
> > 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query
> operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
> >
> > I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
> >
> >> Otherwise do let me know the best way I can help…
> >
> > Excellent.
> >
> >> George
> >>
> >> George Svarovsky | Technical Director | IDBS gsvarovsky@idbs.com |
> >> www.idbs.com | @gsvarovsky
> >
> > Gregg
> >
> > [1] https://www.w3.org/community/json-ld/participants

> >
> >> -----Original Message-----
> >> From: Markus Lanthaler [mailto:markus.lanthaler@gmx.net]
> >> Sent: 10 October 2016 09:55
> >> To: 'Linked JSON' <public-linked-json@w3.org>
> >> Subject: RE: Reactivating the CG to work on updated versions of the
> >> specs
> >>
> >> It is great to see you taking the initiative on this Gregg!
> >>
> >> On 30 Sep 2016 at 11:31, Gregg Kellogg wrote:
> >>> JSON-LD 1.0 and JSON-LD API 1.0 have been out and successful for many years now.
> >>> JSON-LD has succeeded beyond the wildest dreams of the CG, thanks to broad adoption.
> >>
> >> Indeed!
> >>
> >>
> >>> Additionally, the Framing algorithm [2] has proven to be important,
> >>> but work on the specification was never complete, and
> >>> implementations have moved beyond what was documented in any case.
> >>
> >> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-
> framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the
> serialization of the result. Maybe we should start with a discussion on the role of framing!?
> >>
> >>
> >>> I think it’s time to get back to these documents to create a future
> >>> 1.1 Community Group release of the specifications;
> >>
> >> 1.1 sounds like minor tweaks to the existing official W3C specifications but some of the discussions and proposals I just saw go way
> beyond that. What do you consider to be in scope for 1.1?
> >>
> >>
> >>> At this point, I’d be happy to see active engagement on the mailing
> >>> list to move these issues forward; I’m prepared to do the heavy
> >>> lifting on the specification documents, and to maintain tests and my
> >>> own Ruby implementation to match. Hopefully, other implementors and
> >>> heavy users can actively engage in making this happen (perhaps an
> >>> hour a week). It may be that we’ll want to start up the bi-weekly calls we used to discuss and resolve on these issues prior to moving
> into the RDF WG.
> >>
> >> I'd definitely like to help with this but unfortunately my spare cycles are quite limited.
> >>
> >>
> >> Cheers,
> >> Markus
> >>
> >>
> >> --
> >> Markus Lanthaler
> >> @markuslanthaler
> >>
> >>
> >> The content of this e-mail, including any attachments, is confidential and may be commercially sensitive. If you are not, or believe you
> may not be, the intended recipient, please advise the sender immediately by return e-mail, delete this e-mail and destroy any copies.
> >
> >

Received on Wednesday, 12 October 2016 10:40:26 UTC