Re: Framing and Query from Gregg Kellogg on 2016-10-21 (public-linked-json@w3.org from October 2016)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Fri, 21 Oct 2016 14:33:34 +0200
To: James Anderson <james@dydra.com>
Cc: Linked JSON <public-linked-json@w3.org>
Message-Id: <695E3E30-3F4A-4832-AE30-D213B1AE83C8@greggkellogg.net>
> On Oct 21, 2016, at 2:11 PM, james anderson <james@dydra.com> wrote:
> 
> good morning;
> 
>> On 2016-10-20, at 23:22, james anderson <james@dydra.com> wrote:
>> 
>> 
>>> On 2016-10-20, at 22:55, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>>> 
>>> I added my thoughts on an alternative Framing dialect in Issue #433 [1]. I think we could probably support both existing flags and the proposed syntax, but given that Framing was never really published, I’m not sure that backwards compatibility is (or should be) a consideration. I’m curious if this resonates with the community.
>> 
>> -1
> 
> please understand that as shorthand.
> a more explicit account adds nothing new, but i reiterate, none the less.
> 
> a graph framing mechanism controls
> - dominance relations between nodes
> - sequencing (intra-node)
> - identification : namespace, encoding, uniqueness, denotation scope ...
> - value encoding : optionality, representation, expansion, ...
> - in the presence of circularity (that is, for a graph v/s a tree), internode references
> 
> these issues are essential to framing.
> they are independent of the issues - selection, combination and projection which concern a query processor.
> in the context of a query processor they are at most incidental.
> 
> were there some particular combination of these concerns which yielded either expressions which were conceptually simpler or captured a process with greater capacity, or which described processes which were somehow significantly less complex to either implement or perform, then there would be reason to consider both concerns in a single language.
> 
> from this perspective, attention should be paid,
> first, to describe the framing process better than the current document,
> second, to resolve ambiguities, errors and inadequacies _with respect to framing_.
> 
> in the cases where the second goal indicates changes and/or additions to syntax, breaking changes should be avoided unless there is no alternative.
> 
> any effort beyond this scope, for example, to integrate query concerns, will yield architectural errors, consume time and resources, and fail to deliver significant advantage.

The current proposed update to framing can be found here: https://rawgit.com/json-ld/json-ld.org/issue-110-frame-matching/spec/latest/json-ld-framing/index.html. This separates the frame matching step from the bulk of the framing algorithm, which concerns itself with output generation and is otherwise pretty much the algorithm that has existed for a couple of years.
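
As a rough illustration of that separation, here is a minimal Python sketch. It is not the algorithm from the draft: it assumes the input is already a flattened map from @id to node object, it matches on @type only, and the names match_frame, generate_output and frame_graph are hypothetical.

```python
from typing import Any

Node = dict[str, Any]


def match_frame(nodes: dict[str, Node], frame: Node) -> list[str]:
    """Step 1 (matching): ids of nodes whose @type satisfies the frame.

    Assumption: only @type is considered; a frame without @type matches all.
    """
    wanted = frame.get("@type")
    if wanted is None:
        return sorted(nodes)
    wanted_set = {wanted} if isinstance(wanted, str) else set(wanted)
    matched = []
    for node_id, node in nodes.items():
        types = node.get("@type", [])
        types = [types] if isinstance(types, str) else types
        if wanted_set & set(types):
            matched.append(node_id)
    return sorted(matched)


def generate_output(nodes: dict[str, Node], node_id: str,
                    seen: frozenset = frozenset()) -> Node:
    """Step 2 (output): copy a node, embedding any nodes it references."""
    if node_id in seen:
        # Circular reference: emit a bare node reference instead of recursing.
        return {"@id": node_id}
    out: Node = {}
    for key, value in nodes[node_id].items():
        is_reference = isinstance(value, dict) and set(value) == {"@id"}
        if is_reference and value["@id"] in nodes:
            out[key] = generate_output(nodes, value["@id"], seen | {node_id})
        else:
            out[key] = value
    return out


def frame_graph(nodes: dict[str, Node], frame: Node) -> list[Node]:
    """Run matching first, then output generation for each matched node."""
    return [generate_output(nodes, node_id) for node_id in match_frame(nodes, frame)]
```

Calling frame_graph with a frame of {"@type": "ex:Library"} would select the library nodes in the first step and only then embed the books they reference in the second, so the matching rules can change without touching output generation.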

My thought was that, given the framing spec was never really finished, there might be a window for revisiting the syntax, but perhaps not.

At a minimum, we need to clean up this document and describe what implementations actually do today (such as the playground). Beyond that, there are a number of open issues to consider for framing. Once things become more stable, it needs to be enhanced with a range of examples. Working examples are also in the test-suite: http://json-ld.org/test-suite/.

James, you had previously objected to the rather code-like description of the algorithm, and this version attempts to make it more general.

Specific proposals for how to improve the wording, or improve the algorithm are always appreciated.

Gregg

> best regards, from berlin,
> 
>> 
>> if one would like to combine framing with a query mechanism, please define a clear execution model which articulates those two and defines their interaction, but leaves the query language and its interpretation as a distinct independent component.
>> 
>> this can include either an api to be implemented by language frameworks or protocol extensions, or both.
>> 
>> please do not try to stipulate support for an integrated process.
>> 
>> best regards, from berlin,
>> 
>>> 
>>> Gregg Kellogg
>>> gregg@greggkellogg.net
>>> 
>>> [1] https://github.com/json-ld/json-ld.org/issues/433
>>> 
>>>> On Oct 11, 2016, at 9:03 PM, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>>>> 
>>>>> On Oct 11, 2016, at 3:02 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
>>>>> 
>>>>> Hi Gregg, I'm glad to be here and I hope I can be of help.
>>>>> 
>>>>> I've taken the liberty of renaming this thread, and capturing the main recent salient points on this topic from the previous thread:
>>>>> 
>>>>> Gregg >>> Additionally, the Framing algorithm [2] has proven to be important, but work on the specification was never complete, and implementations have moved beyond what was documented in any case.
>>>>> Markus >> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the serialization of the result. Maybe we should start with a discussion on the role of framing!?
>>>>> George >> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query.
>>>>> Gregg > I know there has been some discussion on more sophisticated querying, but I’m not aware of any specific proposals. And, for my part, it seems to me that SPARQL Construct pretty much handles these use cases, other than for named graphs. It seems to me that trying to do something very significant could easily be a rat-hole, but it’s worth a discussion.
>>>>>> 
>>>>>> Another possibility I considered at one point was a JSON-LD based query specification language that would parse to the SPARQL Abstract Algebra (or simply generate SPARQL syntax), with triples derived from the JSON-LD used as the implicit dataset. This is probably more constrained, and leaves the messy query bits to a mature specification. This is significant enough, that it probably requires a specification separate from framing, and presumes that it’s the SPARQL syntax that is the issue being addressed.
>>>>> 
>>>>> The first internal POC I did with JSON-LD included a JSON query specification language, very closely related to a number of JSON query syntaxes such as MongoDB, FreeBase, Backbone-ORM and TaffyDB. In common with these it was deliberately limited in its capabilities, particularly for joins (ironically); but it was heavily invested in JSON-LD, effectively being a super-set with query operators. It was intended to be backed by our native Oracle schema, but it actually found more traction as an API to JSON-LD in elasticsearch.
>>>>> 
>>>>> I can go into more detail on that if there's interest. But in the meantime, earlier this year another POC led me to using an actual Triplestore for the first time, and I spent some happy hours fighting with constructing SPARQL in Node.js. Long story short, I ended up doing precisely what you (Gregg) just suggested :) I've shared it on GitHub and NPM [1].
>>>> 
>>>> The fact that the data model for JSON-LD is, in fact, RDF, makes SPARQL a natural choice for doing queries. Of course, other graph query algorithms could be adapted, but I suspect we’ll run into impedance issues, given that many of these are Property Graph based, not RDF graph. Also, SPARQL gives the opportunity to include Entailment Regimes as part of the solution space. I would probably tend to start with a more limited mapping to SPARQL Query, though.
>>>> 
>>>> Your JSON-RQL looks similar to what I was thinking, but I think we probably need separate @construct and @where sections, similar to how SPARQL CONSTRUCT works.
>>>> 
>>>> GraphQL also looks interesting, and could be a natural for JSON-LD based on its syntax. However, I’m concerned that as we go through it, we’ll find things that don’t match up as well given the RDF data model. But, there’s no reason that we would need to choose a single query mechanism, and perhaps there’s room for both GraphQL- and SPARQL-based approaches.
>>>> 
>>>>>> I think there are several ways we could go:
>>>>>> 
>>>>>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on existing relationships.
>>>>>> 2) Consider a way to include a variable syntax, and how this might be used for both matching and constructing data
>>>>> 
>>>>> While I'm a fan of query-by-example, I think in the general case there's too much complexity in interlacing the Query (pattern-matching existing relationships), with the Frame (the structure I want to return). Personally, I've always ended up separating these concerns in the syntax. However, I think it does come down to how powerful you want your query language to be. GraphQL [2] happily combines the two into one tree, because its query syntax is very limited, deliberately. Trying to do the full power of SPARQL in this way would surely be messy. But these languages have different, almost non-overlapping, sweet-spots--one is for building application APIs, the other for database APIs.
>>>> 
>>>> Indeed.
>>>> 
>>>>>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
>>>>> 
>>>>> I'm not sure what you mean here. Can you elaborate?
>>>> 
>>>> My thought was to use SPARQL, bouncing through RDF. Basically the following steps:
>>>> 
>>>> 1) Specify query in SPARQL, perhaps using a JSON-LD inspired syntactic variation mapping to the SPARQL Algebra.
>>>> 2) Turn the JSON-LD to be “framed” into RDF, and use as the dataset against which the SPARQL query (construct) is run.
>>>> 3) Serialize the constructed RDF using the format of the @construct clause hinted at above, to frame the results.
>>>> 
>>>> Just a wild shot at what this might look like:
>>>> {
>>>>  "@context": {
>>>>    "dc": "http://purl.org/dc/elements/1.1/",
>>>>    "ex": "http://example.org/vocab#"
>>>>  },
>>>>  "@construct": {
>>>>    "@id": "?lib",
>>>>    "@type": "ex:Library",
>>>>    "ex:contains": {
>>>>      "@id": "?book",
>>>>      "@type": "ex:Book",
>>>>      "dc:creator": "?creator",
>>>>      "?bp": "?bo",
>>>>      "ex:contains": {
>>>>        "@id": "?chapter",
>>>>        "@type": "ex:Chapter",
>>>>        "?cp": "?co"
>>>>      }
>>>>    }
>>>>  },
>>>>  "@where": {
>>>>    "@id": "?lib",
>>>>    "@type": "ex:Library",
>>>>    "ex:contains": {
>>>>      "@id": "?book",
>>>>      "@type": "ex:Book",
>>>>      "dc:creator": "?creator",
>>>>      "?bp": "?bo",
>>>>      "ex:contains": {
>>>>        "@id": "?chapter",
>>>>        "@type": "ex:Chapter",
>>>>        "?cp": "?co"
>>>>      }
>>>>    }
>>>>  }
>>>> }
>>>> 
>>>> 
>>>> The @construct part forms a frame, where objects are repeated as necessary based on subject matches. This roughly would translate to the following SPARQL Query:
>>>> 
>>>> PREFIX dc: <http://purl.org/dc/elements/1.1/>
>>>> PREFIX ex: <http://example.org/vocab#>
>>>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
>>>> 
>>>> CONSTRUCT {
>>>>  ?lib a ex:Library; ex:contains ?book .
>>>>  ?book a ex:Book; dc:creator ?creator; ?bp ?bo .
>>>>  ?chapter a ex:Chapter; ?cp ?co .
>>>> }
>>>> WHERE {
>>>>  ?lib a ex:Library; ex:contains ?book .
>>>>  ?book a ex:Book; dc:creator ?creator; ?bp ?bo .
>>>>  ?chapter a ex:Chapter; ?cp ?co .
>>>> }
>>>> 
>>>> Or, directly to the Algebra:
>>>> 
>>>> (prefix
>>>> (
>>>>  (dc: <http://purl.org/dc/elements/1.1/>)
>>>>  (ex: <http://example.org/vocab#>)
>>>>  (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
>>>>  (xsd: <http://www.w3.org/2001/XMLSchema#>))
>>>> (construct
>>>>  (
>>>>   (triple ?lib a ex:Library)
>>>>   (triple ?lib ex:contains ?book)
>>>>   (triple ?book a ex:Book)
>>>>   (triple ?book dc:creator ?creator)
>>>>   (triple ?book ?bp ?bo)
>>>>   (triple ?chapter a ex:Chapter)
>>>>   (triple ?chapter ?cp ?co))
>>>>  (bgp
>>>>   (triple ?lib a ex:Library)
>>>>   (triple ?lib ex:contains ?book)
>>>>   (triple ?book a ex:Book)
>>>>   (triple ?book dc:creator ?creator)
>>>>   (triple ?book ?bp ?bo)
>>>>   (triple ?chapter a ex:Chapter)
>>>>   (triple ?chapter ?cp ?co)) ))
>>>> 
>>>> Of course, in this case, the @construct and @where bits are symmetrical, and perhaps there’s a shortcut for this case, but in general, the @construct and @where are only related via variable bindings.
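
The mapping from the @construct/@where document above to SPARQL text can be sketched mechanically. This is a minimal Python sketch under stated assumptions, not a proposed implementation: the names pattern_to_triples and to_sparql are hypothetical, every string value is assumed to be either a variable ("?x") or a compact IRI (no literals), and treating a missing @construct as "reuse @where" is only one guess at the shortcut mentioned for the symmetrical case. Note that a mechanical walk of the nested pattern also yields ?book ex:contains ?chapter, which the hand-written query above leaves out.

```python
from typing import Any


def pattern_to_triples(pattern: dict[str, Any]) -> list[str]:
    """Flatten a nested pattern object into SPARQL triple patterns."""
    subject = pattern["@id"]
    triples: list[str] = []
    for key, value in pattern.items():
        if key == "@id":
            continue
        predicate = "a" if key == "@type" else key
        if isinstance(value, dict):
            # Nested object: link to its @id, then emit its own triples.
            triples.append(f"{subject} {predicate} {value['@id']} .")
            triples.extend(pattern_to_triples(value))
        else:
            # Assumption: value is a variable or a compact IRI, never a literal.
            triples.append(f"{subject} {predicate} {value} .")
    return triples


def to_sparql(query: dict[str, Any]) -> str:
    """Render a {@context, @construct, @where} document as a CONSTRUCT query."""
    prefixes = [f"PREFIX {prefix}: <{iri}>"
                for prefix, iri in query.get("@context", {}).items()]
    where = query["@where"]
    # Hypothetical shortcut: an omitted @construct mirrors @where.
    construct = query.get("@construct", where)

    def block(keyword: str, pattern: dict[str, Any]) -> str:
        body = "\n".join("  " + triple for triple in pattern_to_triples(pattern))
        return f"{keyword} {{\n{body}\n}}"

    return "\n".join(prefixes + [block("CONSTRUCT", construct), block("WHERE", where)])
```

The two sections stay independent inputs to to_sparql and are related only through the variable names they share, which matches the point about variable bindings.
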
>>>> 
>>>> Gregg
>>>> 
>>>>>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
>>>>> 
>>>>> [1] https://github.com/gsvarovsky/json-rql
>>>>> [2] http://graphql.org/
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Gregg Kellogg [mailto:gregg@greggkellogg.net]
>>>>> Sent: 10 October 2016 23:51
>>>>> To: George Svarovsky <gsvarovsky@idbs.com>
>>>>> Cc: Markus Lanthaler <markus.lanthaler@gmx.net>; Linked JSON <public-linked-json@w3.org>
>>>>> Subject: Re: Reactivating the CG to work on updated versions of the specs
>>>>> 
>>>>>> On Oct 10, 2016, at 2:32 AM, George Svarovsky <gsvarovsky@idbs.com> wrote:
>>>>>> 
>>>>>> Hi Markus & Gregg & everyone
>>>>> 
>>>>> Hi George, glad to have you! Please consider joining the Community Group [1], which simplifies IP issues.
>>>>> 
>>>>>> I've worked with JSON-LD since 2013, for IDBS internal POC work, including prototype APIs and indexing in elasticsearch. I'd like to make it the lingua franca of our foundational APIs going forward. So although I'm not currently a 'heavy user', I'd like to become one! and I'd be very happy to be involved in the new wave of progress.
>>>>>> 
>>>>>> I have a particular interest in framing, and I concur with Markus that what I actually want is (some degree of) graph query. I have some thoughts, which I'll write out in a new thread.
>>>>> 
>>>>> I think there are several ways we could go:
>>>>> 
>>>>> 1) Improve framing based on the existing algorithms which provide some degree of manipulating and limiting the framed data based on existing relationships.
>>>>> 2) Consider a way to include a variable syntax, and how this might be used for both matching and constructing data
>>>>> 3) Consider the implications of using SPARQL via de-serialization from JSON-LD to the RDF data model, performing a SPARQL query operation, and re-serializing back to JSON-LD and framing using some variation of the existing algorithms.
>>>>> 
>>>>> I’m certainly interested in hearing suggestions on other approaches, along with some use cases/examples.
>>>>> 
>>>>>> Otherwise do let me know the best way I can help…
>>>>> 
>>>>> Excellent.
>>>>> 
>>>>>> George
>>>>>> 
>>>>>> George Svarovsky | Technical Director | IDBS gsvarovsky@idbs.com |
>>>>>> www.idbs.com | @gsvarovsky
>>>>> 
>>>>> Gregg
>>>>> 
>>>>> [1] https://www.w3.org/community/json-ld/participants
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Markus Lanthaler [mailto:markus.lanthaler@gmx.net]
>>>>>> Sent: 10 October 2016 09:55
>>>>>> To: 'Linked JSON' <public-linked-json@w3.org>
>>>>>> Subject: RE: Reactivating the CG to work on updated versions of the
>>>>>> specs
>>>>>> 
>>>>>> It is great to see you taking the initiative on this Gregg!
>>>>>> 
>>>>>> On 30 Sep 2016 at 11:31, Gregg Kellogg wrote:
>>>>>>> JSON-LD 1.0 and JSON-LD API 1.0 have been out and successful for many years now.
>>>>>>> JSON-LD has succeeded beyond the wildest dreams of the CG, thanks to broad adoption.
>>>>>> 
>>>>>> Indeed!
>>>>>> 
>>>>>> 
>>>>>>> Additionally, the Framing algorithm [2] has proven to be important,
>>>>>>> but work on the specification was never complete, and implementations
>>>>>>> have moved beyond what was documented in any case.
>>>>>> 
>>>>>> It is certainly handy but I'm not sure there's agreement on what exactly it should be. Initially it was just (or at least mostly) about re-framing an existing graph... I think what a lot of people (myself included) actually want and need is to query a graph and control the serialization of the result. Maybe we should start with a discussion on the role of framing!?
>>>>>> 
>>>>>> 
>>>>>>> I think it’s time to get back to these documents to create a future
>>>>>>> 1.1 Community Group release of the specifications;
>>>>>> 
>>>>>> 1.1 sounds like minor tweaks to the existing official W3C specifications but some of the discussions and proposals I just saw go way beyond that. What do you consider to be in scope for 1.1?
>>>>>> 
>>>>>> 
>>>>>>> At this point, I’d be happy to see active engagement on the mailing
>>>>>>> list to move these issues forward; I’m prepared to do the heavy
>>>>>>> lifting on the specification documents, and to maintain tests and my
>>>>>>> own Ruby implementation to match. Hopefully, other implementors and
>>>>>>> heavy users can actively engage in making this happen (perhaps an
>>>>>>> hour a week). It may be that we’ll want to start up the bi-weekly calls we used to discuss and resolve on these issues prior to moving into the RDF WG.
>>>>>> 
>>>>>> I'd definitely like to help with this but unfortunately my spare cycles are quite limited.
>>>>>> 
>>>>>> 
>>>>>> Cheers,
>>>>>> Markus
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Markus Lanthaler
>>>>>> @markuslanthaler
>>>>>> 
>>>>>> 
>>>>>> The content of this e-mail, including any attachments, is confidential and may be commercially sensitive. If you are not, or believe you may not be, the intended recipient, please advise the sender immediately by return e-mail, delete this e-mail and destroy any copies.
>>>>> 
>>> 
>> 
>> ---
>> james anderson | james@dydra.com | http://dydra.com
>> 
>> 
>> 
>> 
>> 
> 
> ---
> james anderson | james@dydra.com | http://dydra.com
Received on Friday, 21 October 2016 12:34:14 UTC