- From: Jem Rayfield <jem.rayfield@ontotext.com>
- Date: Fri, 20 Sep 2019 10:48:23 +0100
- To: Olaf Hartig <olaf.hartig@liu.se>
- Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>, Jeen Broekstra <jeen.broekstra@gmail.com>
- Message-ID: <CAPPubhWiofZs97DR68RtZ8C5fXtaubaUz8ZgroKZRQBuYEbyog@mail.gmail.com>
Hello Olaf,

I don't like the idea of modes. This should be closed off, IMO.
SA mode with a suitable syntax extension (something akin to Richard's
proposal) feels like the best route forward.

GraphDB (Ontotext) will be working with Jeen (RDF4J), so I am pleased
that his thoughts lean towards SA.

Cheers,
Jem

On Fri, Sep 20, 2019 at 10:09 AM Olaf Hartig <olaf.hartig@liu.se> wrote:

> Hi,
>
> Jeen, good to know that the summary was clear ;-)
>
> I note that your response to the questions put up in my email goes in
> the same direction as Pat's response.
>
> It would be great if we could also get some feedback from vendors who
> have already implemented RDF*/SPARQL* or are considering doing so
> (Bryan? Steve? Jem?)
>
> Thanks,
> Olaf
>
>
> On Thu, 2019-09-19 at 13:59 +1000, Jeen Broekstra wrote:
> > Thanks for this clear summary Olaf, it's definitely helped me get
> > back up to speed with the thrust of the discussion so far.
> >
> > I don't think having these two separate modes baked into the language
> > specification is a good idea. Apart from anything else, you would get
> > problems in interoperability between tools. You would have to force
> > all tooling to support both possible interpretations and in addition
> > add a feature to all possible syntax carriers that makes it explicit
> > which interpretation is meant for any given document - and even then
> > there is a lot of room for user error.
> >
> > Far simpler to stick to a single interpretation. From my perspective,
> > SA seems the more logical choice here, as it doesn't rule out the
> > alternative: using SA, you can express both intentions, with the only
> > minor downside being that you have to write down the triple twice.
> > But IMO that's hardly a massive problem, it's what compression
> > algorithms are good for.
> >
> > Cheers,
> >
> > Jeen
> >
> > On Thu, Sep 19, 2019 at 4:55 AM Olaf Hartig <olaf.hartig@liu.se> wrote:
> > > Dear all,
> > >
> > > There has been some confusion about the modes I have mentioned in
> > > some of the other threads (PG mode and SA mode). My aim with this
> > > email is to clear up this confusion and to start a discussion of
> > > these modes. More precisely, in this email I will describe again
> > > what I mean by these modes and, thereafter, ask you to voice your
> > > opinions about the modes (whether you prefer one over the other,
> > > whether both should be possible to use or it would be better to
> > > continue only with one of them, etc). In the following examples, to
> > > avoid getting distracted or confused by details of some
> > > serialization syntax I am going to use the abstract syntax of the
> > > RDF* data model and the SPARQL* language---i.e., nested triples and
> > > nested triple patterns---rather than a concrete serialization
> > > format such as Turtle*.
> > >
> > > Now, the reason that made me come up with the idea of having two
> > > separate modes in which RDF*/SPARQL* can be used is the following
> > > question: Given a nested RDF* triple, say
> > >
> > > t = ( (s,p,o), p2, o2 ),
> > >
> > > is this nested triple meant to only make an assertion about the
> > > triple t'=(s,p,o) that it contains (i.e., without asserting t') or
> > > is it meant to additionally also assert t'? Since both of these
> > > options may have merits, I think it is worth considering which of
> > > them should be used for the RDF*/SPARQL* approach or, if possible,
> > > whether it is desirable to enable users to choose which one they
> > > use. So, let's have a more detailed look at the two options.
> > >
> > > I start with the second option, which is what we may call "using
> > > RDF*/SPARQL* in Property Graph mode" (PG mode, for short). Again,
> > > when using this mode, the aforementioned nested RDF* triple t is
> > > meant both to make an assertion about the triple (s,p,o) and to
> > > also assert that triple, (s,p,o), at the same time. To make the
> > > implications of this option a bit more clear, consider the
> > > following nested triple pattern (which may be used in the WHERE
> > > clause of a SPARQL* query):
> > >
> > > tp = ( s, p, ?v ) .
> > >
> > > When using RDF*/SPARQL* in PG mode, evaluating this triple pattern
> > > over the RDF* graph G that contains (only) the aforementioned
> > > nested RDF* triple t results in a single solution* mapping
> > > m = {?v -> o} that maps variable ?v to o.
> > >
> > > At this point it may be important to emphasize that the papers that
> > > I have published so far about the RDF*/SPARQL* approach assume this
> > > PG mode, but without explicitly calling it like that. The reason
> > > for having made this choice (rather than the alternative, which I
> > > now call SA mode as described below) is that my initial perspective
> > > on RDF*/SPARQL* has been influenced by discussions with triplestore
> > > vendors who were interested in a practical, reification-like
> > > feature to capture and to query statement-level annotations. The
> > > general intention was that this feature would be used in a way like
> > > people use the notion of edge properties in Property Graph
> > > databases (if you are not familiar with Property Graphs: an "edge
> > > property" is a key-value pair associated with an edge in such a
> > > graph). Then, the implicit notion of using PG mode followed from
> > > this intention because, in a Property Graph, to assign edge
> > > properties to an edge, the edge must exist in the graph.
> > >
> > > Coming back now to the aforementioned two options regarding what
> > > the nested triple t could be meant to assert, there can be use
> > > cases for which the first option is more suitable than the second
> > > one. Adopting the first option is what we may call "using
> > > RDF*/SPARQL* in separate-assertions mode" (SA mode). To reiterate,
> > > when using this mode, the aforementioned nested RDF* triple t is
> > > meant to only make an assertion about the triple t'=(s,p,o) without
> > > asserting t'. If we now look again at the aforementioned triple
> > > pattern tp and use RDF*/SPARQL* in SA mode, evaluating tp over the
> > > same RDF* graph G as mentioned before (G consists only of the
> > > nested triple t) results in no solution mapping at all---whereas in
> > > PG mode the result contains the aforementioned mapping
> > > m = {?v -> o}.
> > >
> > > The difference between PG mode and SA mode can also be observed if
> > > we consider how RDF* graphs may be converted into standard RDF
> > > graphs based on the RDF reification vocabulary: If we assume RDF*
> > > is used in SA mode, the set consisting of the following five RDF
> > > triples captures the information as captured by our example RDF*
> > > triple t (where b is a fresh blank node):
> > >
> > > (b, rdf:type, rdf:Statement)
> > > (b, rdf:subject, s)
> > > (b, rdf:predicate, p)
> > > (b, rdf:object, o)
> > > (b, p2, o2)
> > >
> > > In contrast, if RDF* is used in PG mode, our example RDF* triple t
> > > would have to be converted into the following set of RDF triples,
> > > which contains one additional triple (namely, the last one in the
> > > following list):
> > >
> > > (b, rdf:type, rdf:Statement)
> > > (b, rdf:subject, s)
> > > (b, rdf:predicate, p)
> > > (b, rdf:object, o)
> > > (b, p2, o2)
> > > (s, p, o)
> > >
> > > I hope that these examples clarify now what I mean by using the
> > > RDF*/SPARQL* approach in PG mode versus using it in SA mode.
> > >
> > > My main aim of introducing the notion of these modes was to give
> > > names to the two possible options as a basis for a discussion about
> > > which of them should be selected for the specification of the
> > > RDF*/SPARQL* approach.
> > >
> > > Then, on top of that, and as a possible alternative to making the
> > > strict decision of selecting only one of them for the spec, I
> > > thought it might be useful to cover both of these options in the
> > > spec; essentially allowing system/tool developers to decide which
> > > of them they support (and indicating so in their documentation)
> > > and, in turn, enabling users to employ the systems/tools that
> > > support the mode that is suitable for their use case. However,
> > > baking such a choice into a specification might be a bad idea, and
> > > I am interested in people's opinions about this.
> > >
> > > By the way, yet another alternative would be to combine the ideas
> > > of both modes and enable users to make explicit for each nested
> > > triple whether this nested triple is meant to represent an
> > > annotation of an asserted triple (like in PG mode) or a statement
> > > about a non-asserted triple (like in SA mode). While this
> > > alternative would render the notion of having separate modes
> > > obsolete, it would require extending the notation in the abstract
> > > syntax (and then also in concrete user-focused syntaxes). So,
> > > perhaps we should not go there at this point and, instead, first
> > > try to get a better understanding of the pros and cons of the two
> > > separate modes.
> > >
> > > Therefore, I would like to get your opinions on the following
> > > questions.
> > > What do you think are the merits of the PG mode versus the SA mode,
> > > and vice versa?
> > > Do you have a clear preference for one mode over the other?
> > > What do you think about introducing both modes in a specification
> > > of the RDF*/SPARQL* approach?
> > >
> > > Thanks,
> > > Olaf
> > >
> >

--
Jem Rayfield
Chief Architect
Ontotext AD
Received on Friday, 20 September 2019 09:51:18 UTC
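
The difference between PG mode and SA mode discussed in this thread can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not code from any RDF* implementation: the names `Triple`, `asserted`, `match`, and `reify` are hypothetical helpers, terms are plain strings, variables are strings starting with `?`, and `reify` handles only a single nested triple in the subject position, as in the example triple t.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """An RDF* triple; s and o may themselves be Triple instances."""
    s: object
    p: object
    o: object


def asserted(graph, mode):
    """Return the set of triples treated as asserted under the given mode."""
    result = set(graph)
    if mode == "PG":
        # PG mode: every nested triple is also asserted (recursively).
        stack = list(graph)
        while stack:
            t = stack.pop()
            for term in (t.s, t.o):
                if isinstance(term, Triple) and term not in result:
                    result.add(term)
                    stack.append(term)
    # SA mode: only the triples explicitly present in the graph are asserted.
    return result


def match(pattern, graph, mode):
    """Evaluate a flat (non-nested) triple pattern; return solution mappings."""
    solutions = []
    for t in asserted(graph, mode):
        binding = {}
        for pat, val in zip(pattern, t):
            if isinstance(pat, str) and pat.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(pat, val) != val:
                    break
            elif pat != val:
                break
        else:
            solutions.append(binding)
    return solutions


def reify(t, mode, bnode="b"):
    """Convert t = ((s,p,o), p2, o2) to triples using RDF reification terms."""
    inner = t.s
    triples = [
        (bnode, "rdf:type", "rdf:Statement"),
        (bnode, "rdf:subject", inner.s),
        (bnode, "rdf:predicate", inner.p),
        (bnode, "rdf:object", inner.o),
        (bnode, t.p, t.o),
    ]
    if mode == "PG":
        # PG mode additionally asserts the nested triple itself.
        triples.append(tuple(inner))
    return triples


if __name__ == "__main__":
    t = Triple(Triple("s", "p", "o"), "p2", "o2")
    G = {t}
    tp = ("s", "p", "?v")
    print("PG:", match(tp, G, "PG"))
    print("SA:", match(tp, G, "SA"))
    print("SA reification size:", len(reify(t, "SA")))
    print("PG reification size:", len(reify(t, "PG")))
```

With the example graph G containing only t, the pattern tp yields one mapping in PG mode and none in SA mode, and the PG conversion contains one triple more than the SA conversion, matching the description in the quoted message.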