- From: Jeen Broekstra <jeen.broekstra@gmail.com>
- Date: Thu, 19 Sep 2019 13:59:53 +1000
- To: Olaf Hartig <olaf.hartig@liu.se>
- Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
- Message-ID: <CANyF_kHU1RgB=OdgtUmAuC0P=82gFnoki5nOVakoE9r08cEVxw@mail.gmail.com>
Thanks for this clear summary Olaf, it's definitely helped me get back up to speed with the thrust of the discussion so far.

I don't think having these two separate modes baked into the language specification is a good idea. Apart from anything else, you would get interoperability problems between tools. You would have to force all tooling to support both possible interpretations and, in addition, add a feature to all possible syntax carriers that makes it explicit which interpretation is meant for any given document. Even then there is a lot of room for user error. Far simpler to stick to a single interpretation.

From my perspective, SA seems the more logical choice here, as it doesn't rule out the alternative: using SA, you can express both intentions, with the only minor downside being that you have to write down the triple twice. But IMO that's hardly a massive problem, it's what compression algorithms are good for.

Cheers,

Jeen

On Thu, Sep 19, 2019 at 4:55 AM Olaf Hartig <olaf.hartig@liu.se> wrote:

> Dear all,
>
> There has been some confusion about the modes I have mentioned in some of the other threads (PG mode and SA mode). My aim with this email is to clear up this confusion and to start a discussion of these modes. More precisely, in this email I will describe again what I mean by these modes and, thereafter, ask you to voice your opinions about the modes (whether you prefer one over the other, whether both should be possible to use or it would be better to continue only with one of them, etc). In the following examples, to avoid getting distracted or confused by details of some serialization syntax I am going to use the abstract syntax of the RDF* data model and the SPARQL* language---i.e., nested triples and nested triple patterns---rather than a concrete serialization format such as Turtle*.
>
> Now, the reason that made me come up with the idea of having two separate modes in which RDF*/SPARQL* can be used is the following question: Given a nested RDF* triple, say
>
>    t = ( (s,p,o), p2, o2 ),
>
> is this nested triple meant to only make an assertion about the triple t'=(s,p,o) that it contains (i.e., without asserting t') or is it meant to additionally also assert t'? Since both of these options may have merits, I think it is worth considering which of them should be used for the RDF*/SPARQL* approach or, if possible, whether it is desirable to enable users to choose which one they use. So, let's have a more detailed look at the two options.
>
> I start with the second option, which is what we may call "using RDF*/SPARQL* in Property Graph mode" (PG mode, for short). Again, when using this mode, the aforementioned nested RDF* triple t is meant both to make an assertion about the triple (s,p,o) and to also assert that triple, (s,p,o), at the same time. To make the implications of this option a bit more clear, consider the following nested triple pattern (which may be used in the WHERE clause of a SPARQL* query):
>
>    tp = ( s, p, ?v ) .
>
> When using RDF*/SPARQL* in PG mode, evaluating this triple pattern over the RDF* graph G that contains (only) the aforementioned nested RDF* triple t results in a single solution* mapping m = {?v -> o} that maps variable ?v to o.
>
> At this point it may be important to emphasize that the papers that I have published so far about the RDF*/SPARQL* approach assume this PG mode, but without explicitly calling it like that. The reason for having made this choice (rather than the alternative, which I now call SA mode as described below) is that my initial perspective on RDF*/SPARQL* has been influenced by discussions with triplestore vendors who were interested in a practical, reification-like feature to capture and to query statement-level annotations.
> The general intention was that this feature would be used in a way like people use the notion of edge properties in Property Graph databases (if you are not familiar with Property Graphs: an "edge property" is a key-value pair associated with an edge in such a graph). Then, the implicit notion of using PG mode followed from this intention because, in a Property Graph, to assign edge properties to an edge, the edge must exist in the graph.
>
> Coming back now to the aforementioned two options regarding what the nested triple t could be meant to assert, there can be use cases for which the first option is more suitable than the second one. Adopting the first option is what we may call "using RDF*/SPARQL* in separate-assertions mode" (SA mode). To reiterate, when using this mode, the aforementioned nested RDF* triple t is meant to only make an assertion about the triple t'=(s,p,o) without asserting t'. If we now look again at the aforementioned triple pattern tp and use RDF*/SPARQL* in SA mode, evaluating tp over the same RDF* graph G as mentioned before (G consists only of the nested triple t) results in no solution mapping at all---whereas in PG mode the result contains the aforementioned mapping m = {?v -> o}.
>
> The difference between PG mode and SA mode can also be observed if we consider how RDF* graphs may be converted into standard RDF graphs based on the RDF reification vocabulary: If we assume RDF* is used in SA mode, the set consisting of the following five RDF triples captures the information as captured by our example RDF* triple t (where b is a fresh blank node):
>
>    (b, rdf:type, rdf:Statement)
>    (b, rdf:subject, s)
>    (b, rdf:predicate, p)
>    (b, rdf:object, o)
>    (b, p2, o2)
>
> In contrast, if RDF* is used in PG mode, our example RDF* triple t would have to be converted into the following set of RDF triples, which contains one additional triple (namely, the last one in the following list):
>
>    (b, rdf:type, rdf:Statement)
>    (b, rdf:subject, s)
>    (b, rdf:predicate, p)
>    (b, rdf:object, o)
>    (b, p2, o2)
>    (s, p, o)
>
> I hope that these examples clarify now what I mean by using the RDF*/SPARQL* approach in PG mode versus using it in SA mode.
>
> My main aim of introducing the notion of these modes was to give names to the two possible options as a basis for a discussion about which of them should be selected for the specification of the RDF*/SPARQL* approach.
>
> Then, on top of that, and as a possible alternative to making the strict decision of selecting only one of them for the spec, I thought it might be useful to cover both of these options in the spec; essentially allowing system/tool developers to decide which of them they support (and indicating so in their documentation) and, in turn, enabling users to employ the systems/tools that support the mode that is suitable for their use case. However, baking such a choice into a specification might be a bad idea, and I am interested in people's opinions about this.
>
> By the way, yet another alternative would be to combine the ideas of both modes and enable users to make explicit for each nested triple whether this nested triple is meant to represent an annotation of an asserted triple (like in PG mode) or a statement about a non-asserted triple (like in SA mode). While this alternative would render the notion of having separate modes obsolete, it would require extending the notation in the abstract syntax (and then also in concrete user-focused syntaxes). So, perhaps we should not go there at this point and, instead, first try to get a better understanding of the pros and cons of the two separate modes.
>
> Therefore, I would like to get your opinions on the following questions.
> What do you think are the merits of the PG mode versus the SA mode, and vice versa?
> Do you have a clear preference for one mode over the other?
> What do you think about introducing both modes in a specification of the RDF*/SPARQL* approach?
>
> Thanks,
> Olaf
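For readers who want to try the two modes from the quoted message, here is a minimal Python sketch. It is an illustration under stated assumptions, not any RDF* implementation: terms are plain strings, a nested triple is a Python tuple, and the helper names (asserted_triples, evaluate, reify) are made up for this example. Only non-nested triple patterns and one level of nesting in the subject position are handled.

```python
from itertools import count

def is_var(term):
    # Assumption of this sketch: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def asserted_triples(graph, mode):
    """Triples a pattern may match. In PG mode, nested triples count as asserted too."""
    result = set(graph)
    if mode == "PG":
        stack = list(graph)
        while stack:
            s, _, o = stack.pop()
            for term in (s, o):
                if isinstance(term, tuple) and term not in result:
                    result.add(term)
                    stack.append(term)
    return result

def evaluate(pattern, graph, mode):
    """Evaluate one non-nested triple pattern; returns a list of solution mappings."""
    solutions = []
    for triple in asserted_triples(graph, mode):
        mapping = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                if mapping.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            solutions.append(mapping)
    return solutions

def reify(graph, mode):
    """Convert to plain triples via the RDF reification vocabulary (subject nesting only)."""
    out, fresh = set(), count()
    for s, p, o in graph:
        if isinstance(s, tuple):
            b = f"_:b{next(fresh)}"  # fresh blank node label
            out |= {
                (b, "rdf:type", "rdf:Statement"),
                (b, "rdf:subject", s[0]),
                (b, "rdf:predicate", s[1]),
                (b, "rdf:object", s[2]),
                (b, p, o),
            }
            if mode == "PG":
                out.add(s)  # PG mode also asserts the nested triple itself
        else:
            out.add((s, p, o))
    return out

t = (("s", "p", "o"), "p2", "o2")   # the nested triple t from the message
G = {t}                             # graph containing only t
tp = ("s", "p", "?v")               # the triple pattern tp

pg_solutions = evaluate(tp, G, "PG")
sa_solutions = evaluate(tp, G, "SA")
# Writing the triple down twice recovers the PG reading under SA.
sa_explicit = evaluate(tp, G | {("s", "p", "o")}, "SA")
```

In this sketch, pg_solutions holds the single mapping of ?v to o, sa_solutions is empty, and sa_explicit matches the PG result once (s, p, o) is stated separately, which is the "write down the triple twice" trade-off mentioned in the reply. Likewise, reify(G, "SA") yields five triples and reify(G, "PG") yields six.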
Received on Thursday, 19 September 2019 04:00:28 UTC