Re: PG mode and SA mode from Olaf Hartig on 2019-09-20 (public-rdf-star@w3.org from September 2019)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Fri, 20 Sep 2019 09:16:12 +0000
To: Richard Cyganiak <richard@cyganiak.de>
CC: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-ID: <860296b3e85d878dc313411913150375ab8e0522.camel@liu.se>

Richard,

Thanks for your input, which is also in line with Pat's and Jeen's
feedback.

The syntax that you are sketching is interesting as well, in particular
because it does not break the way Turtle* already looks at the moment.

Olaf


On Thu, 2019-09-19 at 16:28 +0100, Richard Cyganiak wrote:
> Olaf,
> 
> Thank you for this summary.
> 
> Having two modes is not a practical solution in my opinion. How would
> interoperability be achieved if a receiver of a document or query
> does not know which mode is intended? The mode would always have to
> be communicated, either in the document or query itself, or in every
> protocol used to send or receive these artefacts. If the mode marker
> becomes detached or is overlooked, communication will fail. If some
> tool vendors decide to only support one mode, communication will
> fail. It is not a good option.
> 
> So I will consider PG and SA as different options for what the
> RDF*/Turtle*/SPARQL* languages could be once they are fully
> specified, rather than as different modes that implementations of
> these languages could operate in.
> 
> The SA option appeals more to me than the PG option, assuming a
> suitable syntax.
> 
> A very quick sketch for such a syntax:
> 
>     # assert a triple -- basic Turtle -- this produces one triple
>     :s :p :o.
> 
>     # annotate the triple :s :p :o without asserting it -- this
>     # produces one triple
>     <<:s :p :o>> :a :b.
> 
>     # assert a triple and annotate it -- this produces two triples
>     :s :p :o [[ :a :b ]].
> 
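The three statements in the sketch above can be read against the abstract
syntax as follows. This is only a minimal Python sketch under the SA
option; the function names are hypothetical, and treating `[[ ... ]]` as
shorthand for "assert and annotate" is an assumption based on the comment
"this produces two triples".

```python
# Hypothetical model: a triple is a 3-tuple; a nested triple has a
# triple in its subject position.

def plain(s, p, o):
    """':s :p :o.' asserts one triple."""
    return {(s, p, o)}

def annotate_only(s, p, o, a, b):
    """'<<:s :p :o>> :a :b.' annotates (s, p, o) without asserting it."""
    return {((s, p, o), a, b)}

def assert_and_annotate(s, p, o, a, b):
    """':s :p :o [[ :a :b ]].' is read here as the union of both forms."""
    return plain(s, p, o) | annotate_only(s, p, o, a, b)

one = plain(":s", ":p", ":o")
annotated = annotate_only(":s", ":p", ":o", ":a", ":b")
both = assert_and_annotate(":s", ":p", ":o", ":a", ":b")
```

Under this reading the counts match the comments in the sketch: one, one,
and two triples respectively.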
> Best,
> Richard
> 
> 
> > On 18 Sep 2019, at 19:53, Olaf Hartig <olaf.hartig@liu.se> wrote:
> > 
> > Dear all,
> > 
> > There has been some confusion about the modes I have mentioned in some
> > of the other threads (PG mode and SA mode). My aim with this email is
> > to clear up this confusion and to start a discussion of these modes.
> > More precisely, in this email I will describe again what I mean by
> > these modes and, thereafter, ask you to voice your opinions about the
> > modes (whether you prefer one over the other, whether both should be
> > possible to use or it would be better to continue only with one of
> > them, etc). In the following examples, to avoid getting distracted or
> > confused by details of some serialization syntax I am going to use the
> > abstract syntax of the RDF* data model and the SPARQL*
> > language---i.e., nested triples and nested triple patterns---rather
> > than a concrete serialization format such as Turtle*.
> > 
> > Now, the reason that made me come up with the idea of having two
> > separate modes in which RDF*/SPARQL* can be used is the following
> > question: Given a nested RDF* triple, say
> > 
> > t = ( (s,p,o), p2, o2 ),
> > 
> > is this nested triple meant to only make an assertion about the triple
> > t'=(s,p,o) that it contains (i.e., without asserting t') or is it
> > meant to additionally assert t'? Since both of these options may have
> > merits, I think it is worth considering which of them should be used
> > for the RDF*/SPARQL* approach or, if possible, whether it is desirable
> > to enable users to choose which one they use. So, let's have a more
> > detailed look at the two options.
> > 
> > I start with the second option, which is what we may call "using
> > RDF*/SPARQL* in Property Graph mode" (PG mode, for short). Again, when
> > using this mode, the aforementioned nested RDF* triple t is meant both
> > to make an assertion about the triple (s,p,o) and to assert that
> > triple, (s,p,o), at the same time. To make the implications of this
> > option a bit clearer, consider the following nested triple pattern
> > (which may be used in the WHERE clause of a SPARQL* query):
> > 
> > tp = ( s, p, ?v ) .
> > 
> > When using RDF*/SPARQL* in PG mode, evaluating this triple pattern
> > over the RDF* graph G that contains (only) the aforementioned nested
> > RDF* triple t results in a single solution* mapping m = {?v -> o} that
> > maps variable ?v to o.
> > 
> > At this point it may be important to emphasize that the papers that I
> > have published so far about the RDF*/SPARQL* approach assume this PG
> > mode, but without explicitly calling it that. The reason for having
> > made this choice (rather than the alternative, which I now call SA
> > mode as described below) is that my initial perspective on
> > RDF*/SPARQL* has been influenced by discussions with triplestore
> > vendors who were interested in a practical, reification-like feature
> > to capture and to query statement-level annotations. The general
> > intention was that this feature would be used in the way people use
> > the notion of edge properties in Property Graph databases (if you are
> > not familiar with Property Graphs: an "edge property" is a key-value
> > pair associated with an edge in such a graph). Then, the implicit
> > notion of using PG mode followed from this intention because, in a
> > Property Graph, to assign edge properties to an edge, the edge must
> > exist in the graph.
> > 
> > Coming back now to the aforementioned two options regarding what the
> > nested triple t could be meant to assert, there can be use cases for
> > which the first option is more suitable than the second one. Adopting
> > the first option is what we may call "using RDF*/SPARQL* in
> > separate-assertions mode" (SA mode). To reiterate, when using this
> > mode, the aforementioned nested RDF* triple t is meant to only make an
> > assertion about the triple t'=(s,p,o) without asserting t'. If we now
> > look again at the aforementioned triple pattern tp and use
> > RDF*/SPARQL* in SA mode, evaluating tp over the same RDF* graph G as
> > mentioned before (G consists only of the nested triple t) results in
> > no solution mapping at all---whereas in PG mode the result contains
> > the aforementioned mapping m = {?v -> o}.
> > 
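The evaluation described above for both modes can be illustrated with a
small Python sketch. It is not taken from any RDF*/SPARQL* implementation:
the tuple encoding of triples, the "?"-prefix convention for variables,
and the function names are all assumptions made for illustration.

```python
# Hypothetical model: a triple is a 3-tuple whose subject or object may
# itself be a triple; variables are strings starting with "?".

def asserted_triples(graph, mode):
    """Return the triples that count as asserted under the given mode."""
    asserted = set(graph)
    if mode == "PG":
        # PG mode: every triple nested inside an asserted triple is
        # asserted as well (applied recursively).
        todo = list(graph)
        while todo:
            s, _, o = todo.pop()
            for term in (s, o):
                if isinstance(term, tuple) and term not in asserted:
                    asserted.add(term)
                    todo.append(term)
    return asserted

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple; None on failure."""
    binding = dict(binding)
    for pat, val in zip(pattern, triple):
        if is_var(pat):
            if binding.setdefault(pat, val) != val:
                return None
        elif isinstance(pat, tuple) and isinstance(val, tuple):
            binding = match(pat, val, binding)
            if binding is None:
                return None
        elif pat != val:
            return None
    return binding

def evaluate(pattern, graph, mode):
    """Match the pattern against the asserted triples only."""
    return [m for t in asserted_triples(graph, mode)
            if (m := match(pattern, t, {})) is not None]

t = (("s", "p", "o"), "p2", "o2")   # the nested triple t
G = {t}                             # G contains only t
tp = ("s", "p", "?v")               # the triple pattern tp

pg_result = evaluate(tp, G, "PG")   # one mapping: ?v -> o
sa_result = evaluate(tp, G, "SA")   # no mapping
```

The only difference between the modes in this sketch is which triples are
treated as asserted; the pattern matching itself is identical.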
> > The difference between PG mode and SA mode can also be observed if we
> > consider how RDF* graphs may be converted into standard RDF graphs
> > based on the RDF reification vocabulary: If we assume RDF* is used in
> > SA mode, the set consisting of the following five RDF triples captures
> > the same information as our example RDF* triple t (where b is a fresh
> > blank node):
> > 
> > (b, rdf:type, rdf:Statement)
> > (b, rdf:subject, s)
> > (b, rdf:predicate, p)
> > (b, rdf:object, o)
> > (b, p2, o2)
> > 
> > In contrast, if RDF* is used in PG mode, our example RDF* triple t
> > would have to be converted into the following set of RDF triples,
> > which contains one additional triple (namely, the last one in the
> > following list):
> > 
> > (b, rdf:type, rdf:Statement)
> > (b, rdf:subject, s)
> > (b, rdf:predicate, p)
> > (b, rdf:object, o)
> > (b, p2, o2)
> > (s, p, o)
> > 
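The two conversions listed above can be sketched in a few lines of Python.
This is an illustrative sketch only: the function name, the string
encoding of terms, and the blank node label "_:b" are assumptions, and
only a single level of nesting in the subject position is handled.

```python
def to_reification(t, mode, b="_:b"):
    """Convert t = ((s, p, o), p2, o2) into plain RDF triples.

    b stands for the fresh blank node; a real converter would have to
    generate a new one per nested triple.
    """
    (s, p, o), p2, o2 = t
    triples = [
        (b, "rdf:type", "rdf:Statement"),
        (b, "rdf:subject", s),
        (b, "rdf:predicate", p),
        (b, "rdf:object", o),
        (b, p2, o2),
    ]
    if mode == "PG":
        # PG mode additionally asserts the nested triple itself.
        triples.append((s, p, o))
    return triples

t = (("s", "p", "o"), "p2", "o2")
sa_triples = to_reification(t, "SA")   # five triples
pg_triples = to_reification(t, "PG")   # the same five plus (s, p, o)
```

In this sketch the PG output is the SA output with one extra triple
appended, which mirrors the two lists above.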
> > I hope that these examples now clarify what I mean by using the
> > RDF*/SPARQL* approach in PG mode versus using it in SA mode.
> > 
> > My main aim in introducing the notion of these modes was to give names
> > to the two possible options as a basis for a discussion about which of
> > them should be selected for the specification of the RDF*/SPARQL*
> > approach.
> > 
> > Then, on top of that, and as a possible alternative to making the
> > strict decision of selecting only one of them for the spec, I thought
> > it might be useful to cover both of these options in the spec;
> > essentially allowing system/tool developers to decide which of them
> > they support (and indicating so in their documentation) and, in turn,
> > enabling users to employ the systems/tools that support the mode that
> > is suitable for their use case. However, baking such a choice into a
> > specification might be a bad idea, and I am interested in people's
> > opinions about this.
> > 
> > By the way, yet another alternative would be to combine the ideas of
> > both modes and enable users to make explicit for each nested triple
> > whether this nested triple is meant to represent an annotation of an
> > asserted triple (like in PG mode) or a statement about a non-asserted
> > triple (like in SA mode). While this alternative would render the
> > notion of having separate modes obsolete, it would require extending
> > the notation in the abstract syntax (and then also in concrete
> > user-focused syntaxes). So, perhaps we should not go there at this
> > point and, instead, first try to get a better understanding of the
> > pros and cons of the two separate modes.
> > 
> > Therefore, I would like to get your opinions on the following
> > questions.
> > What do you think are the merits of the PG mode versus the SA mode,
> > and vice versa?
> > Do you have a clear preference for one mode over the other?
> > What do you think about introducing both modes in a specification of
> > the RDF*/SPARQL* approach?
> > 
> > Thanks,
> > Olaf
> > 
> 
>
Received on Friday, 20 September 2019 09:16:39 UTC