- From: Olaf Hartig <olaf.hartig@liu.se>
- Date: Wed, 02 Sep 2020 10:04:33 +0200
- To: public-rdf-star@w3.org
- Cc: Jeen Broekstra <jb@metaphacts.com>, Holger Knublauch <holger@topquadrant.com>
Dear all,

I am intrigued by the idea to add a second option to the Turtle* syntax such that PG mode and SA mode can be explicitly distinguished from one another. In fact, now I wonder whether such a distinction can even be built into the RDF* data model (it can, see below).

The only issue with this new "PG mode option" for Turtle* is that it works only for annotations that have the annotated triple in the subject position. Take, for instance, Holger's example:

> ... instead of
>
> :bob :age 23 .
> <<:bob :age 23>> :certainty 0.9 .
>
> we can simply (alternatively) write
>
> :bob :age 23 {| :certainty 0.9 |} .

That works. However, the following Turtle* expression (assuming SA mode) cannot be written by using the proposed alternative syntax option.

   :bob :age 23 .
   :alice :disbelieves <<:bob :age 23>> .

Perhaps this limitation is not an issue. The notion of edge properties in Property Graphs has the same limitation after all. What do you think?

Ignoring this potential issue, I thought a bit about my question from above: Can such an explicit distinction between SA mode and PG mode be built into the RDF* data model itself?

Here is an idea: Take the notion of an RDF* graph as defined in my papers (i.e., a set of nested triples), but now interpreted under the SA mode assumption (i.e., a triple that appears inside another triple is not considered to be asserted, unless it is also contained directly in the RDF* graph). Now, to integrate PG mode explicitly, we may define the notion of an "annotated RDF* graph" as a pair (G,a) where G is an RDF* graph and a is a partial function that maps some (potentially nested) triples in G to finite, nonempty sets of pairs (p,o) with p an IRI and o an RDF term (IRI, blank node, or literal).

For instance, the aforementioned alternative Turtle* expression that Holger proposes can be captured by an annotated RDF* graph H=(G,a) such that

   G = { (:bob,:age,23) }
   dom(a) = { (:bob,:age,23) }
   a( (:bob,:age,23) ) = { (:certainty,0.9) } .
Clearly, this is only meant to be an abstract syntax for formalizing the (extended) RDF* data model.

A similar extension can then be defined for the abstract syntax of SPARQL. Here, the notion of a BGP* can be extended into an "annotated BGP*" that is a pair (B,a) where B is a BGP* and a is a partial function that maps some triple* patterns in B to finite, nonempty sets of pairs (p,o) with p an IRI or a variable and o an RDF term or a variable.

As an example, the annotated BGP* (B,a) with

   B = { (?x,:age,23) }
   dom(a) = { (?x,:age,23) }
   a( (?x,:age,23) ) = { (:certainty,?y) }

captures the following WHERE clause of the user-facing syntax of SPARQL* with Holger's proposed alternative writing for PG mode:

   WHERE { ?x :age 23 {| :certainty ?y |} . }

Four more observations about my idea to extend the abstract syntax of the RDF* data model as described above:

1/ Observe that the function a maps to sets of pairs (p,o) rather than to single pairs. This is necessary to be able to annotate an RDF* triple with multiple key-value pairs, as is possible with Holger's proposed extension of Turtle*:

   :bob :age 23 {| :certainty 0.9 |} .
   :bob :age 23 {| :source <http://bob.name/index.html> |} .

Now, we may even consider an option to shorten this extended Turtle* expression as follows:

   :bob :age 23 {| :certainty 0.9 ;
                   :source <http://bob.name/index.html> |} .

2/ It is not difficult to define mappings that map an annotated RDF* graph into an RDF* graph and vice versa (and, similarly, for annotated BGP*s). For instance, the annotated RDF* graph H in the example above can be mapped to the following RDF* graph:

   G' = { (:bob,:age,23),
          ((:bob,:age,23),:certainty,0.9) } .

Such mappings provide a formal foundation for SA mode systems to support Holger's proposed PG-focused extension of Turtle* and SPARQL*.
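To make the direction from annotated RDF* graphs to RDF* graphs concrete, here is a minimal Python sketch of one possible mapping, consistent with the example of H and G' above. The representation is an assumption made for illustration only: terms are opaque Python values, triples are 3-tuples that may nest, and the function name to_rdf_star is hypothetical. The message does not fix the mapping formally, so this is just one reading of it.

```python
def to_rdf_star(graph, annotations):
    """Map an annotated RDF* graph (G, a) to an RDF* graph G'.

    graph:       a set of triples (3-tuples; terms may be nested triples)
    annotations: a dict standing in for the partial function a, mapping
                 a triple to a nonempty set of (p, o) pairs
    Note: this sketch does not check that every key of `annotations`
    actually occurs in `graph`.
    """
    result = set(graph)
    for triple, pairs in annotations.items():
        if not pairs:
            raise ValueError("a must map to nonempty sets of pairs")
        for p, o in pairs:
            # Each annotation (p, o) becomes a nested triple
            # with the annotated triple in subject position.
            result.add((triple, p, o))
    return result


# The annotated RDF* graph H = (G, a) from the example above.
G = {(":bob", ":age", 23)}
a = {(":bob", ":age", 23): {(":certainty", 0.9)}}

G_prime = to_rdf_star(G, a)
```

With this representation, G_prime contains both (:bob,:age,23) and ((:bob,:age,23),:certainty,0.9), matching G' above. Under the same assumptions, the reverse direction would collect nested triples whose subject is a triple contained in the graph.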
3/ Related to these mappings, it is now also possible to introduce a notion of a "redundancy-free annotated RDF* graph" (G,a) which satisfies the constraint that none of the annotations in the function a is also captured as a nested triple in G. For instance, the aforementioned annotated RDF* graph H is redundancy-free, whereas H'=(G',a') with

   G' = { (:bob,:age,23),
          ((:bob,:age,23),:certainty,0.9) }
   dom(a') = { (:bob,:age,23) }
   a'( (:bob,:age,23) ) = { (:certainty,0.9) }

is not.

4/ Continuing on this line of thought, another possible restriction regarding the notion of annotated RDF* graphs can be the following: Given an annotated RDF* graph (G,a), if G does not contain any nested RDF* triples (i.e., G happens to be a pure RDF graph), then we call (G,a) an "annotated RDF graph." Now, this restriction may be the formal foundation for systems that support only the PG mode!

I am looking forward to hearing your thoughts about these ideas to integrate the explicit distinction between SA mode and PG mode into the abstract RDF* data model.

Thanks,
Olaf

On Wednesday, 2 September 2020 at 11:07:56 CEST, Jeen Broekstra wrote:
> > Yes, I think so and apologies if I didn't communicate this clearly.
> >
> > The point here is to *add* an alternative short cut so that instead of
> >
> > :bob :age 23 .
> >
> > <<:bob :age 23>> :certainty 0.9 .
> >
> > we can simply (alternatively) write
> >
> > :bob :age 23 {| :certainty 0.9 |} .
> >
> > This would serve as syntactic sugar for the (common) use case of both
> > asserting and annotating a triple, while still allowing free-standing
> > annotations. The short cut will not only make files significantly shorter,
> > but also make editing more user-friendly. The cost is for implementers
> > though, who would have to cover an additional case (both in parser and
> > serializer).
>
> Thanks for clarifying - in that case I think it's actually a very good
> idea. The main issue I see with supporting it is in the serialization side,
> which will be tricky to do for any streaming writer. However, we could
> support that kind of thing under the moniker of "pretty printing", which is
> already something that requires buffering anyway.
>
> Jeen
Received on Wednesday, 2 September 2020 08:04:56 UTC