- From: Olaf Hartig <olaf.hartig@liu.se>
- Date: Wed, 02 Sep 2020 10:04:33 +0200
- To: public-rdf-star@w3.org
- Cc: Jeen Broekstra <jb@metaphacts.com>, Holger Knublauch <holger@topquadrant.com>
Dear all,
I am intrigued by the idea to add a second option to the Turtle* syntax such
that PG mode and SA mode can be explicitly distinguished from one another. In
fact, now I wonder whether such a distinction can even be built into the RDF*
data model (it can, see below).
The only issue with this new "PG mode option" for Turtle* is that it works
only for annotations that have the annotated triple in the subject position.
Take, for instance, Holger's example:
> ... instead of
>
> :bob :age 23 .
> <<:bob :age 23>> :certainty 0.9 .
>
> we can simply (alternatively) write
>
> :bob :age 23 {| :certainty 0.9 |} .
That works. However, the following Turtle* expression (assuming SA mode)
cannot be written by using the proposed alternative syntax option.
:bob :age 23 .
:alice :disbelieves <<:bob :age 23>> .
Perhaps this limitation is not an issue. The notion of edge properties in
Property Graphs has the same limitation after all. What do you think?
Ignoring this potential issue, I thought a bit about my question from above:
Can such an explicit distinction between SA mode and PG mode be built into the
RDF* data model itself?
Here is an idea: Take the notion of an RDF* graph as defined in my papers
(i.e., a set of nested triples), but now interpreted under the SA mode
assumption (i.e., a triple that appears inside another triple is not
considered to be asserted, unless it is also contained directly in the RDF*
graph). Now, to integrate PG mode explicitly, we may define the notion of an
"annotated RDF* graph" as a pair (G,a) where G is an RDF* graph and a is a
partial function that maps some (potentially nested) triples in G to finite,
nonempty sets of pairs (p,o) with p an IRI and o an RDF term (IRI, blank node,
or literal). For instance, the aforementioned alternative Turtle* expression
that Holger proposes can be captured by an annotated RDF* graph H=(G,a) such
that
G = { (:bob,:age,23) }
dom(a) = { (:bob,:age,23) }
a( (:bob,:age,23) ) = { (:certainty,0.9) } .
Clearly, this is only meant to be an abstract syntax for formalizing the
(extended) RDF* data model.
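To make the definition concrete, here is a minimal Python sketch of the example H=(G,a). The encoding is an assumption made purely for illustration and is not part of the proposal: IRIs are strings, literals are Python numbers, a (possibly nested) triple is a 3-tuple, and the partial function a is a dict whose keys form dom(a). The helper name `is_annotated_graph` is hypothetical, and it reads "triples in G" as meaning that dom(a) is a subset of G.

```python
# Assumed encoding (not part of the proposal): IRIs are strings, literals are
# Python numbers, and a triple is a 3-tuple that may contain other 3-tuples.
bob_age = (":bob", ":age", 23)

G = {bob_age}                          # the RDF* graph
a = {bob_age: {(":certainty", 0.9)}}   # partial function; its keys are dom(a)


def is_annotated_graph(G, a):
    """Check that every triple in dom(a) is in G and maps to a nonempty set of (p, o) pairs."""
    for triple, pairs in a.items():
        if triple not in G:
            return False
        if not pairs:
            return False
        # p must be an IRI, which this sketch represents as a string
        if not all(isinstance(p, str) for p, _ in pairs):
            return False
    return True
```

With this encoding, `is_annotated_graph(G, a)` holds for the example, while an annotation on a triple that is missing from G is rejected.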
A similar extension can then be defined for the abstract syntax of SPARQL.
Here, the notion of a BGP* can be extended into an "annotated BGP*" that is a
pair (B,a) where B is a BGP* and a is a partial function that maps some triple*
patterns in B to finite, nonempty sets of pairs (p,o) with p an IRI or a
variable and o an RDF term or a variable. As an example, the annotated BGP*
(B,a) with
B = { (?x,:age,23) }
dom(a) = { (?x,:age,23) }
a( (?x,:age,23) ) = { (:certainty,?y) }
captures the following WHERE clause of the user-facing syntax of SPARQL* with
Holger's proposed alternative writing for PG mode:
WHERE {
?x :age 23 {| :certainty ?y |} .
}
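As a rough illustration of how such an annotated BGP* might be matched against an annotated RDF* graph, here is a small Python sketch. The evaluation semantics is an assumption, since none is defined above: a triple* pattern matches a triple in G, and each annotation pattern attached to it must additionally match some annotation of that triple. The names `is_var`, `unify`, and `evaluate` are hypothetical, and variables are encoded as strings starting with "?".

```python
def is_var(term):
    """Variables are encoded as strings starting with '?' (an assumption)."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, value, binding):
    """Extend binding so that pattern matches value; return None on failure."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == value else None
        return {**binding, pattern: value}
    if isinstance(pattern, tuple):
        if not isinstance(value, tuple) or len(pattern) != len(value):
            return None
        for sub_pattern, sub_value in zip(pattern, value):
            binding = unify(sub_pattern, sub_value, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == value else None


def evaluate(B, b_ann, G, g_ann):
    """Return all bindings that match the annotated BGP* (B, b_ann) in (G, g_ann)."""
    solutions = [{}]
    for tp in B:
        next_solutions = []
        for mu in solutions:
            for t in G:
                mu1 = unify(tp, t, mu)
                if mu1 is None:
                    continue
                partial = [mu1]
                # each annotation pattern on tp must match some annotation of t
                for pair_pattern in b_ann.get(tp, ()):
                    partial = [m2
                               for m in partial
                               for pair in g_ann.get(t, ())
                               for m2 in [unify(pair_pattern, pair, m)]
                               if m2 is not None]
                next_solutions.extend(partial)
        solutions = next_solutions
    return solutions


G = {(":bob", ":age", 23), (":alice", ":age", 23)}
g_ann = {(":bob", ":age", 23): {(":certainty", 0.9)}}
B = [("?x", ":age", 23)]
b_ann = {("?x", ":age", 23): {(":certainty", "?y")}}
solutions = evaluate(B, b_ann, G, g_ann)
```

Under these assumptions, only the annotated triple about :bob yields a solution; the unannotated triple about :alice matches the triple* pattern but not the annotation pattern.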
Four more observations about my idea to extend the abstract syntax of the RDF*
data model as described above:
1/ Observe that the function a maps to sets of pairs (p,o) rather than to
single pairs. This is necessary to be able to annotate an RDF* triple with
multiple key-value pairs, as is possible with Holger's proposed extension of
Turtle*:
:bob :age 23 {| :certainty 0.9 |} .
:bob :age 23 {| :source <http://bob.name/index.html> |} .
Now, we may even consider an option to shorten this extended Turtle*
expression as follows:
:bob :age 23 {| :certainty 0.9 ;
:source <http://bob.name/index.html> |} .
2/ It is not difficult to define mappings that map an annotated RDF* graph
into an RDF* graph and vice versa (and, similarly, for annotated BGP*s). For
instance, the annotated RDF* graph H in the example above can be mapped to the
following RDF* graph:
G' = { (:bob,:age,23), ((:bob,:age,23),:certainty,0.9) } .
Such mappings provide a formal foundation for SA mode systems to support
Holger's proposed PG-focused extension of Turtle* and SPARQL*.
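One way such mappings might look is sketched below in Python. The encoding is an assumption for illustration only (IRIs as strings, literals as numbers, triples as possibly nested 3-tuples, and a as a dict), and the function names are hypothetical. The forward direction follows the example for H; the inverse shown is just one possible choice, which keeps annotations on annotations as nested triples so that every annotated triple remains in G.

```python
def to_rdf_star(G, a):
    """Map (G, a) to an RDF* graph: each annotation (p, o) on t becomes (t, p, o)."""
    result = set(G)
    for t, pairs in a.items():
        for p, o in pairs:
            result.add((t, p, o))
    return result


def annotates_asserted(triple, graph):
    """True if the subject of triple is itself a triple contained in graph."""
    subject = triple[0]
    return isinstance(subject, tuple) and subject in graph


def to_annotated(G_prime):
    """One possible inverse: turn (t, p, o) with asserted t into an annotation on t."""
    G, a = set(), {}
    for triple in G_prime:
        s, p, o = triple
        if (annotates_asserted(triple, G_prime)
                and not annotates_asserted(s, G_prime)  # keep the target t in G
                and not isinstance(o, tuple)):          # o must be an RDF term
            a.setdefault(s, set()).add((p, o))
        else:
            G.add(triple)
    return G, a


bob_age = (":bob", ":age", 23)
G = {bob_age}
a = {bob_age: {(":certainty", 0.9)}}
G_prime = to_rdf_star(G, a)
```

For the example H, `to_rdf_star` produces the two triples of G' given above, and `to_annotated(G_prime)` recovers (G, a).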
3/ Related to these mappings, it is now also possible to introduce a notion of
a "redundancy-free annotated RDF* graph" (G,a) which satisfies the constraint
that none of the annotations in the function a is also captured as a nested
triple in G. For instance, the aforementioned annotated RDF* graph H is
redundancy free, whereas H'=(G',a') with
G' = { (:bob,:age,23), ((:bob,:age,23),:certainty,0.9) }
dom(a') = { (:bob,:age,23) }
a'( (:bob,:age,23) ) = { (:certainty,0.9) }
is not.
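The redundancy-free condition can be checked mechanically. The following Python sketch assumes an illustrative encoding that is not part of the proposal (triples as possibly nested 3-tuples, a as a dict from triples to sets of (p, o) pairs), and it interprets "captured as a nested triple in G" as the triple (t, p, o) being contained directly in G. The function name is hypothetical.

```python
def is_redundancy_free(G, a):
    """True if no annotation (p, o) on a triple t also occurs in G as (t, p, o)."""
    return not any((t, p, o) in G
                   for t, pairs in a.items()
                   for p, o in pairs)


bob_age = (":bob", ":age", 23)

# H from the earlier example
H = ({bob_age}, {bob_age: {(":certainty", 0.9)}})

# H' repeats the annotation as a nested triple in G'
H_prime = ({bob_age, (bob_age, ":certainty", 0.9)},
           {bob_age: {(":certainty", 0.9)}})
```

Here `is_redundancy_free(*H)` holds, whereas `is_redundancy_free(*H_prime)` does not.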
4/ Continuing on this line of thought, another possible restriction regarding
the notion of annotated RDF* graphs can be the following: Given an annotated
RDF* graph (G,a), if G does not contain any nested RDF* triples (i.e., G
happens to be a pure RDF graph), then we call (G,a) an "annotated RDF graph."
Now, this restriction may be the formal foundation for systems that support
only the PG mode!
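A check for this restriction could look like the following Python sketch. It again assumes an illustrative encoding that is not part of the proposal (triples as 3-tuples, with a nested triple showing up as a tuple in subject or object position); the function name is hypothetical.

```python
def is_annotated_rdf_graph(G, a):
    """True if G contains no nested triples, i.e. G is a plain RDF graph."""
    return all(not isinstance(s, tuple) and not isinstance(o, tuple)
               for s, _, o in G)


bob_age = (":bob", ":age", 23)
a = {bob_age: {(":certainty", 0.9)}}

plain = {bob_age}                                   # pure RDF graph
nested = {bob_age, (":alice", ":disbelieves", bob_age)}  # contains a nested triple
```

With these inputs, `(plain, a)` qualifies as an annotated RDF graph, while `(nested, a)` does not, because the :disbelieves triple has a triple in its object position.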
I am looking forward to hearing your thoughts about these ideas to integrate
the explicit distinction between SA mode and PG mode into the abstract RDF*
data model.
Thanks,
Olaf
On Wednesday, 2 September 2020 at 11:07:56 CEST, Jeen Broekstra wrote:
> > Yes, I think so and apologies if I didn't communicate this clearly.
> >
> > The point here is to *add* an alternative short cut so that instead of
> >
> > :bob :age 23 .
> >
> > <<:bob :age 23>> :certainty 0.9 .
> >
> > we can simply (alternatively) write
> >
> > :bob :age 23 {| :certainty 0.9 |} .
> >
> > This would serve as syntactic sugar for the (common) use case of both
> > asserting and annotating a triple, while still allowing free-standing
> > annotations. The short cut will not only make files significantly shorter,
> > but also make editing more user-friendly. The cost is for implementers
> > though, who would have to cover an additional case (both in parser and
> > serializer).
>
> Thanks for clarifying - in that case I think it's actually a very good
> idea. The main issue I see with supporting it is on the serialization side,
> which will be tricky to do for any streaming writer. However, we could
> support that kind of thing under the moniker of "pretty printing", which is
> already something that requires buffering anyway.
>
> Jeen
Received on Wednesday, 2 September 2020 08:04:56 UTC