Re: Proposal by Kurt from Franconi Enrico on 2024-07-25 (public-rdf-star-wg@w3.org from July 2024)

From: Franconi Enrico <franconi@inf.unibz.it>
Date: Thu, 25 Jul 2024 09:10:38 +0000
To: Kurt Cagle <kurt.cagle@gmail.com>
CC: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <1897DD0C-E39E-4680-B72F-0189D2B0A65D@inf.unibz.it>

On 23 Jul 2024, at 18:30, Kurt Cagle <kurt.cagle@gmail.com> wrote:

Franconi,

Cagle,

The notation that I have introduced is syntactical, rather than semantic. Indeed, let's say that the changed notation is called Terrapin, and there is a Terrapin preprocessor applied to Turtle that translates:

:liz [:married1 => tpn:pointerTo :married ;
                    :hasInterval [ _:interval1 => :start 1964 ; :end 1974]]
     :richard .

into

:liz :married1 :richard .
:married1 tpn:property :married .
:married1 :hasInterval _:interval1 .
_:interval1 :start 1964 .
_:interval1 :end 1974 .

where tpn: is a namespace analog for rdf: just to make sure we're not misusing rdf here.
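
As a rough illustration of the expansion step, here is a minimal Python sketch that works on an already-parsed structure rather than on Turtle text. The function names (expand_statement, expand_named_node) and the tuple representation of a named node are assumptions made for this example; they are not taken from any Terrapin implementation, and the sketch uses tpn:property as it appears in the expanded triples.

```python
# Hypothetical sketch: terms are plain strings; a named node is
# (name, [(predicate, object), ...]), where an object may itself be a named node.

def expand_named_node(name, pairs):
    """Return the triples describing `name`, recursing into nested named nodes."""
    triples = []
    for predicate, obj in pairs:
        if isinstance(obj, tuple):  # nested "[ name => ... ]" expression
            nested_name, nested_pairs = obj
            triples.append((name, predicate, nested_name))
            triples.extend(expand_named_node(nested_name, nested_pairs))
        else:
            triples.append((name, predicate, obj))
    return triples


def expand_statement(subject, named_predicate, obj):
    """Expand "subject [ name => pairs ] obj" into a flat list of triples."""
    name, pairs = named_predicate
    return [(subject, name, obj)] + expand_named_node(name, pairs)


expanded = expand_statement(
    ":liz",
    (":married1", [
        ("tpn:property", ":married"),
        (":hasInterval", ("_:interval1", [(":start", 1964), (":end", 1974)])),
    ]),
    ":richard",
)
for s, p, o in expanded:
    print(s, p, o, ".")
```

The point of the sketch is only that the step is mechanical: the named predicate becomes the predicate of the outer triple, and everything inside the brackets becomes statements about that name.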

:married1 in this context is a singleton property.

What this means is that it is unique (it is an IRI). That means that if I say:

:liz :married1 :richard .
:married1 tpn:property :married .

and

:erin :married1 :richard .
:married1 tpn:property :married .

I have constructed a logically inconsistent statement in RDF. It can be caught by an OWL or SHACL rule stating that, whenever such a tpn:pointer is present, the processor should raise at a minimum a warning that a singleton property is being used with two different subject/object pairs.

No!
If :married1 is a singleton property, you don’t get an inconsistent graph, but you get an equality between the denotation of :erin and the denotation of :liz, namely you entail :erin owl:sameAs :liz.
This is easy to see if you understand the semantics of IRIs and the semantics of singleton properties (check the original paper [1], where the semantics is explained in great detail).
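
A small Python sketch of this entailment, under the assumption that the extension of a singleton property contains exactly one <S,O> pair, so any two pairs asserted for it must denote the same pair. The function name entailed_same_as and the use of tpn:property as the marker for singleton properties are illustrative choices, not part of any specification.

```python
from itertools import combinations


def entailed_same_as(triples, singleton_marker="tpn:property"):
    """Derive owl:sameAs triples forced by singleton properties (illustrative only)."""
    # Every term that carries the marker is treated as a singleton property.
    singletons = {s for s, p, o in triples if p == singleton_marker}

    # Collect the <S,O> pairs asserted for each singleton property.
    extension = {}
    for s, p, o in triples:
        if p in singletons:
            extension.setdefault(p, []).append((s, o))

    entailed = set()
    for pairs in extension.values():
        for (s1, o1), (s2, o2) in combinations(pairs, 2):
            # Only one pair may exist, so the components must be equal.
            if s1 != s2:
                a, b = sorted((s1, s2))
                entailed.add((a, "owl:sameAs", b))
            if o1 != o2:
                a, b = sorted((o1, o2))
                entailed.add((a, "owl:sameAs", b))
    return entailed


graph = [
    (":liz", ":married1", ":richard"),
    (":erin", ":married1", ":richard"),
    (":married1", "tpn:property", ":married"),
]
entailed = entailed_same_as(graph)
print(entailed)
```

Nothing in the sketch raises an error: the graph stays consistent and simply gains an equality between :erin and :liz.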

You cannot have your proposal by just treating it as syntactic preprocessing in Terrapin. It would violate the basic understanding of the *semantic* web stack principles.

cheers
—e.

[1] https://dl.acm.org/doi/10.1145/2566486.2567973



Put another way, the => notation doesn't resolve the underlying conflict in the case of a singleton property, but it is not intended to. A singleton property by definition can only apply to one subject/object pair.

Now, in your example,

A1 owl:sameAs A2 .
<—>
A1 [ B  => C1 D1 ] E .
A2 [ B  => C2 D2 ] E .

expands to:

A1 owl:sameAs A2 . #Line 1
A1 B E . # Line 2
A2 B E .  # Line 3
B C1 D1 ; C2 D2 .

If you go back to the previous assertion that I made (a singleton property can have only one distinct <S,O> pair), then when you parse the above Turtle, Lines 2 and 3 together should generate a compilation error.

The owl:sameAs assertion in this case is not the same as actually making the two URIs the same except when you have owl:sameAs exposed as a generated inference (which, in effect, generates the permutations of all possible triples involving A1 and A2). Most systems don't use owl:sameAs for precisely that reason. However, assuming that they did, then the statements above should realistically collapse down to

A1 B E .
B C1 D1; C2 D2 .
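
The collapse described here can be sketched in Python as follows, assuming owl:sameAs is applied by merging equal terms into one representative and then letting duplicate triples fall together. The function name collapse_same_as and the choice of the lexicographically smaller name as the representative are assumptions for illustration.

```python
def collapse_same_as(triples):
    """Rewrite triples so that terms linked by owl:sameAs share one name."""
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for s, p, o in triples:
        if p == "owl:sameAs":
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Assumed convention: keep the lexicographically smaller name.
                keep, drop = sorted((root_s, root_o))
                parent[drop] = keep

    # A set is used because identical triples are one and the same triple.
    return {(find(s), find(p), find(o)) for s, p, o in triples if p != "owl:sameAs"}


graph = [
    ("A1", "owl:sameAs", "A2"),
    ("A1", "B", "E"),
    ("A2", "B", "E"),
    ("B", "C1", "D1"),
    ("B", "C2", "D2"),
]
collapsed = collapse_same_as(graph)
for triple in sorted(collapsed):
    print(*triple, ".")
```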

Again, keep in mind that B here is a singleton property, so B should only apply to A1 and E. Put another way, if I assert

A1 B E .

and B is a singleton, I cannot then assert

A2 B E  .

I can, however, assert:

A1 B1 E .
A2 B2 E .
B1 owl:sameAs B2 .

How do I know THAT B1 and B2 are singletons? Because I also have to add the assertions:
 B1 tpn:property B .
 B2 tpn:property B .

where B is a vanilla predicate.

Think of a singleton property as actually being a pointer to a property. I can name that property (which is what the => notation does), but the naming is orthogonal to the fact that the pointer itself is both unique and can only have one unique <S,O> per pointer. RDF explicitly states that if two triples have the same <S,P,O> they are the same triple, period. This is why you MUST have singleton properties. All that the named node expressions do in that regard is to make it possible to name these singleton nodes in a more readable manner.
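
To make the set argument concrete, here is a minimal Python sketch that models a graph as a set of (S, P, O) tuples. The helper name edges_between is hypothetical; the data reuses the A1/B/E names from the example above.

```python
def edges_between(triples, subject, obj):
    """Return the distinct predicates linking subject to obj in a set-based graph."""
    graph = set(triples)  # an RDF graph is a set, so repeated triples collapse
    return {p for s, p, o in graph if s == subject and o == obj}


# Asserting the same <S,P,O> twice yields a single edge.
plain = [("A1", "B", "E"), ("A1", "B", "E")]

# Two distinct singleton properties, both pointing at B, keep two edges apart.
named = [
    ("A1", "B1", "E"),
    ("A1", "B2", "E"),
    ("B1", "tpn:property", "B"),
    ("B2", "tpn:property", "B"),
]

print(edges_between(plain, "A1", "E"))
print(edges_between(named, "A1", "E"))
```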

By the way, this is about the ONLY way that I can see of dealing with temporal RDF.

Consider the following:

Country:USA Country:hasPresident Person:JoeBiden .
Country:USA Country:hasPresident Person:DonaldTrump .
Country:USA Country:hasPresident Person:KamalaHarris .

All of these assertions are true, but only in the right context. If you want to determine this context, you either have to create multiple third-normal form assertions, or you have to resort to singleton properties:

Country:USA [:hp1 => tpn:property Country:hasPresident ; :start 2021; :end 2025] Person:JoeBiden .
Country:USA [:hp2 => tpn:property Country:hasPresident ; :start 2017; :end 2021] Person:DonaldTrump .
Country:USA [:hp3 => tpn:property Country:hasPresident ; :start 2025; :end 2029] Person:KamalaHarris .

In SPARQL this is trivial to resolve:

select ?president ?start ?end where {
    Country:USA ?singleton ?president .
    ?singleton tpn:property Country:hasPresident .
    ?singleton :start ?start .
    ?singleton :end ?end .
    # :start and :end are plain integer years, so compare them with the current year
    filter(year(now()) >= ?start && year(now()) < ?end)
}
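
The same lookup can be sketched in plain Python over the expanded triples, which may make the join easier to follow. The function name holders_at is hypothetical, the year is passed in explicitly instead of calling now() so the result is reproducible, and the data simply repeats the three example statements above.

```python
def holders_at(triples, subject, base_property, year):
    """Return (object, start, end) for singleton edges of base_property active in `year`."""
    # Singleton properties that point at the requested base property.
    singletons = {s for s, p, o in triples if p == "tpn:property" and o == base_property}
    start = {s: o for s, p, o in triples if p == ":start"}
    end = {s: o for s, p, o in triples if p == ":end"}
    return [
        (o, start[p], end[p])
        for s, p, o in triples
        if s == subject and p in singletons and p in start and p in end
        and start[p] <= year < end[p]
    ]


graph = [
    ("Country:USA", ":hp1", "Person:JoeBiden"),
    (":hp1", "tpn:property", "Country:hasPresident"),
    (":hp1", ":start", 2021), (":hp1", ":end", 2025),
    ("Country:USA", ":hp2", "Person:DonaldTrump"),
    (":hp2", "tpn:property", "Country:hasPresident"),
    (":hp2", ":start", 2017), (":hp2", ":end", 2021),
    ("Country:USA", ":hp3", "Person:KamalaHarris"),
    (":hp3", "tpn:property", "Country:hasPresident"),
    (":hp3", ":start", 2025), (":hp3", ":end", 2029),
]
print(holders_at(graph, "Country:USA", "Country:hasPresident", 2022))
```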

Kurt Cagle
Editor in Chief
The Cagle Report
kurt.cagle@gmail.com
443-837-8725


On Tue, Jul 23, 2024 at 2:27 AM Franconi Enrico <franconi@inf.unibz.it> wrote:
Hi Kurt,
I’m still waiting for a reply to my comments from two months ago about your proposal (Word document attached), which you presented again at our last meeting.
I have seen that you have now posted your proposal on LinkedIn <https://www.linkedin.com/feed/update/urn:li:activity:7219457379564670976/>.
Let me rephrase the comments again, hoping you will react to them.

Named Node in the Predicate Position

Your example:

:liz [_:married1 => rdf:subPropertyOf :married ;
                    :hasInterval [ _:interval1 => :start 1964 ; :end 1974]]
     :richard .

:married1 must be a *singleton* property.
This option was discussed and dismissed some time ago in the RDF-star WG.
This introduces owl:sameAs, leading to serious implementation problems.
Indeed, the following equivalence pattern holds:

A1 owl:sameAs A2 .
<—>
A1 [ B  => C1 D1 ] E .
A2 [ B  => C2 D2 ] E .

Moreover, the singleton property does not directly express the multi-edge case, since you have to name each edge of the same type with a distinct name.

With the current RDF-star baseline, the example can be written in Turtle:

<< _:marriage1 | :liz :married :richard >>
   a :marriage ;
   :hasInterval [:start 1964 ; :end 1974] .

This corresponds to the following in N-Triples:

_:marriage1 rdf:reifies <<( :liz :married :richard )>> .
_:marriage1 rdf:type :marriage .
_:marriage1 :hasInterval _:interval1 .
_:interval1 :start 1964 .
_:interval1 :end 1974 .

Reifier Expression

This is just a rephrase of "option 1" (old style 1.1 reification), which we discussed and dismissed some time ago in the RDF-star WG.
It has severe drawbacks, e.g., in reconstructing the reifier back from the three reification triples.
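
The message does not spell out which drawbacks are meant. One possible reading, sketched below in Python, is that the three reification triples are independent statements, so a reifier can only be turned back into a triple when exactly one subject, one predicate and one object are found for it. The function name reconstruct_reified and the sample data are assumptions for illustration only.

```python
COMPONENTS = ("rdf:subject", "rdf:predicate", "rdf:object")


def reconstruct_reified(triples):
    """Group reification triples by reifier and rebuild the triple each one describes."""
    parts = {}
    for s, p, o in triples:
        if p in COMPONENTS:
            parts.setdefault(s, {}).setdefault(p, set()).add(o)

    rebuilt, unresolved = {}, set()
    for reifier, found in parts.items():
        values = [found.get(component, set()) for component in COMPONENTS]
        if all(len(v) == 1 for v in values):
            rebuilt[reifier] = tuple(next(iter(v)) for v in values)
        else:
            # A component is missing or has several values: no unique triple.
            unresolved.add(reifier)
    return rebuilt, unresolved


graph = [
    (":r", "rdf:subject", ":s"), (":r", "rdf:predicate", ":p"), (":r", "rdf:object", ":o"),
    (":q", "rdf:subject", ":s"), (":q", "rdf:predicate", ":p"),      # no rdf:object
    (":m", "rdf:subject", ":s1"), (":m", "rdf:subject", ":s2"),      # two subjects
    (":m", "rdf:predicate", ":p"), (":m", "rdf:object", ":o"),
]
rebuilt, unresolved = reconstruct_reified(graph)
print(rebuilt, unresolved)
```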

cheers
—e.

On 24 May 2024, at 15:55, Franconi Enrico <franconi@inf.unibz.it> wrote:

Hi Kurt,
It seems to me that your proposal is a rephrase of various discussions we already had, and ruled out.

Named Node in the Predicate Position: this seems to be just a rephrase of the singleton property, once you try to give semantics to it. Observe that, with respect to the current status of the discussion, your proposal does not directly express the multi-edge case, since you have to name each edge of the same type with a distinct name.

Reifier Expression: this is just a rephrase of "option 1" we discussed and ruled out some time ago. It has severe drawbacks in reconstructing the reifier back from the three reification triples.

cheers
—e.

On 23 May 2024, at 19:24, Kurt Cagle <kurt.cagle@gmail.com> wrote:

I've attached a document that covers YET ANOTHER proposal (more properly a recommendation I've made before).

There are two issues that we seem to be rehashing here. The first is the question of reificational notation, while the second has to do with LPG harmonization. My contention is that these are different issues, though we can use similar notation for both.

Reification

A named reification is simply a set of statements:

:r rdf:subject :s; rdf:predicate :p; rdf:object :o .

This is not a triple. It is three statements about the state that a triple can be in. It does not introduce a triple into the system, and it makes no assertions about the truthiness or even, by itself, the existence of that triple. It is simply a statement about the components that a triple might have. You cannot reason with it directly, though you can use other processes (SPARQL, SHACL, etc.) to construct or verify the existence of triples for which these assertions are true. Properly speaking, the above itself should probably be qualified:

:r rdf:subject :s; rdf:predicate :p; rdf:object :o ; a rdf:Reification .

The notation << :r | :s :p :o >> makes the above statement more compact, but the reification can apply to any triples within a system, or none at all, regardless of the values.

Named Node Expressions

I propose, in the attached, that we use a similar nomenclature for what I'm terming named node expressions, to wit:

[ ?nn | :p1 :o1 ; :p2 :o2 ]

where ?nn is replaced by a formal (not blank) IRI.

This is a Turtle (not RDF) syntactical amendment. The above takes what would ordinarily be a blank node and replaces it with a named node.

For instance:

:liz :hasMarriage [ :marriage1 | :to :Richard ; :start "1965" ; :end "1975" ] .

which expands to:

:liz :hasMarriage :marriage1 .
:marriage1 :to :Richard .
:marriage1 :start "1965" .
:marriage1 :end "1975" .

Why is this important? Because the blank node is a pointer to a data structure, but use of the [] notation makes it impossible to reference that data structure from within Turtle. By adding in a named node as the referencing node, you gain that ability, and it is a key ability for modeling.

For instance, I can use the expression:

:liz :hasMarriage [ :marriage1 | :start "1965" ; :end "1975"; :to :richard ], [ :marriage2 | :start "1975" ; :end "1985"; :to :john ] .

This is semantically equivalent to the JSON:

{"liz":{"hasMarriage":[{"marriage1":{"start":"1965", "end":"1975","to":"richard"}},{"marriage2":{"start":"1975", "end":"1985","to":"john"}}]}}
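
One way to see the correspondence is to build the nested structure from the expanded triples. The Python sketch below does that for the two-marriage expression; the function names (to_nested, local) and the rule of dropping the leading colon from prefixed names are assumptions made for the illustration.

```python
import json


def local(term):
    """Drop the leading ':' of a default-prefix name; leave literals untouched."""
    return term[1:] if term.startswith(":") else term


def to_nested(triples, root, link):
    """Nest every named node reachable from root via link under its own name."""
    entries = []
    for s, p, node in triples:
        if s == root and p == link:
            props = {local(p2): local(o2) for s2, p2, o2 in triples if s2 == node}
            entries.append({local(node): props})
    return {local(root): {local(link): entries}}


graph = [
    (":liz", ":hasMarriage", ":marriage1"),
    (":marriage1", ":start", "1965"),
    (":marriage1", ":end", "1975"),
    (":marriage1", ":to", ":richard"),
    (":liz", ":hasMarriage", ":marriage2"),
    (":marriage2", ":start", "1975"),
    (":marriage2", ":end", "1985"),
    (":marriage2", ":to", ":john"),
]
nested = to_nested(graph, ":liz", ":hasMarriage")
print(json.dumps(nested))
```

Because each nested object is keyed by the node's name, the name survives the round trip, which is exactly what an anonymous [] block cannot offer.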

The same thing can be done with both predicate-positioned named node expressions and subject-oriented ones.

This addresses the LPG equivalency relationship, and does so without ever touching reifications.

Note that this also highlights an important point. Blank nodes are useful because they are unique and system-assigned. However, they are not referenceable. The Turtle notation:

:liz :hasMarriage _:b1, _:b2 .
_:b1 :start "1965" ; :end "1975"; :to :richard .
_:b2 :start "1975" ; :end "1985"; :to :john .

is simply a preprocessor directive to replace the "named" nodes with anonymous IRIs in the final indexing.  You still have to make _:b1 and _:b2 unique, or the data structures disintegrate.
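
A minimal Python sketch of that replacement step, assuming the parser keeps one label map per document and draws fresh identifiers from a store-wide counter. The class name BlankNodeAllocator and the _:genid naming scheme are hypothetical.

```python
import itertools


class BlankNodeAllocator:
    """Replace document-scoped blank node labels with store-wide unique ones."""

    def __init__(self):
        self._counter = itertools.count(1)

    def load(self, triples):
        mapping = {}  # label -> fresh identifier, valid for this document only

        def rename(term):
            if isinstance(term, str) and term.startswith("_:"):
                if term not in mapping:
                    mapping[term] = f"_:genid{next(self._counter)}"
                return mapping[term]
            return term

        return [(rename(s), p, rename(o)) for s, p, o in triples]


store = BlankNodeAllocator()
doc1 = store.load([(":liz", ":hasMarriage", "_:b1"), ("_:b1", ":to", ":richard")])
doc2 = store.load([(":liz", ":hasMarriage", "_:b1"), ("_:b1", ":to", ":john")])
print(doc1)
print(doc2)
```

Within one document the label _:b1 always maps to the same node, while the same label in a second document gets a different node, so the two data structures do not merge by accident.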

Anyway, I ask the chair for time during our next meeting to discuss this proposal.

Kurt Cagle
Editor in Chief
The Cagle Report
kurt.cagle@gmail.com
443-837-8725
Received on Thursday, 25 July 2024 09:10:46 UTC