Re: Proposal by Kurt from Kurt Cagle on 2024-07-23 (public-rdf-star-wg@w3.org from July 2024)

From: Kurt Cagle <kurt.cagle@gmail.com>
Date: Tue, 23 Jul 2024 09:30:14 -0700
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <CALm0LSEiQAzD3qA-PCa38ObVfJyqzArFoN+QPyqucAPYUe7ybA@mail.gmail.com>
Franconi,

Thank you for taking the time to read the proposal.

The notation that I have introduced is syntactical, rather than semantic.
Indeed, let's say that the changed notation is called Terrapin, and there
is a Terrapin preprocessor applied to Turtle that translates:

:liz [:married1 => tpn:property :married ;
                    :hasInterval [ _:interval1 => :start 1964 ; :end 1974]]
     :richard .

into

:liz :married1 :richard .
:married1 tpn:property :married .
:married1 :hasInterval _:interval1 .
_:interval1 :start 1964 .
_:interval1 :end 1974 .

where tpn: is a namespace analog for rdf: just to make sure we're not
misusing rdf here.
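
For anyone who wants to see the expansion mechanically, here is a minimal
Python sketch of that preprocessor step. It is only an illustration: the
function name expand_named_predicate and the nested-tuple input format are
assumptions made for this example, and it works on an already-parsed
structure rather than on real Turtle text.

```python
# Hypothetical sketch: expand one named-predicate expression into plain
# triples. Terms are plain strings; nothing here parses real Turtle.

def expand_named_predicate(subject, name, annotations, obj):
    """Return triples for: subject [name => annotations] obj .

    annotations is a list of (predicate, value) pairs; a value may itself
    be a (name, annotations) pair, standing for a nested [name => ...] node.
    """
    triples = [(subject, name, obj)]
    pending = [(name, annotations)]
    while pending:
        node, pairs = pending.pop(0)
        for predicate, value in pairs:
            if isinstance(value, tuple):
                # Nested named node: link to it, then expand it later.
                nested_name, nested_pairs = value
                triples.append((node, predicate, nested_name))
                pending.append((nested_name, nested_pairs))
            else:
                triples.append((node, predicate, value))
    return triples


liz = expand_named_predicate(
    ":liz",
    ":married1",
    [
        ("tpn:property", ":married"),
        (":hasInterval", ("_:interval1", [(":start", 1964), (":end", 1974)])),
    ],
    ":richard",
)
for triple in liz:
    print(*triple, ".")
```

Running it prints five triples, one per line, starting with
":liz :married1 :richard ." and ending with "_:interval1 :end 1974 .".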

:married1 in this context is a singleton property.

What this means is that it is unique (it is an IRI). So if I say:

:liz :married1 :richard .
:married1 tpn:property :married .

and

:erin :married1 :richard .
:married1 tpn:property :married .

I have constructed a logically inconsistent statement in RDF. It can be
caught with an OWL or SHACL constraint stating that any property carrying
such a tpn:property pointer should generate, at a minimum, a warning when
it is used with two different subject/object pairs.

Put another way, the => notation doesn't resolve the underlying conflict in
the case of a singleton property, but it is not intended to. A singleton
property by definition can only apply to one subject/object pair.
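
The check itself is small. As a rough sketch (plain Python over (s, p, o)
tuples, with the helper name singleton_violations and the string
"tpn:property" standing in for whatever the real IRI would be; both are
assumptions for illustration), a validator might flag the case above like
this:

```python
from collections import defaultdict

TPN_PROPERTY = "tpn:property"  # placeholder for the real tpn: IRI


def singleton_violations(triples):
    """Map each singleton property to its <S,O> pairs when it has more than one."""
    # A property counts as a singleton if it is the subject of a tpn:property triple.
    singletons = {s for s, p, o in triples if p == TPN_PROPERTY}
    pairs = defaultdict(set)
    for s, p, o in triples:
        if p in singletons:
            pairs[p].add((s, o))
    return {p: sorted(so) for p, so in pairs.items() if len(so) > 1}


graph = [
    (":liz", ":married1", ":richard"),
    (":erin", ":married1", ":richard"),
    (":married1", TPN_PROPERTY, ":married"),
]
violations = singleton_violations(graph)
print(violations)
```

Here :married1 is reported with both the (:erin, :richard) and the
(:liz, :richard) pair; a graph that uses :married1 only once yields an
empty result.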

Now, in your example,

A1 owl:sameAs A2 .
<—>
A1 [ B  => C1 D1 ] E .
A2 [ B  => C2 D2 ] E .

expands to:

A1 owl:sameAs A2 . # Line 1
A1 B E . # Line 2
A2 B E . # Line 3
B C1 D1 ; C2 D2 .

If you go back to the assertion I made earlier, that a singleton property
can have only one distinct <S,O>, then when you parse the above Turtle,
Lines 2 and 3 together should generate a compilation error.

The owl:sameAs assertion in this case does not actually make the two URIs
the same unless owl:sameAs is materialized as an inference (which, in
effect, generates the permutations of all possible triples involving A1 and
A2). Most systems don't use owl:sameAs for precisely that reason. However,
assuming that they did, the statements above should realistically collapse
down to:

A1 B E .
B C1 D1; C2 D2 .
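
A rough sketch of that collapse, assuming a naive "smushing" of owl:sameAs
aliases rather than real OWL reasoning. The function collapse_same_as and
the choice of the lexically smaller name as the representative are my own
illustration, not something any engine is required to do:

```python
def collapse_same_as(triples, same_as="owl:sameAs"):
    """Rewrite every term to one representative of its owl:sameAs class."""
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for s, p, o in triples:
        if p == same_as:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Keep the lexically smaller name as the representative.
                keep, drop = sorted((root_s, root_o))
                parent[drop] = keep

    collapsed = []
    for s, p, o in triples:
        if p == same_as:
            continue  # the alias statements themselves are dropped
        triple = (find(s), find(p), find(o))
        if triple not in collapsed:
            collapsed.append(triple)
    return collapsed


expanded = [
    ("A1", "owl:sameAs", "A2"),
    ("A1", "B", "E"),
    ("A2", "B", "E"),
    ("B", "C1", "D1"),
    ("B", "C2", "D2"),
]
collapsed = collapse_same_as(expanded)
for triple in collapsed:
    print(*triple, ".")
```

After the rewrite, Lines 2 and 3 become the same triple, so only one
"A1 B E ." survives alongside the two statements about B.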

Again, keep in mind that B here is a singleton property, so B should only
apply to A1 and E. Put another way, if I assert

A1 B E .

and B is a singleton, I cannot then assert

A2 B E .

I can, however, assert:

A1 B1 E .
A2 B2 E .
B1 owl:sameAs B2 .

How do I know THAT B1 and B2 are singletons? Because I also have to add the
assertions:
 B1 tpn:property B .
 B2 tpn:property B .

where B is a vanilla predicate.

Think of a singleton property as actually being a pointer to a property. I
can name that property (which is what the => notation does), but the naming
is orthogonal to the fact that the pointer itself is both unique and can
only have one unique <S,O> per pointer. RDF explicitly states that if two
triples have the same <S,P,O> they are the same triple, period. This is why
you MUST have singleton properties. All that the named node expressions do
in that regard is to make it possible to name these singleton nodes in a
more readable manner.

By the way, this is about the ONLY way that I can see of dealing with
temporal RDF.

Consider the following:

Country:USA Country:hasPresident Person:JoeBiden .
Country:USA Country:hasPresident Person:DonaldTrump .
Country:USA Country:hasPresident Person:KamalaHarris .

All of these assertions are true, but only in the right context. If you
want to determine this context, you either have to create multiple
third-normal form assertions, or you have to resort to singleton properties:

Country:USA [:hp1 => tpn:property Country:hasPresident ;
                     :start 2021 ; :end 2025] Person:JoeBiden .
Country:USA [:hp2 => tpn:property Country:hasPresident ;
                     :start 2017 ; :end 2021] Person:DonaldTrump .
Country:USA [:hp3 => tpn:property Country:hasPresident ;
                     :start 2025 ; :end 2029] Person:KamalaHarris .

In SPARQL this is trivial to resolve:

select ?president ?start ?end where {
     Country:USA ?singleton ?president .
     ?singleton tpn:property Country:hasPresident .
     ?singleton :start ?start .
     ?singleton :end ?end .
     # :start and :end are bare years above, so compare against year(now())
     filter(year(now()) >= ?start && year(now()) < ?end)
}
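
The same lookup can be sketched outside a triple store. The following
Python is only an illustration over in-memory tuples: presidents_in is a
hypothetical helper, the year is passed in explicitly instead of calling
now() so the result is reproducible, and the data simply repeats the three
example statements above.

```python
def presidents_in(triples, year):
    """Return (president, start, end) for singleton edges active in `year`."""
    facts = {}
    for s, p, o in triples:
        facts.setdefault(s, {})[p] = o  # one value per predicate is enough here
    results = []
    for s, p, o in triples:
        meta = facts.get(p, {})  # statements made about the predicate itself
        if s != "Country:USA" or meta.get("tpn:property") != "Country:hasPresident":
            continue
        if meta[":start"] <= year < meta[":end"]:
            results.append((o, meta[":start"], meta[":end"]))
    return results


data = [
    ("Country:USA", ":hp1", "Person:JoeBiden"),
    (":hp1", "tpn:property", "Country:hasPresident"),
    (":hp1", ":start", 2021),
    (":hp1", ":end", 2025),
    ("Country:USA", ":hp2", "Person:DonaldTrump"),
    (":hp2", "tpn:property", "Country:hasPresident"),
    (":hp2", ":start", 2017),
    (":hp2", ":end", 2021),
    ("Country:USA", ":hp3", "Person:KamalaHarris"),
    (":hp3", "tpn:property", "Country:hasPresident"),
    (":hp3", ":start", 2025),
    (":hp3", ":end", 2029),
]
current = presidents_in(data, 2024)
print(current)
```

For the year 2024 this selects only the :hp1 edge, because the interval is
treated as half-open (start inclusive, end exclusive), just as in the
filter.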

*Kurt Cagle*
Editor in Chief
The Cagle Report
kurt.cagle@gmail.com
443-837-8725


On Tue, Jul 23, 2024 at 2:27 AM Franconi Enrico <franconi@inf.unibz.it>
wrote:

> Hi Kurt,
> I’m still waiting for a reply to my comments from two months ago about
> your proposal (Word document attached), which you presented again at our
> last meeting.
> I have seen that you have now posted your proposal on LinkedIn
> <https://www.linkedin.com/feed/update/urn:li:activity:7219457379564670976/>.
> Let me rephrase the comments again, hoping you will react to them.
>
> *Named Node in the Predicate Position*
>
> Your example:
>
> :liz [_:married1 => rdf:subPropertyOf :married ;
>                     :hasInterval [ _:interval1 => :start 1964 ; :end 1974]]
>      :richard .
>
> :married1 must be a *singleton* property.
> This option was discussed and dismissed some time ago in the
> RDF-star WG.
> This introduces owl:sameAs, leading to serious implementation problems.
> Indeed, the following equivalence pattern holds:
>
> A1 owl:sameAs A2 .
> <—>
> A1 [ B  => C1 D1 ] E .
> A2 [ B  => C2 D2 ] E .
>
> Moreover, the singleton property does not express directly the multi-edge
> case, since you have to name each edge of the same type with a distinct
> name.
>
> From the current RDF-star baseline, the example can be written in Turtle:
>
> << _:marriage1 | :liz :married :richard >>
>    a :marriage ;
>    :hasInterval [:start 1964 ; :end 1974] .
>
> This corresponds to the following in N-Triples:
>
> _:marriage1 rdf:reifies <<( :liz :married :richard )>> .
> _:marriage1 rdf:type :marriage .
> _:marriage1 :hasInterval _:interval1 .
> _:interval1 :start 1964 .
> _:interval1 :end 1974 .
>
> *Reifier Expression*
>
> This is just a rephrase of "option 1" (old style 1.1 reification) we
> discussed and dismissed some time ago in the RDF-star WG.
> It has severe drawbacks, e.g., in reconstructing the reifier back from the
> three reification triples.
>
> cheers
> —e.
>
> On 24 May 2024, at 15:55, Franconi Enrico <franconi@inf.unibz.it> wrote:
>
> Hi Kurt,
> It seems to me that your proposal is a rephrase of various discussions we
> already had, and ruled out.
>
> *Named Node in the Predicate Position*: this seems to be just a rephrase
> of the singleton property - once you try to give semantics to it. Observe
> that, wrt the current status of the discussion, your proposal does not
> express directly the multi-edge case, since you have to name each edge of
> the same type with a distinct name.
>
> *Reifier Expression*: this is just a rephrase of "option 1" we discussed
> and ruled out some time ago. It has severe drawbacks in reconstructing the
> reifier back from the three reification triples.
>
> cheers
> —e.
>
> On 23 May 2024, at 19:24, Kurt Cagle <kurt.cagle@gmail.com> wrote:
>
> I've attached a document that covers YET ANOTHER proposal (more properly a
> recommendation I've made before).
>
> There are two issues that we seem to be rehashing here. The first is the
> question of reificational notation, while the second has to do with LPG
> harmonization. My contention is that these are different issues, though we
> can use similar notation for both.
>
> *Reification*
>
> A named reification is simply a set of statements:
>
> :r rdf:subject :s; rdf:predicate :p; rdf:object :o .
>
> This is not a triple. It is three statements about the state that a triple
> can be in. It does not introduce a triple into the system, and it makes no
> assertions about the truthiness or even, by itself, the existence of that
> triple. It is simply a statement about the components that a triple might
> have. You cannot reason with it directly, though you can use other
> processes (SPARQL, SHACL, etc.) to construct or verify the existence of
> triples for which these assertions are true. Properly speaking, the above
> itself should probably be qualified:
>
> :r rdf:subject :s; rdf:predicate :p; rdf:object :o ; a rdf:Reification .
>
> The notation << :r | :s :p :o >> makes the above statement more compact,
> but the reification can apply to any triples within a system, or none at
> all, regardless of the values.
>
> *Named Node Expressions*
>
> I propose, in the attached, that we use a similar nomenclature for what
> I'm terming named node expressions, to wit:
>
> [ ?nn | :p1 :o1 ; :p2 :o2 ]
>
> where ?nn is replaced by a formal (not blank) IRI.
>
> This is a Turtle (not RDF) syntactical amendment. The above takes what
> would ordinarily be a blank node and replaces it with a named node:
>
> For instance:
>
> :liz :hasMarriage [ :marriage1 | :to :richard ; :start "1965" ; :end "1975" ] .
>
> which expands to:
>
> :liz :hasMarriage :marriage1 .
> :marriage1 :to :richard .
> :marriage1 :start "1965" .
> :marriage1 :end "1975" .
>
> Why is this important? Because the blank node is a pointer to a data
> structure, but use of the [] notation makes it impossible to reference that
> data structure from within Turtle. By adding in a named node as the
> referencing node, you gain that ability, and it is a key ability for
> modeling.
>
> For instance, I can use the expression:
>
> :liz :hasMarriage [ :marriage1 | :start "1965" ; :end "1975" ; :to :richard ],
>                   [ :marriage2 | :start "1975" ; :end "1985" ; :to :john ] .
>
> This is semantically equivalent to the JSON
>
> {"liz":{"hasMarriage":[{"marriage1":{"start":"1965",
> "end":"1975","to":"richard"}},{"marriage2":{"start":"1975",
> "end":"1985","to":"john"}}]}}
>
> The same thing can be done with both predicate-positioned named node
> expressions and subject-oriented ones.
>
> This addresses the LPG equivalency relationship, and does so without ever
> touching reifications.
>
> Note that this also highlights an important point. Blank nodes are useful
> because they are unique and system-assigned. However, they are not
> referenceable. The Turtle notation:
>
> :liz :hasMarriage _:b1, _:b2 .
> _:b1 :start "1965" ; :end "1975"; :to :richard .
> _:b2 :start "1975" ; :end "1985"; :to :john .
>
> is simply a preprocessor directive to replace the "named" nodes with
> anonymous IRIs in the final indexing.  You still have to make _:b1 and _:b2
> unique, or the data structures disintegrate.
>
> Anyway, I ask the chair for time during our next meeting to discuss this
> proposal.
>
> *Kurt Cagle*
> Editor in Chief
> The Cagle Report
> kurt.cagle@gmail.com
> 443-837-8725
>
>
>
Received on Tuesday, 23 July 2024 16:30:47 UTC