Re: weakness of embedded triples from Pavel Klinov on 2020-10-24 (public-rdf-star@w3.org from October 2020)

From: Pavel Klinov <pavel@stardog.com>
Date: Sat, 24 Oct 2020 09:20:13 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-rdf-star@w3.org
Message-ID: <CAJ-ZGXopRS6SEV-MkTy-P_54NFVYxkp7-gOmU0zJq5aKDT=+Sg@mail.gmail.com>
On Fri, Oct 23, 2020 at 12:49 AM Holger Knublauch <holger@topquadrant.com>
wrote:


> [snip]
>


> >>   and I’d like to have that. However I seem to have a fundamental
> problem understanding the semantic problem (nobody is surprised here).
> Taking a statement that is asserted and annotated in RDF*, in a currently
> en vogue syntax:
> >>
> >>     :a  :b  :c  [[ :d  :e ]] .
> >>
> >> What internal structure are you referring to that has to be taken into
> account? The statement is stated, and annotated, and IIUC that’s about it.
> Ensuring that the annotation refers to this specific statement is a
> syntactic problem and a statement identifier composed of subject, predicate
> and object and encoded in an IRI is a syntactic solution to that problem.
> Paraphrasing your example above:
> >>
> >>     :a  :b  :c .
> >>     :triple:a:b:c  :d  :e .
> >>
> >> What "internal structure" has changed here? In what ways could this
> syntax convey a different meaning than the one above?
> > Again, that is fine when no blank nodes are involved. But if you replace
> > :c with _:x, you would get something like:
> >
> >      :a :b _:x.
> >      :triple:a:b:_:x :d :e.
> >
> > Then, RDF semantics says you can replace _:x with _:y without changing
> > the meaning of the graph. This renaming only impacts the first triple;
> > from the point of view of standard RDF semantics, the second triple
> > contains 3 IRIs, so it is kept unchanged by the renaming. So under
> > standard RDF semantics, we can replace the graph above with:
> >
> >      :a :b _:y.
> >      :triple:a:b:_:x :d :e.
> >
> > and we have lost the connection between the asserted triple and the
> > annotated triple. That's why we need to have an extended semantics for
> RDF*.
>
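To make the renaming argument above concrete, here is a minimal Python sketch. It assumes triples are plain tuples of strings, that strings starting with "_:" stand for blank nodes, and that the ":triple:a:b:c" naming scheme is produced by a hypothetical helper `triple_iri`. None of this is taken from an actual RDF library.

```python
def triple_iri(s, p, o):
    # Hypothetical encoding in the style of ":triple:a:b:c" used in the thread.
    return ":triple:" + ":".join(t.lstrip(":") for t in (s, p, o))


def rename_bnode(graph, old, new):
    # Standard RDF view: only terms that ARE the blank node get renamed.
    # An IRI that merely contains the label as a substring is left alone.
    def swap(term):
        return new if term == old else term
    return {tuple(swap(t) for t in triple) for triple in graph}


def annotations_resolve(graph):
    # True when every ":triple:..." subject matches the encoding of some
    # asserted triple in the same graph.
    asserted = {t for t in graph if not t[0].startswith(":triple:")}
    annotated = {t[0] for t in graph if t[0].startswith(":triple:")}
    encoded = {triple_iri(*t) for t in asserted}
    return annotated <= encoded


graph = {
    (":a", ":b", "_:x"),
    (triple_iri(":a", ":b", "_:x"), ":d", ":e"),
}
renamed = rename_bnode(graph, "_:x", "_:y")

print(annotations_resolve(graph))    # connection intact before renaming
print(annotations_resolve(renamed))  # connection lost after renaming
```

Renaming _:x to _:y leaves the encoded IRI untouched, so the annotation no longer points at any asserted triple.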
> Not necessarily. I still think this can be solved by simply declaring
> reification on bnode triples to be unsupported.
>
> Yes there are theoretically some scenarios where this might be useful,
> but I'd rather say "if you want to use RDF*, use IRIs and no bnodes"
> than having to extend the very core model of RDF just for this corner
> case. There are already similar constraints in place in the RDF world,
> e.g. a reified statement cannot appear as predicate, and literals cannot
> be subjects. Life goes on, people get used to these limitations. Relying
> on bnodes for identification is a bad practice anyway, and they already
> don't work across graph boundaries.
>

I imagine it's going to be difficult to agree on a single approach here
that makes everyone happy, but it's also important to let vendors
support a useful subset of RDF*/SPARQL* without bothering their
users/customers too much about bnodes. Maybe we can define some subsets
of RDF*, akin to the OWL 2 Profiles, such that the simple RDF*
interpretation along the lines that Martynas described would be compatible
with a more complex and comprehensive one as long as bnodes do not appear
in annotated triples (as Holger suggested).

We at Stardog would then support the simple thing and possibly consider the
complex one based on demand (this is generally how we tend to do things).
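
As a rough illustration of what such a "simple" profile check could look like, here is a Python sketch. It assumes an embedded triple is modelled as a nested tuple and a blank node as a string starting with "_:"; the function names are made up for this example and do not reflect how Stardog or any other product implements it.

```python
def is_bnode(term):
    return isinstance(term, str) and term.startswith("_:")


def embedded_triples(term):
    # A tuple stands for an embedded triple << s p o >>; yield it and
    # any embedded triples nested inside it.
    if isinstance(term, tuple):
        yield term
        for part in term:
            yield from embedded_triples(part)


def in_simple_profile(graph):
    # The graph is in the profile when no embedded (annotated) triple
    # mentions a blank node. Bnodes in ordinary asserted triples are fine.
    for triple in graph:
        for term in triple:
            for emb in embedded_triples(term):
                if any(is_bnode(t) for t in emb):
                    return False
    return True


ok = {
    (":a", ":b", "_:x"),                   # bnode, but not annotated
    ((":a", ":b", ":c"), ":d", ":e"),      # annotated triple, IRIs only
}
bad = {
    ((":a", ":b", "_:x"), ":d", ":e"),     # annotated triple with a bnode
}

print(in_simple_profile(ok))
print(in_simple_profile(bad))
```

A graph that passes this check could be handled by the simple IRI-based interpretation; anything else would need the fuller semantics.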

Cheers,
Pavel


>
> Holger
>
>
> >
> > (more precisely: that's why the trick of encoding embedded triples into
> > IRIs does not work. There might be a smarter encoding of RDF* into RDF,
> > which would allow us to rely on the standard semantics, but I seriously
> > doubt it)
> >
> >> Thomas
> >>
> >>
> >> [0] Aidan Hogan, 2017, Canonical Forms for Isomorphic and Equivalent
> RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes
> >>
> >>
> >>>    best
> >>>
> >>>> Thomas
> >>>>
> >>>>
> >>>> [0] https://www.w3.org/TR/rdf11-concepts/
> >>>>
> >>>>> Also, note that the semantics' goal is not to prescribe a particular
> implementation method; it is to ensure that different implementations
> remain interoperable.
> >>>>>   pa
> >>>>>
> >>>>>> On Mon, Oct 19, 2020 at 11:07 AM Pavel Klinov
> >>>>>> <pavel@stardog.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Yeah right. We have a mechanism in place to avoid using the same
> Skolem constant for bnodes with the same lexical form occurring in multiple
> RDF datasets (eg. when loading multiple files) but that's pretty much it.
> IIRC it's called something like "standardising apart" in one of the RDF
> docs.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Pavel
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Oct 19, 2020 at 10:54 AM Pierre-Antoine Champin
> >>>>>>> <pierre-antoine.champin@ercim.eu>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Dear all,
> >>>>>>>>
> >>>>>>>> Holger, Pavel: I assume that blank nodes are internally
> skolemized, so indeed, internally, you only have IRIs and literals. Correct?
> >>>>>>>>
> >>>>>>>> On 19/10/2020 10:28, Holger Knublauch wrote:
> >>>>>>>>
> >>>>>>>> Similar situation here at TopQuadrant, see
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> http://datashapes.org/reification.html#uriReification
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Holger
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 10/19/2020 6:24 PM, Pavel Klinov wrote:
> >>>>>>>>
> >>>>>>>> This is roughly how Stardog supports RDF* and so far we find it
> sufficient in the enterprise context. It's pretty easily understood by
> users familiar with edge properties in the property graph data model, which
> is one of the most important factors for us.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Pavel
> >>>>>>>>
> >>>>>>>> On Sat, Oct 17, 2020 at 9:54 PM Martynas Jusevičius
> >>>>>>>> <martynas@atomgraph.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Does RDF* need new semantics at all? Couldn't it be a
> syntax-level
> >>>>>>>>> convention for unique triple IDs?
> >>>>>>>>>
> >>>>>>>>> E.g. <<s>, <p>, <o>> being syntactic sugar for
> >>>>>>>>> uri(concat("urn:rdf:id:", hash(str(<s>)), hash(str(<p>)),
> >>>>>>>>> hash(str(<o>)))).
> >>>>>>>>>
> >>>>>>>>> For example, the triple
> >>>>>>>>>
> >>>>>>>>> <<https://www.w3.org/People/Berners-Lee/card>
> >>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
> >>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>>
> >>>>>>>>>
> >>>>>>>>> gives
> >>>>>>>>>
> >>>>>>>>> URI(CONCAT("urn:rdf:id:",
> >>>>>>>>> SHA1(STR(
> >>>>>>>>> <https://www.w3.org/People/Berners-Lee/card>
> >>>>>>>>> )),
> >>>>>>>>> SHA1(STR(
> >>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
> >>>>>>>>> )),
> >>>>>>>>> SHA1(STR(
> >>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>
> >>>>>>>>> ))))
> >>>>>>>>>
> >>>>>>>>> gives
> >>>>>>>>>
> >>>>>>>>>
> <urn:rdf:id:63874e34ff5f326e67e888f6818f72d5033ecb343cadd8c2120281d72cefce4481485c937b6a95a656beaa67c13db29f3d7be801328b7c9125976c5f>
> >>>>>>>>>
> >>>>>>>>> which essentially would become the "5th element", in addition to
> quads.
> >>>>>>>>>
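The hash-based convention above can be sketched in a few lines of Python. This is only an illustration: it assumes the three terms are passed as plain IRI strings and hashed as UTF-8, and the function names are invented for this example.

```python
import hashlib


def sha1_hex(text):
    # SHA-1 of the UTF-8 bytes, as a 40-character lowercase hex string.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def triple_id(s, p, o):
    # Concatenate the hashes of subject, predicate and object under the
    # "urn:rdf:id:" prefix proposed in the thread.
    return "urn:rdf:id:" + sha1_hex(s) + sha1_hex(p) + sha1_hex(o)


tid = triple_id(
    "https://www.w3.org/People/Berners-Lee/card",
    "http://xmlns.com/foaf/0.1/primaryTopic",
    "https://www.w3.org/People/Berners-Lee/card#i",
)
print(tid)
```

The identifier is deterministic, so the same three terms always yield the same URN. A blank node has no stable string to hash, which is where the bnode discussion elsewhere in this thread comes in.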
> >>>>>>>>> On Thu, Oct 15, 2020 at 1:38 PM Pierre-Antoine Champin
> >>>>>>>>>
> >>>>>>>>> <pierre-antoine.champin@ercim.eu>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> On 14/10/2020 23:13, Peter F. Patel-Schneider wrote:
> >>>>>>>>>>
> >>>>>>>>>> Let's make the height example even more stark.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6.0"^^xsd:decimal >> .
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> does not imply
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6.00"^^xsd:decimal >> .
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> I would hope that any Tom, Dick, and Lois can realize that
> these two literals
> >>>>>>>>>> are the same.
> >>>>>>>>>>
> >>>>>>>>>> I see your point, but this is really a matter of deciding where
> you put the boundary...
> >>>>>>>>>>
> >>>>>>>>>> So I would still prefer to be radical here and consider any
> lexical difference as potentially significant.
> >>>>>>>>>>
> >>>>>>>>>> If you want to stick to literals that have to be supported in
> RDF
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-US >> .
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> does not imply
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-us >> .
> >>>>>>>>>>
> >>>>>>>>>> Are "Clark"@en-US and "Clark"@en-us really different literals,
> for the abstract syntax??
> >>>>>>>>>>
> >>>>>>>>>> I would have thought they are the same (and so the implication
> above would hold).
> >>>>>>>>>>
> >>>>>>>>>> Reading the spec again, I realize that things are not so clear:
> "Lexical representations of language tags MAY be converted to lower case",
> and then Literal term equality requires that language tags "compare equal,
> character by character". So these 2 literals MAY be considered equal, and
> the implication MAY hold... :-/ Add to this that BCP47 explicitly states
> that language tags are case insensitive... I'd say that we are in a gray area
> here.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> peter
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 10/14/20 4:45 PM, Doerthe Arndt wrote:
> >>>>>>>>>>
> >>>>>>>>>> Dear Peter,
> >>>>>>>>>>
> >>>>>>>>>> you are right with both observations. The question is whether
> we want that
> >>>>>>>>>> behavior or not.
> >>>>>>>>>>
> >>>>>>>>>> In
> >>>>>>>>>> https://w3c.github.io/rdf-star/
> >>>>>>>>>> there is a section on referential opacity.
> >>>>>>>>>> The main claim there is that triples are referentially opaque.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> But embedded triples are much weaker than just being
> referentially opaque.  To
> >>>>>>>>>> see this consider the following RDF* graph under the RDF*
> version of RDF
> >>>>>>>>>> entailment recognizing xsd:decimal and xsd:integer.
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
> >>>>>>>>>>
> >>>>>>>>>> In this semantics "6"^^xsd:decimal means the same as
> "6"^^xsd:integer so one
> >>>>>>>>>> would expect that
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >> .
> >>>>>>>>>>
> >>>>>>>>>> is RDF*-entailed.
> >>>>>>>>>>
> >>>>>>>>>> But it is not.  There are two reasons for this.
> >>>>>>>>>>
> >>>>>>>>>> First, there is no requirement that satisfying interpretations
> for the first
> >>>>>>>>>> graph map << :clarkKent :height "6"^^xsd:integer >> to anything
> and if a
> >>>>>>>>>> satisfying interpretation does map the triple there is no
> requirement that its
> >>>>>>>>>> ITEXT mapping gives the triple its correct meaning.  (The value
> of ITEXT for
> >>>>>>>>>> the triple could have the real number pi as its third element.)
> >>>>>>>>>>
> >>>>>>>>>> Second, "6"^^xsd:integer is a different node from
> "6"^^xsd:decimal. So even if
> >>>>>>>>>> the interpretation treats the second embedded triple nicely, and
> thus gives it
> >>>>>>>>>> the same meaning as the first embedded triple, they are still
> two different
> >>>>>>>>>> triples and :loisLane can believe one but not the other.  So
> very little of
> >>>>>>>>>> the semantics of RDF gets into embedded triples.
> >>>>>>>>>>
> >>>>>>>>>> We wanted different representations to be treated differently
> >>>>>>>>>> even if they have the same meaning. The reason for that is that we
> expected that
> >>>>>>>>>> RDF* would also be used to make statements about triples as
> they were
> >>>>>>>>>> stated, for example to be able to explain the reasoning
> performed on the
> >>>>>>>>>> triples but also for simple provenance. In these cases there
> should be a
> >>>>>>>>>> difference between
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
> >>>>>>>>>>
> >>>>>>>>>> and
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >> .
> >>>>>>>>>>
> >>>>>>>>>> since we still talk about different representations.
> >>>>>>>>>>
> >>>>>>>>>> Each triple is, in effect, its own context.  So, in an RDFS
> version of RDF*,
> >>>>>>>>>> even if :loisLane believes several triples that should imply
> another, they
> >>>>>>>>>> generally don't.  For example:
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :man >> .
> >>>>>>>>>> :loisLane :believes << :man rdfs:subClassOf :human >> .
> >>>>>>>>>>
> >>>>>>>>>> Does not imply
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :human >> .
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> So embedded triples are incredibly weak in RDF*.   Making them
> useful will
> >>>>>>>>>> likely require quite a bit of work.
> >>>>>>>>>>
> >>>>>>>>>> Here, "useful" depends again on your intended use. We wanted to
> have a
> >>>>>>>>>> rather weak semantics which allows users with more complex use
> cases to add
> >>>>>>>>>> their semantics. It is easier to make the semantics more
> complex by adding
> >>>>>>>>>> extensions than to ignore certain parts. I for example remember
> that Jos De
> >>>>>>>>>> Roo announced some time ago that his EYE reasoner supports
> rules on RDF*. Of
> >>>>>>>>>> course that alone would not allow you to cover all cases, but
> it could be
> >>>>>>>>>> very helpful in practice.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On the other hand, there are some unusual inferences that can
> be made in
> >>>>>>>>>> RDF*.  In an RDF* version of RDFS++ it is possible to state
> that two triples
> >>>>>>>>>> are the same.   The graph
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :superman :can :fly >>.
> >>>>>>>>>> << :superman :can :fly >> owl:sameAs << :clarkKent :can :fly >> .
> >>>>>>>>>>
> >>>>>>>>>> is consistent here and implies
> >>>>>>>>>>
> >>>>>>>>>> :superman owl:sameAs :clarkKent .
> >>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
> >>>>>>>>>>
> >>>>>>>>>> This last case is an interesting one. We indeed wanted the
> triple
> >>>>>>>>>>
> >>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
> >>>>>>>>>>
> >>>>>>>>>> to be a consequence of your statements. The question is whether
> >>>>>>>>>>
> >>>>>>>>>> :superman owl:sameAs :clarkKent .
> >>>>>>>>>>
> >>>>>>>>>> should follow (it does indeed follow, just as you describe). We
> made the
> >>>>>>>>>> semantics of embedded triples the way it is to be able to deal
> with blank
> >>>>>>>>>> nodes. Here, I can't give a concrete answer whether (at least
> to my
> >>>>>>>>>> understanding) it should be that way. I will think about it
> (and read
> >>>>>>>>>> Pierre-Antoine's thoughts in the mean time, which just arrived
> as well) and
> >>>>>>>>>> come back to you.
> >>>>>>>>>>
> >>>>>>>>>> Kind regards,
> >>>>>>>>>> Doerthe
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
>
>
Received on Saturday, 24 October 2020 07:20:40 UTC