Re: weakness of embedded triples

On 10/23/2020 9:58 PM, Pierre-Antoine Champin wrote:
> Holger,
>
> On 23/10/2020 00:48, Holger Knublauch wrote:
>>> (...)
>>> That's why we need to have an extended semantics for RDF*.
>> Not necessarily. I still think this can be solved by simply declaring
>> reification on bnode triples to be unsupported.
>>
>> Yes there are theoretically some scenarios where this might be useful,
>> but I'd rather say "if you want to use RDF*, use IRIs and no bnodes"
>> than having to extend the very core model of RDF just for this corner
>> case.
> Whether this is a corner case or not remains to be discussed, IMO.
>
> Would you mind creating an issue on github with this proposal; we can
> have a quick +1 / -1 poll on that issue, and see what the community
> thinks...
>
>> There are already similar constraints in place in the RDF world, e.g.
>> a reified statement cannot appear as predicate, and literals cannot be
>> subjects. Life goes on, people get used to these limitations. Relying
>> on bnodes for identification is a bad practice anyway,
> I don't think there is a consensus about that either. As you know, there
> has been a resurgence of this permathread last summer, and  Brickley
> made, I think, a very good point in the defense of bnodes.
>
> https://lists.w3.org/Archives/Public/semantic-web/2020Jul/0003.html
>
> In fact, I could return your argument above : some people (me included)
> would like literals as subjects, but they are not allowed; life goes on.
> Some people do not like bnodes, but they are part of RDF; life goes on.
>
> But, granted, that does not mean that we need to "propagate" them into
> new RDF features if we collectively decide that we don't need to. Hence
> my proposal above to create a specific issue about that.

I am not sure I would want to go this far. My point that bnodes could be 
declared off-limits was mainly in response to the technical issues with 
generating long URIs. However, those long URIs do not even require 
this limitation. From a programming perspective, most APIs have some 
kind of function such as Node.getURI() returning a string. I believe a 
decent implementation strategy for RDF* that would not require 
introducing a new node type (for triples) would be to simply have those 
APIs implement a different subclass of Nodes. For example, assume a Java 
class in some imaginary RDF API such as

class URINode extends Node {
     private String uri;

     public String getURI() {
         return uri;
     }

     public boolean isURI() {
         return true;
     }
}

class TripleNode extends Node {
     private Node subject;
     private Node predicate;
     private Node object;

     public String getURI() {
         // Don't store the URI string but compute it when needed
         // (which is probably not often the case).
         // This string could of course also be cached if it's a
         // performance problem.
         return "urn:triple:" + encode(subject) + ":"
                 + encode(predicate) + ":" + encode(object);
     }

     public boolean isURI() {
         return true;
     }
     ...
}

An RDF* parser could produce these nodes from the input document. Then, 
if the file is reloaded, the bnode IDs might differ, yet the references 
to them would remain identical.
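
To make the idea above a bit more concrete, here is a minimal, 
self-contained sketch of how it could look. Everything beyond the class 
and method names already used above is an assumption on my side: the 
abstract Node base class, the constructors, the equals/hashCode methods, 
and in particular encode(), which here simply percent-encodes each 
component with java.net.URLEncoder so that the ':' separators stay 
unambiguous. Blank nodes are deliberately left out of the sketch.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// Hypothetical base class of an imaginary RDF API.
abstract class Node {
    public abstract String getURI();

    public boolean isURI() {
        return true;
    }
}

final class URINode extends Node {
    private final String uri;

    public URINode(String uri) {
        this.uri = Objects.requireNonNull(uri);
    }

    @Override
    public String getURI() {
        return uri;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof URINode && ((URINode) other).uri.equals(uri);
    }

    @Override
    public int hashCode() {
        return uri.hashCode();
    }
}

final class TripleNode extends Node {
    private final Node subject;
    private final Node predicate;
    private final Node object;

    public TripleNode(Node subject, Node predicate, Node object) {
        this.subject = Objects.requireNonNull(subject);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
    }

    // Assumption: percent-encoding is one possible encode(); any
    // injective encoding that escapes ':' would do.
    private static String encode(Node node) {
        return URLEncoder.encode(node.getURI(), StandardCharsets.UTF_8);
    }

    // Computed on demand rather than stored.
    @Override
    public String getURI() {
        return "urn:triple:" + encode(subject) + ":"
                + encode(predicate) + ":" + encode(object);
    }

    // Two TripleNodes are equal when their components are equal,
    // without ever building the URI string.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof TripleNode)) {
            return false;
        }
        TripleNode that = (TripleNode) other;
        return subject.equals(that.subject)
                && predicate.equals(that.predicate)
                && object.equals(that.object);
    }

    @Override
    public int hashCode() {
        return Objects.hash(subject, predicate, object);
    }
}
```

Because a TripleNode is itself a Node, it can be used as the subject or 
object of another TripleNode; the inner URI is then simply encoded once 
more. Note that URLEncoder.encode(String, Charset) needs Java 10 or 
later.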

Wouldn't this give the best of both worlds? This doesn't require changes 
to the low levels of RDF and leaves implementations the same room for 
internal optimizations, e.g. an internal index to quickly find all 
TripleNodes with certain subject, predicate or object.
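
As a rough illustration of such an internal index (purely hypothetical, 
not how any particular store does it), one could keep three hash maps 
from component to the set of triple nodes. The class name 
TripleNodeIndex and its methods are made up for this sketch, and it is 
generic over the node and triple-node types so that it stands on its 
own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// N is the node type, T the triple-node type (both assumptions).
final class TripleNodeIndex<N, T> {
    private final Map<N, Set<T>> bySubject = new HashMap<>();
    private final Map<N, Set<T>> byPredicate = new HashMap<>();
    private final Map<N, Set<T>> byObject = new HashMap<>();

    // Register a triple node under each of its three components.
    public void add(T tripleNode, N subject, N predicate, N object) {
        bySubject.computeIfAbsent(subject, k -> new LinkedHashSet<>()).add(tripleNode);
        byPredicate.computeIfAbsent(predicate, k -> new LinkedHashSet<>()).add(tripleNode);
        byObject.computeIfAbsent(object, k -> new LinkedHashSet<>()).add(tripleNode);
    }

    public Set<T> withSubject(N subject) {
        return lookup(bySubject, subject);
    }

    public Set<T> withPredicate(N predicate) {
        return lookup(byPredicate, predicate);
    }

    public Set<T> withObject(N object) {
        return lookup(byObject, object);
    }

    // Returns a read-only view; empty if nothing is indexed under the key.
    private Set<T> lookup(Map<N, Set<T>> map, N key) {
        return Collections.unmodifiableSet(map.getOrDefault(key, Collections.emptySet()));
    }
}
```

The lookups never touch the computed URI strings, which is the point: 
the long URIs only need to exist when someone actually asks for them.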

Holger


>
>    best
>
>> and they already don't work across graph boundaries.
>>
>> Holger
>>
>>
>>> (more precisely: that's why the trick of encoding embedded triples into
>>> IRIs does not work. There might be a smarter encoding of RDF* into RDF,
>>> which would allow us to rely on the standard semantics, but I seriously
>>> doubt it)
>>>
>>>> Thomas
>>>>
>>>>
>>>> [0] Aidan Hogan, 2017, Canonical Forms for Isomorphic and Equivalent
>>>> RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes
>>>>
>>>>
>>>>>     best
>>>>>
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> [0] https://www.w3.org/TR/rdf11-concepts/
>>>>>>
>>>>>>> Also, note that the semantics' goal is not to prescribe a
>>>>>>> particular implementation method; it is to ensure that different
>>>>>>> implementations remain interoperable.
>>>>>>>    pa
>>>>>>>
>>>>>>>> On Mon, Oct 19, 2020 at 11:07 AM Pavel Klinov
>>>>>>>> <pavel@stardog.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yeah right. We have a mechanism in place to avoid using the
>>>>>>>>> same Skolem constant for bnodes with the same lexical form
>>>>>>>>> occurring in multiple RDF datasets (eg. when loading multiple
>>>>>>>>> files) but that's pretty much it. IIRC it's called something
>>>>>>>>> like "standardising apart" in one of the RDF docs.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Pavel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2020 at 10:54 AM Pierre-Antoine Champin
>>>>>>>>> <pierre-antoine.champin@ercim.eu>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Dear all,
>>>>>>>>>>
>>>>>>>>>> Holger, Pavel: I assume that blank nodes are internally
>>>>>>>>>> skolemized, so indeed, internally, you only have IRIs and
>>>>>>>>>> literals. Correct?
>>>>>>>>>>
>>>>>>>>>> On 19/10/2020 10:28, Holger Knublauch wrote:
>>>>>>>>>>
>>>>>>>>>> Similar situation here at TopQuadrant, see
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://datashapes.org/reification.html#uriReification
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Holger
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/19/2020 6:24 PM, Pavel Klinov wrote:
>>>>>>>>>>
>>>>>>>>>> This is roughly how Stardog supports RDF* and so far we find
>>>>>>>>>> it sufficient in the enterprise context. It's pretty easily
>>>>>>>>>> understood by users familiar with edge properties in the
>>>>>>>>>> property graph data model, which is one of the most important
>>>>>>>>>> factors for us.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Pavel
>>>>>>>>>>
>>>>>>>>>> On Sat, Oct 17, 2020 at 9:54 PM Martynas Jusevičius
>>>>>>>>>> <martynas@atomgraph.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Does RDF* need new semantics at all? Couldn't it be a
>>>>>>>>>>> syntax-level
>>>>>>>>>>> convention for unique triple IDs?
>>>>>>>>>>>
>>>>>>>>>>> E.g. <<s>, <p>, <o>> being syntactic sugar for
>>>>>>>>>>> uri(concat("urn:rdf:id:", hash(str(<s>)), hash(str(<p>)),
>>>>>>>>>>> hash(str(<o>)))).
>>>>>>>>>>>
>>>>>>>>>>> For example, the triple
>>>>>>>>>>>
>>>>>>>>>>> <
>>>>>>>>>>> <https://www.w3.org/People/Berners-Lee/card>
>>>>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
>>>>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>
>>>>>>>>>>> gives
>>>>>>>>>>>
>>>>>>>>>>> URI(CONCAT("urn:rdf:id:",
>>>>>>>>>>> SHA1(STR(
>>>>>>>>>>> <https://www.w3.org/People/Berners-Lee/card>
>>>>>>>>>>> )),
>>>>>>>>>>> SHA1(STR(
>>>>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
>>>>>>>>>>> )),
>>>>>>>>>>> SHA1(STR(
>>>>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>
>>>>>>>>>>> ))))
>>>>>>>>>>>
>>>>>>>>>>> gives
>>>>>>>>>>>
>>>>>>>>>>> <urn:rdf:id:63874e34ff5f326e67e888f6818f72d5033ecb343cadd8c2120281d72cefce4481485c937b6a95a656beaa67c13db29f3d7be801328b7c9125976c5f>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> which essentially would become the "5th element", in addition
>>>>>>>>>>> to quads.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Oct 15, 2020 at 1:38 PM Pierre-Antoine Champin
>>>>>>>>>>>
>>>>>>>>>>> <pierre-antoine.champin@ercim.eu>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 14/10/2020 23:13, Peter F. Patel-Schneider wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Let's make the height example even more stark.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height "6.0"^^xsd:decimal >> .
>>>>>>>>>>>>
>>>>>>>>>>>> does not imply
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height
>>>>>>>>>>>> "6.00"^^xsd:decimal >> .
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I would hope that any Tom, Dick, and Lois can realize that
>>>>>>>>>>>> these two literals
>>>>>>>>>>>> are the same.
>>>>>>>>>>>>
>>>>>>>>>>>> I see your point, but this is really a matter of deciding
>>>>>>>>>>>> where you put the boundary...
>>>>>>>>>>>>
>>>>>>>>>>>> So I would still prefer to be radical here and consider any
>>>>>>>>>>>> lexical difference as potentially significant.
>>>>>>>>>>>>
>>>>>>>>>>>> If you want to stick to literals that have to be supported
>>>>>>>>>>>> in RDF
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-US >> .
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> does not imply
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-us >> .
>>>>>>>>>>>>
>>>>>>>>>>>> Are "Clark"@en-US and "Clark"@en-us really different
>>>>>>>>>>>> literals, for the abstract syntax??
>>>>>>>>>>>>
>>>>>>>>>>>> I would have thought they are the same (and so the
>>>>>>>>>>>> implication above would hold).
>>>>>>>>>>>>
>>>>>>>>>>>> Reading the spec again, I realize that things are not so
>>>>>>>>>>>> clear: "Lexical representations of language tags MAY be
>>>>>>>>>>>> converted to lower case", and then Literal term equality
>>>>>>>>>>>> requires that language tags "compare equal, character by
>>>>>>>>>>>> character". So these 2 literals MAY be considered equal, and
>>>>>>>>>>>> the implication MAY hold... :-/ Add to this that BCP47
>>>>>>>>>>>> explicitly states that language tags are case insensitive...
>>>>>>>>>>>> I'd say that we are in gray area here.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> peter
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/14/20 4:45 PM, Doerthe Arndt wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Dear Peter,
>>>>>>>>>>>>
>>>>>>>>>>>> you are right with both observations. The question is
>>>>>>>>>>>> whether we want that
>>>>>>>>>>>> behavior or not.
>>>>>>>>>>>>
>>>>>>>>>>>> In
>>>>>>>>>>>> https://w3c.github.io/rdf-star/
>>>>>>>>>>>> there is a section on referential opacity.
>>>>>>>>>>>> The main claim there is that triples are referentially opaque.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> But embedded triples are much weaker than just being
>>>>>>>>>>>> referentially opaque.  To
>>>>>>>>>>>> see this consider the following RDF* graph under the RDF*
>>>>>>>>>>>> version of RDF
>>>>>>>>>>>> entailment recognizing xsd:decimal and xsd:integer.
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
>>>>>>>>>>>>
>>>>>>>>>>>> In this semantics "6"^^xsd:decimal means the same as
>>>>>>>>>>>> "6"^^xsd:integer so one
>>>>>>>>>>>> would expect that
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >> .
>>>>>>>>>>>>
>>>>>>>>>>>> is RDF*-entailed.
>>>>>>>>>>>>
>>>>>>>>>>>> But it is not.  There are two reasons for this.
>>>>>>>>>>>>
>>>>>>>>>>>> First, there is no requirement that satisfying
>>>>>>>>>>>> interpretations for the first
>>>>>>>>>>>> graph map < :clarkKent :height "6"^^xsd:integer > to
>>>>>>>>>>>> anything and if a
>>>>>>>>>>>> satisfying interpretation does map the triple there is no
>>>>>>>>>>>> requirement that its
>>>>>>>>>>>> ITEXT mapping gives the triple its correct meaning.  (The
>>>>>>>>>>>> value of ITEXT for
>>>>>>>>>>>> the triple could have the real number pi as its third element.)
>>>>>>>>>>>>
>>>>>>>>>>>> Second, "6"^^xsd:integer is a different node from
>>>>>>>>>>>> "6"^^xsd:decimal. So even if
>>>>>>>>>>>> the interpretation treats the second embedded triple nicely,
>>>>>>>>>>>> and thus gives it
>>>>>>>>>>>> the same meaning as the first embedded triple, they are
>>>>>>>>>>>> still two different
>>>>>>>>>>>> triples and :loisLane can believe one but not the other.  So
>>>>>>>>>>>> very little of
>>>>>>>>>>>> the semantics of RDF gets into embedded triples.
>>>>>>>>>>>>
>>>>>>>>>>>> We wanted different representations to be treated
>>>>>>>>>>>> differently even if they have the same meaning. The reason
>>>>>>>>>>>> for that is that we expected that
>>>>>>>>>>>> RDF* would also be used to make statements about triples as
>>>>>>>>>>>> they were
>>>>>>>>>>>> stated, for example to be able to explain the reasoning
>>>>>>>>>>>> performed on the
>>>>>>>>>>>> triples but also for simple provenance. In these cases there
>>>>>>>>>>>> should be a
>>>>>>>>>>>> difference between
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
>>>>>>>>>>>>
>>>>>>>>>>>> and
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >>
>>>>>>>>>>>>
>>>>>>>>>>>> since we still talk about different representations.
>>>>>>>>>>>>
>>>>>>>>>>>> Each triple is, in effect, its own context.  So, in an RDFS
>>>>>>>>>>>> version of RDF*,
>>>>>>>>>>>> even if :loisLane believes several triples that should imply
>>>>>>>>>>>> another, they
>>>>>>>>>>>> generally don't.  For example:
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :man >> .
>>>>>>>>>>>> :loisLane :believes << :man rdfs:subClassOf :human >> .
>>>>>>>>>>>>
>>>>>>>>>>>> Does not imply
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :human >> .
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> So embedded triples are incredibly weak in RDF*.   Making
>>>>>>>>>>>> them useful will
>>>>>>>>>>>> likely require quite a bit of work.
>>>>>>>>>>>>
>>>>>>>>>>>> Here, "useful" depends again on your intended use. We wanted
>>>>>>>>>>>> to have a
>>>>>>>>>>>> rather weak semantics which allows users with more complex
>>>>>>>>>>>> use cases to add
>>>>>>>>>>>> their semantics. It is easier to make the semantics more
>>>>>>>>>>>> complex by adding
>>>>>>>>>>>> extensions than to ignore certain parts. I for example
>>>>>>>>>>>> remember that Jos De
>>>>>>>>>>>> Roo announced some time ago that his EYE reasoner supports
>>>>>>>>>>>> rules on RDF*. Of
>>>>>>>>>>>> course that alone would not allow you to cover all cases,
>>>>>>>>>>>> but it could be
>>>>>>>>>>>> very helpful in practice.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On the other hand, there are some unusual inferences that
>>>>>>>>>>>> can be made in
>>>>>>>>>>>> RDF*.  In an RDF* version of RDFS++ it is possible to state
>>>>>>>>>>>> that two triples
>>>>>>>>>>>> are the same.   The graph
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :superman :can :fly >>.
>>>>>>>>>>>> << :superman :can :fly >> owl:sameAs << :clarkKent :can :fly >> .
>>>>>>>>>>>> is consistent here and implies
>>>>>>>>>>>>
>>>>>>>>>>>> :superman owl:sameAs :clarkKent .
>>>>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
>>>>>>>>>>>>
>>>>>>>>>>>> This last case is an interesting one. We indeed wanted the
>>>>>>>>>>>> triple
>>>>>>>>>>>>
>>>>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
>>>>>>>>>>>>
>>>>>>>>>>>> to be a consequence of your statements. The question is whether
>>>>>>>>>>>>
>>>>>>>>>>>> :superman owl:sameAs :clarkKent .
>>>>>>>>>>>>
>>>>>>>>>>>> should follow (it does indeed follow, just as you describe).
>>>>>>>>>>>> We made the
>>>>>>>>>>>> semantics of embedded triples the way it is to be able to
>>>>>>>>>>>> deal with blank
>>>>>>>>>>>> nodes. Here, I can't give a concrete answer whether (at
>>>>>>>>>>>> least to my
>>>>>>>>>>>> understanding) it should be that way. I will think about it
>>>>>>>>>>>> (and read
>>>>>>>>>>>> Pierre-Antoine's thoughts in the mean time, which just
>>>>>>>>>>>> arrived as well) and
>>>>>>>>>>>> come back to you.
>>>>>>>>>>>>
>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>> Doerthe
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>

Received on Saturday, 24 October 2020 01:26:36 UTC