Re: RDF* semantics from Andy Seaborne on 2019-09-05 (public-rdf-star@w3.org from September 2019)

From: Andy Seaborne <andy@apache.org>
Date: Thu, 5 Sep 2019 10:16:13 +0100
To: thomas lörtsch <tl@rat.io>
Cc: public-rdf-star@w3.org
Message-ID: <d5cfa9d3-1998-c19e-bbda-3863a5b611ed@apache.org>
On 04/09/2019 13:53, thomas lörtsch wrote:
> 
> 
>> On 3. Sep 2019, at 21:46, Andy Seaborne <andy@apache.org> wrote:
>>
>>
>>
>> On 02/09/2019 10:15, Pierre-Antoine Champin wrote:
>>> Dear Thomas,
>>> I have to strongly disagree with you.
>>> On Fri, 30 Aug 2019 at 18:16, thomas lörtsch <tl@rat.io> wrote:
>>>     We are on the Semantic Web here which in general is not aiming at
>>>     such subtleties. The semantics of identification in RDF are vague at
>>>     best. Does https://paris.com refer to the website or the city?
>>> The semantics of identification is *not* vague. It is precisely defined in the RDF Semantics <https://www.w3.org/TR/rdf11-mt/> [1] specification. You can also refer to the RDF Concepts <https://www.w3.org/TR/rdf11-concepts/> [2] spec, section 1.3 <https://www.w3.org/TR/rdf11-concepts/#referents>. It clearly talks about "the resource denoted by an IRI" (note the *singular* use of "resource").
>>> I'm not saying that it is always easy to decide what exactly an IRI identifies. Especially with HTTP IRIs, which intuitively identify web resources (i.e. digital objects) rather than persons or cities. There has been a long controversy <https://en.wikipedia.org/wiki/HTTPRange-14> [3] in the community about that, but it has been settled. For example, http://dbpedia.org/resource/Lyon identifies a city, while http://dbpedia.org/page/Lyon <http://dbpedia.org/data/Lyon> identifies the HTML document describing that city.
>>> However, in the example at hand, there is no ambiguity:
>>> _:b1 rdf:type :Person.
>>> _:b1 :name "Alice".
>>> _:b1 :asserts _:b2.
>>> _:b2 rdf:type :Person.
>>> _:b2 :name "Bob".
>>> the 4th triple says it clearly: _:b2 identifies something of type Person.
>>
>> I'm glad you said this because some of this discussion seems to have been looking at the graph as a whole only.  Even if update isn't directly defined, discovering data happens.
>>
>> <#b> rdf:type :Person.
>> <#b> :name "Bob".
>>
>> and then
>>
>> <#a> rdf:type :Person.
>> <#a> :name "Alice".
>>
>> Symmetric. <#a> and <#b> identify people.
>>
>> and then later
>>
>> <#a> :asserts <#b>
>>
>> so a data merge alters the meaning elsewhere? One feature (IMO advantage) of the RDF approach is that meaning does not get undone as more is discovered.
> 
> But a statement
> 
> <#a> :asserts <#b>
> 
> alone is meaningless 

The example is relevant if run backwards - starting with
<#a> :asserts <#b> (vs. an empty graph) and adding the <#a> and <#b> 
subject triples.

While the :asserts triple isn't much use on its own, it is not 
meaningless. (Many triples are like that :-)

In particular it does not change the role of <#b>.
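
A minimal sketch of that point, assuming triples are modelled as plain 
Python tuples and a graph as a set of them. The names types_of and merge 
are illustrative helpers, not any RDF library's API. Because the example 
uses only IRIs (no blank nodes), merging is taken to be a plain set union.

```python
# Triples as (subject, predicate, object) tuples; a graph is a set of them.
# Illustrative only: these helpers are assumptions, not a real RDF API.

def types_of(graph, node):
    """Return the classes that `graph` explicitly states for `node`."""
    return {o for (s, p, o) in graph if s == node and p == "rdf:type"}

def merge(*graphs):
    """Merge graphs sharing IRIs (no blank nodes): a set union of triples."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged

g_asserts = {("<#a>", ":asserts", "<#b>")}
g_bob = {("<#b>", "rdf:type", ":Person"), ("<#b>", ":name", '"Bob"')}
g_alice = {("<#a>", "rdf:type", ":Person"), ("<#a>", ":name", '"Alice"')}

# "Forwards": the people are discovered first, the :asserts triple later.
forwards = merge(g_bob, g_alice, g_asserts)
# "Backwards": start from the :asserts triple and add the subject triples.
backwards = merge(g_asserts, g_alice, g_bob)

# The order of discovery does not matter ...
assert forwards == backwards
# ... nothing stated earlier is lost by the merge ...
assert g_bob <= forwards
# ... and <#b> is still stated to be a :Person, exactly as before.
assert types_of(forwards, "<#b>") == types_of(g_bob, "<#b>") == {":Person"}
```

In this framing the :asserts triple only ever adds a triple; it cannot 
retract the rdf:type statement about <#b>.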

> so you would need to look for further information anyway. The Open World Assumption can’t save you from contradictions introduced through further statements - nothing can. Your application will have to deal with that (maybe through storing contradicting triples in different named graphs)
> 
>> This relates to the idea of having to also check if a triple is valid or not by looking at its RDF* aspects. Queries get broken by a statement elsewhere added later. (see the Named Graph thread)
>>
>> What might be useful is that in PG edges can have key-value attributes, not edges-on-edges.
>>
>> In that framing, RDF* is a more ambitious endeavour and complex ideas don't scale easily on the web.
> 
> RDF* makes it easier to attribute statements. That is not necessarily more complex than modelling the same information with standard RDF techniques like blank nodes. Indeed many people find it more intuitive. So IMO it’s quite open if this idea scales (I have expressed a certain scepticism before, but still... and I’m learning interesting aspects from ideas put forward in this thread).

It makes some cases easier to write down.  Which cases depends on the 
mode - I haven't understood how the modes work out. Is it a matter of 
choosing one mode for RDF*, or is there some way to indicate which mode 
applies? The latter is potentially problematic for databases; or they 
end up storing whatever and the application has to understand both 
modes, and that seems like failing to address the issue.

> 
> Thomas
> 
> 
>>     Andy
>>
>>
>>> If we agree that Persons and Statements are different kinds of animals, then we can not possibly interpret this graph as "Alice asserting some statement about...".
>>> Later you wrote:
>>>     This needs disambiguation even in plain english not to speak of RDF.
>>>     RDF is not meant and can’t reasonably be expected to express such
>>>     subtleties.
>>> The primary goal of *formal* languages is precisely to be less ambiguous than natural languages. So yes, RDF should be expected to carry less ambiguity than english prose.
>>>     If I wasn’t too lazy I would now check the RDF specifications as I’m
>>>     sure there’s citable proof that blank nodes in RDF are not a means
>>>     to speak about the sheer existence of certain things
>>> RDF Semantics [1], section 5.1 <https://www.w3.org/TR/rdf11-mt/#blank-nodes>, first sentence:  "Blank nodes are treated as simply indicating the existence of a thing".
>>> You couldn't have chosen better words -- except for "not", obviously ;-)
>>>    pa
>>> [1] https://www.w3.org/TR/rdf11-mt/
>>> [2] https://www.w3.org/TR/rdf11-concepts/
>>> [3] https://en.wikipedia.org/wiki/HTTPRange-14
>>>     Does a graph name name or label the graph? Is a statement
>>>     reification referring to a type of statement or a statement token of
>>>     that type? Cowpath defaults and "social meaning" rule at every level
>>>     of identification in RDF and in general I think that in its
>>>     alternativelessness this is a big problem and needs to be solved.
>>>     However I think it needs to be solved through defaults and on demand
>>>     disambiguation - late binding if you will - , not through arbitrary
>>>     descents into semantic rat holes and tedious disambiguation when it
>>>     isn't needed.
>>>     The default mode of the Semantic Web is straightforward exchange of
>>>     facts like "there's a person named Bob of age 23 and alice, also a
>>>     person, claims something about him". A blank node is nothing more
>>>     than a structural helper, the semantic equivalent of a throw away
>>>     plastic bag (I know we don't do that anymore although we have
>>>     conferences at the end of the world, 11800 kg of CO2 from northern
>>>     germany by plane). "There exists..." yadayada. The most intelligent
>>>     thing you can teach your dumb machine to do is treat these facts at
>>>     face value. Further subtleties may be encoded in the vocabulary:
>>>     some specific Alice vocabulary may define alice:claims to have
>>>     domain slaveholder and range slave as Olaf suggested.
>>>     If you want to speak about the fact as it has been stated
>>>     (provenance etc) you need to reify it. Concise syntax and
>>>     appropriate semantics would help as I like to keep repeating. To RDF
>>>     the person and the fact are both just subjects of discourse,
>>>     resources referenced by URIs. To RDF there is no difference here.
>>>     Which has the fine property of making endlessly nested
>>>     meta-meta-meta-... constructs possible - which is probably the least
>>>     we would need to model this conversation in RDF.
>>>     But maybe you are not even suggesting to distinguish indication and
>>>     denotation but want to speak about something like the factuality of
>>>     that fact, no matter if or by whom it was stated? I wouldn't be able
>>>     to follow you there - too dangerous... ;-)
>>>      >Second, RDF semantics is defined under the Open World Assumption:
>>>      >whatever
>>>      >you know about a given node, you have to assume that there *may* be
>>>      >other
>>>      >triples about that node that you are not aware of. So by the "sum of
>>>      >its
>>>      >attributes", do you mean "everything that is true about the node,
>>>      >whether
>>>      >you know it or not" (which would be consistent with the OWA), or
>>>      >"everything that is stated in a given graph" (which would seem more
>>>      >appropriate for representing a given claim by Alice)?
>>>     That's an orthogonal question. Named Graphs could help if their
>>>     semantics are properly specified (easy, just define an appropriate
>>>     vocabulary, and use the RDF extension mechanism to define a suitable
>>>     semantics).
>>>     Thomas
>>>      >  pa
>>>      >
>>>      >On Fri, 30 Aug 2019 at 11:46, thomas lörtsch <tl@rat.io> wrote:
>>>      >
>>>      >>
>>>      >>
>>>      >> > On 30. Aug 2019, at 10:29, Olaf Hartig <olaf.hartig@liu.se> wrote:
>>>      >> >
>>>      >> > Hi Thomas,
>>>      >> >
>>>      >> > On torsdag 29 augusti 2019 kl. 10:18:48 CEST thomas lörtsch wrote:
>>>      >> >> [...]
>>>      >> >> Ah, you are really taking all those little ’that’ words very
>>>      >serious ;-)
>>>      >> >
>>>      >> > I better do; we are talking about semantics here ;-)
>>>      >> >
>>>      >> >> [...] your translation, "a person Bob who is of age 23", captures
>>>      >the
>>>      >> sense
>>>      >> >> of factualness even better.
>>>      >> >
>>>      >> > Good.
>>>      >> >
>>>      >> >>> Therefore, all the triples together seem to say that a person
>>>      >named
>>>      >> >>> Alice claims a person named Bob who is of age 23. My initial
>>>      >example
>>>      >> >>> said something else, namely: person Alice claims *that* person
>>>      >Bob is
>>>      >> of
>>>      >> >>> age 23.
>>>      >> >>
>>>      >> >> Hmm, that *that* again ;-) So you mean the difference between
>>>      >Alice
>>>      >> claiming
>>>      >> >> that there exists a "Bob, person, aged 23" and Alice claiming
>>>     that
>>>      >some
>>>      >> >> already introduced and agreed upon person Bob is "aged 23"?
>>>      >> >
>>>      >> > While the fact that the person Bob has already been introduced and
>>>      >> agreed upon
>>>      >> > is necessary to make single-statement claims about this person,
>>>      >this is
>>>      >> > secondary to the main point I keep on trying to make. Again, in my
>>>      >> opinion,
>>>      >> > Kingsley's data cannot be interpreted as you do in your sentence
>>>      >above
>>>      >> (person
>>>      >> > Alice claims "that there exists" a person Bob of age 23). In
>>>      >contrast,
>>>      >> since
>>>      >> > bnode _:b2 represents 'a person Bob of age 23', the :claims triple
>>>      >with
>>>      >> _:b2
>>>      >> > in the object position is to be interpreted as: person Alice
>>>     claims
>>>      >the
>>>      >> person
>>>      >> > Bob (rather than claiming the existence of such a person). Hence,
>>>      >the
>>>      >> verb
>>>      >> > "claim" here is used with its meaning of demanding ownership
>>>      >instead of
>>>      >> its
>>>      >> > meaning of stating (potentially false) facts. See:
>>>      >> >
>>>      >> > https://en.wiktionary.org/wiki/claim#Verb
>>>      >>
>>>      >> Well, so "claims" has more meanings than that usually assumed in the
>>>      >> context of semantic web discussions about reification, provenance
>>>      >etc.
>>>      >> However: either Alice claims the existence or ownership of "a person
>>>      >Bob
>>>      >> who is of age 23". I don’t see what difference this makes with
>>>      >respect to
>>>      >> our discussion about RDF* etc. Anyway I wouldn't want the semantics
>>>      >of some
>>>      >> property to have such wide ranging consequences on the meaning of
>>>      >basic
>>>      >> structural constructs like a blank node (… maybe a bit too bold a
>>>      >statement
>>>      >> - I hope there aren’t any non-marginal counter examples proving me
>>>      >wrong).
>>>      >>
>>>      >> > If you would only want to capture that person Alice claims "that
>>>      >there
>>>      >> exists"
>>>      >> > a person Bob of age 23, then the object of the :claims triple
>>>      >cannot be
>>>      >> the
>>>      >> > bnode _:b2, but instead the object needs to be a graph that
>>>      >contains the
>>>      >> three
>>>      >> > triples that have bnode _:b2 in their subject position.
>>>      >>
>>>      >> That’s quite strong as a requirement. As I said before: what else
>>>      >could a
>>>      >> blank node possibly mean than the sum of its attributes? Can you
>>>     give
>>>      >a
>>>      >> convincing answer to that question? And with convincing I mean
>>>      >"obvious",
>>>      >> "intuitive", "in wide use". One might want to talk about the blank
>>>      >node
>>>      >> *itself* but that is really a corner case and there are much wider
>>>      >gaps in
>>>      >> the identification semantics of the Semantic Web that would need a
>>>      >fix
>>>      >> first.
>>>      >> I think the other way round - you have to be specific if you want to
>>>      >> address the triple, otherwise you address all that’s said about the
>>>      >blank
>>>      >> node - is practicable and unsurprising. We have to find idioms that
>>>      >are
>>>      >> easy to use and have intuitive defaults. There is never an end to
>>>      >even more
>>>      >> precision but that doesn’t scale.
>>>      >> What I would endorse however is rather an 80/20 style approach
>>>     like a
>>>      >> specific property to talk about the blank node itself - sensible
>>>      >defaults,
>>>      >> specific instruments where required. Disambiguating
>>>     identification is
>>>      >also
>>>      >> a case by case problem: identifiers play different roles in
>>>     different
>>>      >> situations. Concise statement attribution could make it feasible to
>>>      >> disambiguate those roles when necessary. That would be great.
>>>      >>
>>>      >>
>>>      >> Thomas
>>>      >>
>>>      >>
>>>      >>
>>>      >> >> Technically that is the difference between talking about a set
>>>      >> >> of triples with the same subject (lines 4-6 in the above example)
>>>      >and a
>>>      >> >> single triple (line 6), right?
>>>      >> >
>>>      >> > Almost. See above.
>>>      >> >
>>>      >> > Best,
>>>      >> > Olaf
>>>      >> >
>>>      >> >
>>>      >> >>>> [...]
>>>      >> >>>> However I would also like to stress that such modelling is not
>>>      >> >>>> meta-modelling and it is not equivalent to a layer of
>>>      >abstraction
>>>      >> >>>> (vulgo taking one step back) like reification or named graphs.
>>>      >> >>>
>>>      >> >>> Exactly! That's the point I am trying to make with this example.
>>>      >To
>>>      >> >>> capture the statement that "Alice claims *that* Bob is of age
>>>      >23," we
>>>      >> >>> need a form of meta-modeling.
>>>      >> >>
>>>      >> >> And I just wanted to express my endorsement of your position in
>>>      >that
>>>      >> >> respect.
>>>      >> >>>> [...]
>>>      >> >>>> Well, as I’m on it, a shameless plug: I recently posted an
>>>      >unhealthily
>>>      >> >>>> long mail to this list. That mail started with [...] I wonder
>>>      >if
>>>      >> anybody
>>>      >> >>>> bothered to read that sermon.
>>>      >> >>>
>>>      >> >>> I did ;-)
>>>      >> >>
>>>      >> >> Great! :-)
>>>      >> >>
>>>      >> >>> ...and I was planning to respond to it. However, since I am on
>>>      >this
>>>      >> list
>>>      >> >>> here in my spare time, I couldn't get to it right away.
>>>      >> >>
>>>      >> >> No pressure! ;-)
>>>      >> >>
>>>      >> >> Thomas
>>>      >> >>
>>>      >> >>> Olaf
>>>      >> >
>>>      >> >
>>>      >>
>>>      >>
>>>      >>
>>
>
Received on Thursday, 5 September 2019 09:16:39 UTC