Re: RDF* semantics from thomas lörtsch on 2019-09-02 (public-rdf-star@w3.org from September 2019)

From: thomas lörtsch <tl@rat.io>
Date: Mon, 2 Sep 2019 13:01:10 +0200
To: Pierre-Antoine Champin <pierre-antoine.champin@univ-lyon1.fr>
Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-Id: <0814BD24-5518-4862-B41E-46D13D431915@rat.io>
> On 2. Sep 2019, at 11:15, Pierre-Antoine Champin <pierre-antoine.champin@univ-lyon1.fr> wrote:
> 
> Dear Thomas,
> 
> I have to strongly disagree with you.
> 
> On Fri, 30 Aug 2019 at 18:16, thomas lörtsch <tl@rat.io> wrote:
>> We are on the Semantic Web here which in general is not aiming at such subtleties. The semantics of identification in RDF are vague at best. Does https://paris.com refer to the website or the city?
> 
> The semantics of identification is *not* vague. It is precisely defined in the RDF Semantics [1] specification. You can also refer to the RDF Concepts [2] spec, section 1.3. It clearly talks about "the resource denoted by an IRI" (note the *singular* use of "resource").

But you have heard of "social meaning", right?

> I'm not saying that it is always easy to decide what exactly an IRI identifies. Especially with HTTP IRIs, which intuitively identify web resources (i.e. digital objects) rather than persons or cities. There has been a long controversy [3] in the community about that, but it has been settled. For example, http://dbpedia.org/resource/Lyon identifies a city, while http://dbpedia.org/page/Lyon identifies the HTML document describing that city.

I wasn’t aware of that settlement. I don’t think it ends the debate about how to disambiguate denotation from indication on the Semantic Web in general. Note also that further scrutiny (or zealotry?) could demand disambiguating the web page from the web site, the historic city center from some modern administrative area, and so on. Note also that the other examples I gave are still unresolved. So the semantics of RDF are vague in contrast to the subtle differentiations that natural language can express, and they always will be. It’s a formalism - that brings abstractions with it, and abstractions cut off detail and subtle differentiations. That’s the whole point of it. Re-introducing those subtleties carries a high risk of being arbitrary and of limiting the exchange of information rather than facilitating it. Introducing limits may of course be useful, but in the case of blank nodes I don’t think it is. IMO they should be handled as liberally as possible (some *added* on-demand disambiguation notwithstanding, which is what I’m working on).

> However, in the example at hand, there is no ambiguity:
> 
> _:b1 rdf:type :Person.
> _:b1 :name "Alice".
> _:b1 :asserts _:b2.
> _:b2 rdf:type :Person.
> _:b2 :name "Bob".
> 
> the 4th triple says it clearly: _:b2 identifies something of type Person. If we agree that Persons and Statements are different kinds of animals, then we cannot possibly interpret this graph as "Alice asserting some statement about…".

Well, I think we could interpret the identification semantics of _:b2 as either denoting some statements or as indicating what those statements describe (or of course its own sheer existence, but let’s try not to get into any more ratholes). Like any other URI that has not been disambiguated through the DBpedia settlement mechanism that you refer to above, this one relies on "social meaning" and out-of-band agreement, maybe through convention or intuition about "naturalness", or as specified by the range of the property used.

In the example at hand I read:
"Alice asserts that a person named Bob exists"
or, to avoid the discussion about ’that’ that I had with Olaf:
"Alice asserts: a person named Bob exists"
or, to avoid the connotation of serious logic:
"Alice asserts a person Bob".
That third example however doesn’t sound right in English. There is a mismatch between English and RDF - which is rather unsurprising as formalisms are languages of their own and tend to have their own subtle ways.
Now, what do *you* want to speak about - the person Bob or Alice’s statement? Both are possible.
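
To make the two readings concrete, here is a minimal sketch in plain Python. It is only an illustration, not anything defined by the RDF specifications: the graph is the five-triple example quoted above held as tuples, and the helper names `asserted_nodes` and `description_of` are hypothetical. One reading takes the object of :asserts to be the thing itself; the other takes it to stand for the triples that describe that thing.

```python
# The example graph from the quoted message, as (subject, predicate, object) tuples.
GRAPH = [
    ("_:b1", "rdf:type", ":Person"),
    ("_:b1", ":name", "Alice"),
    ("_:b1", ":asserts", "_:b2"),
    ("_:b2", "rdf:type", ":Person"),
    ("_:b2", ":name", "Bob"),
]


def asserted_nodes(graph, predicate=":asserts"):
    """Return the objects of all triples using `predicate` (hypothetical helper)."""
    return [o for _, p, o in graph if p == predicate]


def description_of(graph, node):
    """Return every triple in `graph` that has `node` as its subject (hypothetical helper)."""
    return [t for t in graph if t[0] == node]


for node in asserted_nodes(GRAPH):
    # Reading 1: the object is the thing the blank node stands for.
    print("thing reading:     ", node)
    # Reading 2: the object stands for what the graph says about that node.
    print("statements reading:", description_of(GRAPH, node))
```

Nothing in the triples themselves selects one reading over the other; the choice has to come from outside the graph, for example from the declared range of the property.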

> Later you wrote:
>> This needs disambiguation even in plain english not to speak of RDF. RDF is not meant and can’t reasonably be expected to express such subtleties.
> 
> The primary goal of *formal* languages is precisely to be less ambiguous than natural languages. So yes, RDF should be expected to carry less ambiguity than english prose.

Isn’t this a rather naive take on the nature of formalisms? I guess we are all aware of the high hopes in logic - Leibniz’s "Calculemus!" - and the modest results so far. So what should or might be expected is not always what we get. RDF however is not even aiming to be AI; it is a toolset that, for example, relies on shared understanding about the meaning of the vocabulary used. It’s a pity that one of its basic constructs, the blank node, has led to so much confusion and discontent; however, it is also not really surprising. RDF tries to simplify things to make them amenable to ubiquitous use at unprecedented scale. With such a design goal some details inevitably get lost in translation.
We could (try to) enforce the very strict semantics that you and Olaf demand but I don’t think that’s a good idea. This is a Hydra that grows 2 new heads for every one you cut off. If the issue is not pressing, better leave it alone. I rather advocate an approach that allows expressing such specifics on demand: if I need to make sure that some blank node is understood in a certain way, I’d like to have some syntactic sugar and vocabulary at my disposal to do so.

>> If I wasn’t too lazy I would now check the RDF specifications as I’m sure there’s citable proof that blank nodes in RDF are not a means to speak about the sheer existence of certain things
>> 
> RDF Semantics [1],  section 5.1, first sentence:  "Blank nodes are treated as simply indicating the existence of a thing".
> You couldn't have chosen better words -- except for "not", obviously ;-)

No, because you are missing the most important word in that sentence: "simply" :-) This precisely tries to convey the intuition that one shouldn’t read too much into blank nodes. RDF is much too simple, simplistic even, to carry the metaphysical load that you and Olaf want it to carry. Blank nodes are used as existential quantifiers with all the logical consequences that come with that, but they are also used as mere anchor points to collect a few properties of something that is not meant to get its own URI. A good example is
 :Bob :hasAdress [ :street "abc-street 1" ; :city "Hometown" ; :zip 12345 ] .
Usually nobody would want to get philosophical about that blank node and the sort of existential questions that can arise from it. In other situations blank nodes are used exactly for the purpose of discussing questions of existence. It depends. I’d say Postel’s law - "Be liberal in what you accept…" - applies here in slightly modified form: "Be liberal in what you expect a blank node to express". Or, in other words: curb your enthusiasm ;-)
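
As a rough illustration of the "anchor point" use, the sketch below expands the bracketed address from the example above into plain triples, in Python. It assumes nothing beyond the standard library; `fresh_bnode` and `expand_property_list` are hypothetical names, and the `_:bN` labelling scheme is an arbitrary choice, since blank node labels carry no meaning outside the graph they occur in.

```python
from itertools import count

# Counter for generating graph-local blank node labels (assumed scheme).
_bnode_ids = count(1)


def fresh_bnode():
    """Return a new blank node label; the label itself carries no meaning."""
    return f"_:b{next(_bnode_ids)}"


def expand_property_list(subject, predicate, properties):
    """Expand `subject predicate [ p1 o1 ; p2 o2 ]` into plain triples.

    One fresh blank node becomes the object of the outer triple and the
    subject of every inner (predicate, object) pair.
    """
    node = fresh_bnode()
    triples = [(subject, predicate, node)]
    triples.extend((node, p, o) for p, o in properties)
    return triples


address = expand_property_list(
    ":Bob",
    ":hasAdress",  # spelled as in the example above
    [(":street", "abc-street 1"), (":city", "Hometown"), (":zip", 12345)],
)

for triple in address:
    print(triple)
```

The blank node here does nothing except tie the three address properties to :Bob, which is the structural-helper role described above.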

Olaf’s initial critique of Kingsley’s proposal already showed the difficulties that arise when we read too much into RDF structures: he interpreted the blank node in the object position differently from the blank node in the subject position of Kingsley’s example. It only got worse from there when he drew a conclusion about the range of :claims - the slightly bizarre ownership interpretation - from the syntactic structure of the object alone. This is precisely what I criticize: RDF doesn’t support such subtleties. It has blank nodes, reification of statement types (but not tokens) and, in a hackish way, also graphs. How much expressivity can you expect from that on its own? Of course you can model everything with the help of suitable classes and properties, but in and of itself RDF is very limited and can barely express some structure in the sea of triples. Relying on that structure to transport a lot of meaning leads into dangerous territory: unresolvable disputes about the semantics of this, that and "but no I meant *that*". That is an empirical fact that you can gather from countless threads like this one on Semantic Web related mailing lists. It's the exact opposite of the easy interoperability that the Semantic Web is meant to enable.

Thomas

>   pa
> 
> [1] https://www.w3.org/TR/rdf11-mt/
> [2] https://www.w3.org/TR/rdf11-concepts/
> [3] https://en.wikipedia.org/wiki/HTTPRange-14
>  
> Does a graph name name or label the graph? Is a statement reification referring to a type of statement or a statement token of that type? Cowpath defaults and "social meaning" rule at every level of identification in RDF and in general I think that in its alternativelessness this is a big problem and needs to be solved. However I think it needs to be solved through defaults and on demand disambiguation - late binding if you will - , not through arbitrary descents into semantic rat holes and tedious disambiguation when it isn't needed. 
> The default mode of the Semantic Web is straightforward exchange of facts like "there's a person named Bob of age 23 and alice, also a person, claims something about him". A blank node is nothing more than a structural helper, the semantic equivalent of a throw away plastic bag (I know we don't do that anymore although we have conferences at the end of the world, 11800 kg of CO2 from northern germany by plane). "There exists..." yadayada. The most intelligent thing you can teach your dumb machine to do is treat these facts at face value. Further subtleties may be encoded in the vocabulary: some specific Alice vocabulary may define alice:claims to have domain slaveholder and range slave as Olaf suggested. 
> If you want to speak about the fact as it has been stated (provenance etc) you need to reify it. Concise syntax and appropriate semantics would help as I like to keep repeating. To RDF the person and the fact are both just subjects of discourse, resources referenced by URIs. To RDF there is no difference here. Which has the fine property of making endlessly nested meta-meta-meta-... constructs possible - which is probably the least we would need to model this conversation in RDF. 
> But maybe you are not even suggesting to distinguish indication and denotation but want to speak about something like the factuality of that fact, no matter if or by whom it was stated? I wouldn't be able to follow you there - too dangerous... ;-) 
> 
> >Second, RDF semantics is defined under the Open World Assumption: whatever you know about a given node, you have to assume that there *may* be other triples about that node that you are not aware of. So by the "sum of its attributes", do you mean "everything that is true about the node, whether you know it or not" (which would be consistent with the OWA), or "everything that is stated in a given graph" (which would seem more appropriate for representing a given claim by Alice)?
> 
> 
> That's an orthogonal question. Named Graphs could help if their semantics are properly specified (easy, just define an appropriate vocabulary, and use the RDF extension mechanism to define a suitable semantics). 
> 
> 
> Thomas
> 
> 
> >  pa
> >
> >On Fri, 30 Aug 2019 at 11:46, thomas lörtsch <tl@rat.io> wrote:
> >
> >>
> >>
> >> > On 30. Aug 2019, at 10:29, Olaf Hartig <olaf.hartig@liu.se> wrote:
> >> >
> >> > Hi Thomas,
> >> >
> >> > On torsdag 29 augusti 2019 kl. 10:18:48 CEST thomas lörtsch wrote:
> >> >> [...]
> >> >> Ah, you are really taking all those little ’that’ words very seriously ;-)
> >> >
> >> > I better do; we are talking about semantics here ;-)
> >> >
> >> >> [...] your translation, "a person Bob who is of age 23", captures the sense of factualness even better.
> >> >
> >> > Good.
> >> >
> >> >>> Therefore, all the triples together seem to say that a person named Alice claims a person named Bob who is of age 23. My initial example said something else, namely: person Alice claims *that* person Bob is of age 23.
> >> >>
> >> >> Hmm, that *that* again ;-) So you mean the difference between Alice claiming that there exists a "Bob, person, aged 23" and Alice claiming that some already introduced and agreed upon person Bob is "aged 23"?
> >> >
> >> > While the fact that the person Bob has already been introduced and agreed upon is necessary to make single-statement claims about this person, this is secondary to the main point I keep on trying to make. Again, in my opinion, Kingsley's data cannot be interpreted as you do in your sentence above (person Alice claims "that there exists" a person Bob of age 23). In contrast, since bnode _:b2 represents 'a person Bob of age 23', the :claims triple with _:b2 in the object position is to be interpreted as: person Alice claims the person Bob (rather than claiming the existence of such a person). Hence, the verb "claim" here is used with its meaning of demanding ownership instead of its meaning of stating (potentially false) facts. See:
> >> >
> >> > https://en.wiktionary.org/wiki/claim#Verb
> >>
> >> Well, so "claims" has more meanings than the one usually assumed in the context of Semantic Web discussions about reification, provenance etc. However: either Alice claims the existence or ownership of "a person Bob who is of age 23". I don’t see what difference this makes with respect to our discussion about RDF* etc. Anyway I wouldn't want the semantics of some property to have such wide ranging consequences on the meaning of basic structural constructs like a blank node (… maybe a bit too bold a statement - I hope there aren’t any non-marginal counter examples proving me wrong).
> >>
> >> > If you would only want to capture that person Alice claims "that there exists" a person Bob of age 23, then the object of the :claims triple cannot be the bnode _:b2, but instead the object needs to be a graph that contains the three triples that have bnode _:b2 in their subject position.
> >>
> >> That’s quite strong as a requirement. As I said before: what else could a blank node possibly mean than the sum of its attributes? Can you give a convincing answer to that question? And with convincing I mean "obvious", "intuitive", "in wide use". One might want to talk about the blank node *itself* but that is really a corner case and there are much wider gaps in the identification semantics of the Semantic Web that would need a fix first.
> >> I think the other way round - you have to be specific if you want to address the triple, otherwise you address all that’s said about the blank node - is practicable and unsurprising. We have to find idioms that are easy to use and have intuitive defaults. There is never an end to even more precision but that doesn’t scale.
> >> What I would endorse however is rather an 80/20 style approach like a specific property to talk about the blank node itself - sensible defaults, specific instruments where required. Disambiguating identification is also a case by case problem: identifiers play different roles in different situations. Concise statement attribution could make it feasible to disambiguate those roles when necessary. That would be great.
> >>
> >>
> >> Thomas
> >>
> >>
> >>
> >> >> Technically that is the difference between talking about a set of triples with the same subject (lines 4-6 in the above example) and a single triple (line 6), right?
> >> >
> >> > Almost. See above.
> >> >
> >> > Best,
> >> > Olaf
> >> >
> >> >
> >> >>>> [...]
> >> >>>> However I would also like to stress that such modelling is not meta-modelling and it is not equivalent to a layer of abstraction (vulgo taking one step back) like reification or named graphs.
> >> >>>
> >> >>> Exactly! That's the point I am trying to make with this example. To capture the statement that "Alice claims *that* Bob is of age 23," we need a form of meta-modeling.
> >> >>
> >> >> And I just wanted to express my endorsement of your position in that respect.
> >> >>>> [...]
> >> >>>> Well, as I’m on it, a shameless plug: I recently posted an unhealthily long mail to this list. That mail started with [...] I wonder if anybody bothered to read that sermon.
> >> >>>
> >> >>> I did ;-)
> >> >>
> >> >> Great! :-)
> >> >>
> >> >>> ...and I was planning to respond to it. However, since I am on this list here in my spare time, I couldn't get to it right away.
> >> >>
> >> >> No pressure! ;-)
> >> >>
> >> >> Thomas
> >> >>
> >> >>> Olaf
> >> >
> >> >
> >>
> >>
> >>
Received on Monday, 2 September 2019 11:01:40 UTC