Re: RDF Semantics - Intuitive summary needs to be scoped to interpretations (ISSUE-149) from David Booth on 2013-10-20 (www-archive@w3.org from October 2013)

From: David Booth <david@dbooth.org>
Date: Sat, 19 Oct 2013 23:56:29 -0400
To: Pat Hayes <phayes@ihmc.us>
CC: Antoine Zimmermann <antoine.zimmermann@emse.fr>, www-archive <www-archive@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Ivan Herman <ivan@w3.org>, Sandro Hawke <sandro@w3.org>
Message-ID: <5263546D.5040207@dbooth.org>

Hi Pat,

On 10/10/2013 02:05 AM, Pat Hayes wrote:
> [ . . . ]
> But, as I say, I now think that this idea, of trying to connect the
> formal notions to an intuition, was probably a mistake in this
> document and went against the spirit of a WG decision.

I don't know what the WG decision was, but formal specifications of any 
significant size almost always benefit from helpful informal guidance 
that gives insight about how they are intended to work.  This is 
analogous to the role of good comments in code when writing software. 
But I'm okay with it being deleted if you want.

>
> Pat
>
> <<Other in-line responses, below, are part of our continuing, um,
> debate, and are aside from discussions of the RDF documents.>>
>
>>> On Oct 4, 2013, at 10:51 AM, Peter Patel-Schneider wrote:
>>>
>>>> In my opinion the divergence boils down to Pat believing that
>>>> this informative section should be more informal and David
>>>> believing that it has to be more formal.
>>
>> I don't exactly think it has to be more formal, but just that: (a)
>> it needs to mention interpretations, because that concept is so
>> central to the formal semantics; and (b) the statement about the
>> conditions under which a graph is true *needs* to be scoped to an
>> interpretation to make any sense at all.
>
> That is exactly what it should *not* be, in order to convey the point
> it was intended to be conveying.

The point to which you allude appears to reflect a particular intuition, 
but apparently I don't agree that that is the only valid intuition that 
is supported by the mathematics.  More on this below.

>
>> If one talks about a graph being true, without mentioning an
>> interpretation, IMO the most sensible way to understand such a
>> statement is to take it as meaning that the graph is *satisfiable*
>
> No, that is not the right way to understand it. Truth and
> satisfiability are not the same thing at all. (That pigs can fly, is
> satisfiable.) To say that a graph (or any other assertion or
> sentence) is true, is to say that when it is interpreted *in the
> actual world*, its truth-value is true.

There are multiple problems I have with that last sentence.  First of 
all, AFAICT the formal semantics makes no claim whatsoever about the 
real world: the semantics leaves it up to the user to choose 
interpretations.  Second, the phrase "*the* actual world" betrays an 
assumption that there exists only *one* valid interpretation -- that 
"single-interpretation assumption", as I've been calling it -- whereas 
AFAICT the formal semantics makes no such assumption.

> That is the pre-theoretic,
> intuitive, notion. Someone says something, you figure out *what* they
> are saying, and you judge whether it - what they are saying - is
> true. Nothing in that account mentions interpretations. It does
> mention, implicitly, the truth conditions (section 5) and we could
> say that it *presumes* an interpretation that the speaker and hearer
> have in common.

Ah, now we're starting to get closer to the heart of the issue.  I'll 
come back to this below.

> And that is where the naivitée of this naive account
> is displayed, of course, that implicit assumption of a common
> interpretation; because when we have the kind of distancing between
> publisher and reader that is inevitable on the semantic web, and
> communicate using IRIs which have no assumed common background of
> linguistic meaning, we cannot presume this common shared
> interpretation, this "common ground"
> (http://semantics.uchicago.edu/kennedy/classes/f07/pragmatics/stalnaker02.pdf,
> or
> http://plato.stanford.edu/entries/discourse-representation-theory/.)

Agreed.

> So this is where the interpretation idea comes in, because we have
> to, as it were, survey the possible things you might mean when you
> publish some RDF. We don't know what world you are talking in, so we
> have to consider all *possible* worlds. Which is what interpretations
> are (the thin, pale shadows of formalizations of).

Yes, excellent so far.

>
> Long - very long - story short, the analysis of real linguistic
> communication - including Web communication - between cognitive
> agents (people, mostly) involves model-theoretic ideas, but it also
> involves a *lot* more. RDF, indeed the entire semantic web, is a tiny
> part of this larger picture, and can be fitted into it in one small
> corner. But in order to be useful, it does need to be fitted into it
> accurately.
>
>> : that there *exists* an interpretation under which the graph is
>> true, and hence we can take the graph as being true. (Conversely,
>> if the graph is not satisfiable then we cannot take it as being
>> true.)  OTOH, such a statement could be taken to mean that the
>> graph is true **in some unspecified interpretation**
>
> The one that is presumed when we talk (pre-theoretically) about what
> people are referring to when they say "Everest" (for example), and
> when we make judgements of the truth or otherwise of their utterances
> in the actual, real, world we are all talking about. Yes, exactly.

First of all, I really like this explicit distinction between the 
pre-theoretic or real world notion of truth, and the truth value that is 
assigned to an RDF graph by the formulas in the formal semantics.  That 
helps the discussion.

With that on the table, although real world truth should be a *goal* -- 
just as one resource per URI should be a goal -- I don't believe that it 
is the right criterion for making engineering decisions in the semantic 
web world.  Rather, *usefulness* is the more relevant criterion by which 
we should evaluate our engineering trade-offs when designing the 
semantic web.  This will take more explanation, so I'll attempt to 
provide that.  But with respect to interpretations, this translates into 
the notion that a more agnostic view toward interpretations should be 
taken, rather than making the single-interpretation assumption that 
always attempts to understand every RDF utterance in terms of a single 
notion of pre-theoretic real world truth.

Now to attempt to explain.  First of all, note that there is nothing 
whatsoever in the mathematics that limits us to a single interpretation: 
the mathematics works perfectly fine without modification whether we 
eventually talk about one or more than one interpretation.   I've 
pointed out several times that it is perfectly possible to have two 
interpretations I1 and I2 and two graphs G1 and G2, such that 
I1(G1)=true and I2(G2)=true (in the non-pre-theoretic sense), whether or 
not these graphs share some of the same URIs.  "So, what of it?" you may 
ask.  I'll get to that.
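
To make the I1/I2 point concrete, here is a minimal sketch in Python.  It 
is only an illustration under simplifying assumptions, not the RDF 
Semantics itself: it handles ground triples only (no literals, no blank 
nodes), and the IRIs, resource names and the Interpretation class are 
all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    """Toy interpretation: ground triples only, no literals or blank nodes."""
    denotes: dict    # IRI -> resource
    extension: dict  # property resource -> set of (subject, object) pairs

    def satisfies(self, graph):
        """Return True if every triple in the graph is true under this interpretation."""
        return all(
            (self.denotes[s], self.denotes[o])
            in self.extension.get(self.denotes[p], set())
            for s, p, o in graph
        )


# Hypothetical graphs that share the IRI ex:apple.
G1 = {("ex:apple", "ex:isA", "ex:Fruit")}
G2 = {("ex:apple", "ex:isA", "ex:Company")}

# I1 maps ex:apple to a fruit; I2 maps the same IRI to a company.
I1 = Interpretation(
    denotes={"ex:apple": "the fruit", "ex:isA": "is-a",
             "ex:Fruit": "fruits", "ex:Company": "companies"},
    extension={"is-a": {("the fruit", "fruits")}},
)
I2 = Interpretation(
    denotes={"ex:apple": "the company", "ex:isA": "is-a",
             "ex:Fruit": "fruits", "ex:Company": "companies"},
    extension={"is-a": {("the company", "companies")}},
)

print(I1.satisfies(G1), I2.satisfies(G2))  # each graph under its own interpretation
print(I1.satisfies(G2), I2.satisfies(G1))  # each graph under the other interpretation
```

In this sketch each graph is true under its own interpretation and false 
under the other, even though both graphs use the same IRI.  Nothing in 
the truth check itself prefers one interpretation over the other.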

The second point to observe is that different graph authors have 
different interpretations in mind when they write their graphs.  This 
can be either conscious or unconscious.  Although I agree that there is 
a single notion of pre-theoretic truth in the real world, different 
people have different -- and sometimes *very* different -- ideas of what 
that single truth is.  Correspondingly, they also make different 
assumptions about the resource to which a given URI maps, within those 
interpretations.  Again you may object and assert that if they are 
making different assumptions then one or more of them should be 
considered wrong.  But again, as I've tried to point out, such a 
requirement is not generally *possible* to obey.

Assuming that a URI owner has the right to say what resource his/her 
URI denotes (as described in the Web Architecture), there are several 
reasons why different well-intentioned URI users may make different 
assumptions about the identity of a URI's resource:

1. The URI owner may not know or may not understand a particular 
resource distinction that matters to some user of that URI.  We cannot 
expect every URI owner to be omniscient about his/her URI's resource.

2. The URI owner may not care about a particular distinction.  We cannot
expect the URI owner to have the same concerns as all users of the URI.

3. The URI owner may *intend* the URI definition to be ambiguous to some
degree, so that the URI can be used in a wider variety of ways.

4. The URI owner may not be reachable to clarify a particular point of 
ambiguity.

5. The URI owner may want to keep the resource definition simple,
without cluttering it up with distinctions that 99% of the
URI's target users would not care about.  Complexity has a cost.

6. The URI owner may not wish to expend the resources necessary
to figure out what finer distinctions might be made.

7. When a URI definition is provided in a machine processable form such 
as an RDF graph -- and that of course is the point of the Semantic Web 
-- it is generally not possible to make that definition unambiguous.

So the reality is that different authors *do* make different assumptions 
about the resource denoted by a particular URI.  This is very neatly 
captured by the notion that different authors have different sets of 
intended interpretations in mind when they write their graphs.  In other 
words, when an author writes an RDF graph, the author's intended meaning 
of that graph does *not* generally boil down to a *single* 
interpretation, but an ambiguous *set* of interpretations, all of which 
are licensed interpretations falling within the author's intent.

This leads to the question of what exactly are the author's intended 
interpretations for a given graph.  That of course may be hard to know 
-- just as it may be hard to know what *single* interpretation the 
author intended if one assumes that the author only intended a single 
interpretation.  But given that the author could (in principle at least) 
if desired supply whatever constraints he/she chooses as triples within 
the graph, a reasonable assumption is that the intended interpretations 
are the satisfying interpretations of the ontological closure of that 
graph.  (By ontological closure I mean the union of the graph with the 
transitive closure of the URI definitions for the URIs within the 
graph.)  This makes for a very "what you see is what you get" notion of 
the intended interpretations, and I will note that it has the further 
advantages of: (a) removing nearly all "then a miracle occurs" steps
http://blog.stackoverflow.com/wp-content/uploads/then-a-miracle-occurs-cartoon.png
in the determination of interpretations; and (b) being completely 
aligned with the intent of the Semantic Web of facilitating machine 
processing.
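
The ontological closure described above can be sketched roughly as 
follows.  This is an illustration only: the `definitions` table is a 
hypothetical stand-in for whatever triples one would obtain by looking up 
each URI's definition, and the function names are my own.

```python
def iris_in(graph):
    """Collect every term used in any position of any triple."""
    return {term for triple in graph for term in triple}


def ontological_closure(graph, definitions):
    """Union of the graph with the transitive closure of its URI definitions.

    `definitions` maps an IRI to the set of triples that define it
    (a hypothetical stand-in for dereferencing the IRI).
    """
    closure = set(graph)
    pending = list(iris_in(graph))
    seen = set()
    while pending:
        iri = pending.pop()
        if iri in seen:
            continue
        seen.add(iri)
        new_triples = definitions.get(iri, set()) - closure
        closure |= new_triples
        # Definitions may mention further IRIs, whose definitions are followed too.
        pending.extend(iris_in(new_triples) - seen)
    return closure


# Hypothetical example data.
definitions = {
    "ex:a": {("ex:a", "ex:p", "ex:b")},
    "ex:b": {("ex:b", "ex:q", "ex:c")},
    "ex:z": {("ex:z", "ex:p", "ex:z")},  # never reached from the graph below
}
graph = {("ex:a", "ex:r", "ex:a")}
closure = ontological_closure(graph, definitions)
```

Here the closure picks up the definition of ex:a, then the definition of 
ex:b that it mentions, and stops; the unrelated definition of ex:z is 
not pulled in.  The intended interpretations would then be the ones that 
satisfy every triple in `closure`.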

In other words, to my mind the notion of interpretations provided by the 
RDF Semantics aligns very well with: (a) the inescapable ambiguity of 
resource identity; and (b) the fact that people do *not* have the same 
view of the world, nor do their software applications have the same view 
of the world.

>
>> .  But that would be a very bad way to write
>
> Try telling that to linguists. Or to literary theorists, or
> historians, or philosophers of language, or indeed pretty much anyone
> who uses language professionally. Not only is this not a bad way to
> write, its the ONLY way to write if we are trying to anchor model
> theory in an intuitive description of how communication actually
> happens.

I can't comment on literary theory or such, but to my mind, in formal 
semantics, variables should *always* be bound.

> Except, calling the actual world "unspecified" seems a
> little strange.

Amusing.  :)  But I don't view the actual world as being very relevant 
to the semantics, perhaps because I have a different intuitive view of 
the semantics than you do, as I tried to explain above.

>
>> , because the interpretation under which the graph is true would be
>> an implicit unbound variable, which as we all know is a big no-no.
>
> It is implicit, yes, but I don't know what kind of assumptions you
> are appealing to by calling this a big no-no. Contexts are usually
> implicit, right?

Yes, but we try hard to make them explicit, especially in formal specs.

>
>> Instead, the problem can be easily solved by adding "under a given
>> interpretation" to the sentence.  (Of course, the notion of an
>> interpretation should first be explained.  But that is a different
>> omission that should be addressed anyway.)
>>
>> And regarding this:
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2013Oct/0079.html
>>
>>
>> [[
>> I know, from extensive off-line email discussions with David, that
>> he does not properly understand the intuitive foundations of
>> semantics in any case, so I am not inclined to accept his rather
>> condescending advice. ]] (Wow, you're calling *me* condescending,
>> after repeatedly telling me to "go read a book"???)  That's both:
>> (a) quite a projection; and (b) *really* unfair and unhelpful.
>> Fortunately I'm thick skinned and I have a good sense of humor.
>> :)
>
> Well, you weren't meant to read that, obviously. But my dear fellow,
> *have* you read the books, in fact?

I've read what I could find on the web on model theory, but not books. 
The best resource I've found has been the Stanford Encyclopedia of 
Philosophy, which I like a lot:
http://plato.stanford.edu/
For example, here is their entry on model theory, which corresponds 
beautifully with your explanation:
http://plato.stanford.edu/entries/model-theory/
Incidentally, I wrote to the author of that particular article for some 
minor clarification, and he was quite nice and confirmed a particular 
point of understanding about interpretations.  If there are other 
references on the web that you'd suggest, I would certainly be 
interested in looking at them.  But thus far, all that I have read has 
confirmed the understanding that I initially got from your writings, 
which I've found most informative, BTW.

> Is it really condescending for me
> to suggest that you might want to read up something a little more
> extensive than a few paragraphs that I wrote about RDF,

I certainly have done so, and read it quite carefully too.

> before
> claiming that you have discovered a new way to understand model
> theory, or setting out to correct my misunderstanding of it,

NEVER have I made any such claim.

> or
> telling me that my perspective is too limited? I don't mean to pull
> rank on you here, but I have been studying this stuff now, as well as
> teaching it, for about 40 years. For a few years, I invented new
> model theories for a living. God knows there are a lot of things I
> don't fully understand, but model-theoretic semantics is one topic I
> really do have pretty thoroughly grokked.

Okay, stop right there.  Clearly you have grossly misunderstood my 
intent, as I have never once questioned your understanding of model 
theory, nor have I made any claims of discovering a new way to 
understand model theory or any other such grand claims.  All I have done 
is try to point out that, **based on the mathematics given**, there is 
another valid way to think about the RDF Semantics.  Furthermore, AFAICT 
it is a *useful* way to think about the RDF Semantics, as it helps 
explain real world use of RDF in a way that is not explained under the 
single-interpretation assumption.  It is not in any way intended to 
extend model theory or make any grand claims about any new discoveries. 
It is just a simple and straightforward way to use the semantic 
formulas defined by the RDF Semantics that perhaps uses a slightly 
different intuition of what they mean.  Formulas are formulas and can be 
viewed in different ways.  Although many people may think intuitively of 
E=mc^2 as meaning that matter can be converted to energy, it can also be 
just as well taken to mean that energy can be converted to matter.

I hope this helps to clarify my intent.

Thanks,
David
Received on Sunday, 20 October 2013 03:56:58 UTC