Re: A comment on issue-148 from David Booth on 2013-12-13 (public-rdf-comments@w3.org from December 2013)

From: David Booth <david@dbooth.org>
Date: Fri, 13 Dec 2013 14:04:04 -0500
To: public-rdf-comments <public-rdf-comments@w3.org>
CC: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Message-ID: <52AB5A24.6000907@dbooth.org>
Hi Antoine,

I just noticed your message below.  Since this is
about an issue that I raised, I wanted to respond.

First of all, thanks very much for your thoughtful analysis!
I think it is astute and mostly correct, though I disagree
with some points, which I'll explain below.  I also very much
appreciate your explanations, as I think some of them have
been better than the ones I've managed.

http://lists.w3.org/Archives/Public/public-rdf-wg/2013Oct/0267.html
 > From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
 > Date: Fri, 25 Oct 2013 10:03:26 +0200
 > To: public-rdf-wg@w3.org
 > CC: Pat Hayes <phayes@ihmc.us>
 >
 > Let me feed the debate with my couple of € cents.
 >
 >
 > The debate is on this sentence: "IRIs have global scope:
 > Two different appearances of an IRI denote the same resource."
 >
 > There are various ways of interpreting this, but one may be
 > tempted to say that it is equivalent to: "A given IRI *always*
 > denotes the same thing."
 >
 > Of course, when formulated in this way, it is more subject
 > to arguments.  The word "always" suggests that we are talking
 > about time and changes over time. If this is what we mean by
 > "always", then it is clear that the formal semantics does
 > not give any credit to such claim, as much as it does not
 > pretend it is false. RDF Semantics simply do not say anything
 > about time and changes.

Agreed, but that is not the issue that I am raising.

 >
 > But "always" may mean that what an IRI denote is independent
 > of where it appears. That is, what an IRI denote does not
 > depend on whether it is in subject, predicate or object
 > position of any triple in any graph. This is how global
 > scope should be interpreted. And with this notion of scope,
 > the spec is correct, no matter what David Booth says.

But that is exactly the case in which the Concepts spec is
*wrong*.  I agree that it does not depend on whether the IRI
is in subject, predicate or object position.  But it *does*
depend on the *interpretation*, and this in turn depends on
the data's authorship and use.  Different graphs are often
interpreted differently, and sometimes even the same graph
is interpreted differently by different users, as Ian Davis's
toucan example illustrated.  The reality is that the same IRI
does *not* always denote the same resource, as much as we
would like it to.  It only does so in *one* interpretation
at a time.  It is generally reasonable to think in terms of
one interpretation at a time when dealing with one graph from
one author being used for one purpose.  But it is folly to do
so on the scale and diversity of the world wide web, in which
graphs are written and used with widely differing intents and
differing notions of reality.  That kind of thinking leads
people to naively assume that they can merge any two "true"
RDF graphs and have the result work in their application.
But they can't, because those graphs may be true under
*different* interpretations.

I imagine Pat Hayes would claim that I am talking about a
different, more colloquial notion of *identifying* rather
than the formal notion of *denoting*.  But in fact I *am*
talking about the formal notion of *denoting*, because the
semantics can be applied *separately* -- using two different
interpretations -- to two different graphs, just as the
semantics of a programming language can be applied separately
to two different programs that are written in that language.
For example, one can consider both the truth value of I1(G1)
and I2(G2), where I1 and I2 are different interpretations and
G1 and G2 are different graphs.
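
A minimal sketch of that idea, assuming a toy model far simpler
than the real RDF Semantics (ground triples only, no blank nodes,
no literals, no entailment).  The class, the ex: IRIs and the
resource names are all made up for illustration:

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


class Interpretation:
    """Toy simple interpretation: IRIs only, no blank nodes or literals."""

    def __init__(self, denotes: Dict[str, str],
                 extension: Dict[str, Set[Tuple[str, str]]]) -> None:
        self.denotes = denotes      # IRI -> resource it denotes
        self.extension = extension  # property resource -> (subject, object) pairs

    def satisfies(self, graph: Set[Triple]) -> bool:
        """Return True if every triple in the graph is true here."""
        for s, p, o in graph:
            if not all(term in self.denotes for term in (s, p, o)):
                return False
            pairs = self.extension.get(self.denotes[p], set())
            if (self.denotes[s], self.denotes[o]) not in pairs:
                return False
        return True


U = "http://example.org/U"  # hypothetical IRI used by both authors

g1 = {(U, "ex:type", "ex:Bird")}     # graph from one author
g2 = {(U, "ex:type", "ex:WebPage")}  # graph from another author

shared = {"ex:type": "type", "ex:Bird": "Bird", "ex:WebPage": "WebPage"}
i1 = Interpretation({**shared, U: "a_bird"}, {"type": {("a_bird", "Bird")}})
i2 = Interpretation({**shared, U: "a_page"}, {"type": {("a_page", "WebPage")}})

print(i1.satisfies(g1), i2.satisfies(g2))  # True True
print(i1.satisfies(g2), i2.satisfies(g1))  # False False
print(i1.denotes[U] == i2.denotes[U])      # False
```

Each graph is true under its own interpretation, yet the two
interpretations map U to different resources.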

The issue regards this statement:

   "IRIs have global scope: Two different appearances of an
   IRI denote the same resource."

Here is a simple proof that the above statement is false:
http://lists.w3.org/Archives/Public/www-archive/2013Dec/0006.html

 >
 > However, there is another interpretation of "always" here:
 > it can be understand as "there is no possible situation
 > under which an IRI could denote a thing, while it denotes
 > another thing under a different situation". In this case,
 > the statement is false. What an IRI denote is subject to
 > interpretation, and therefore, when you change perspective,
 > what an IRI denotes may change. But this is taking the notion
 > of scope too far.
 >
 > If I compare this to programming languages, in which it is
 > often possible to define variables with global scope (a.k.a.,
 > global variables), the objection of David Booth would imply
 > that global variables do not have global scopes. Indeed, two
 > executions of the same programme would assign the "global"
 > variables to different areas of memory. A programme that does
 > not use the package having the global variable could define
 > the same variable as local or make it global with different
 > values. Surely, what the variable refers to depends on the
 > context of execution.

Excellent comparison!  Notice that in a programming language
the natural top-unit for semantic analysis and specification
is the *program*.  In RDF it is the *graph*.

Certainly I agree that *within* an RDF graph, an IRI's scope
is "global", just as a variable may be "global" *within* a
particular program.  But that is *not* how the word "global"
is likely to be interpreted by readers when the spec says:

   "IRIs have global scope: Two different appearances of an
   IRI denote the same resource."

In that sentence, I doubt very much that readers will
interpret the phrase "global scope" to mean "global *within*
an RDF graph".  That phrase is very likely to be understood to
mean "global across *all* RDF graphs".  And just as a "global"
variable in a programming language is *not* global across *all*
programs, neither is an IRI global across *all* graphs.

To illustrate, suppose you view two separate programs, one on
site A and one on site B, and they both use "global" variable X.
Clearly those two programs are each referring to a *different*
variable X, even though the semantics calls it global.

However, you could perfectly well download both of those
programs, *merge* them to become one program, and *then*
(in the context of that larger program) they would indeed
both refer to the *same* global variable.  But until those
programs are in fact merged, and the semantics of this new
program are then evaluated, global variable X in one program
is *not* the same as global variable X in the other program.
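
As a rough illustration in Python (the two program texts and the
function names are invented for this sketch, and separate
namespace dicts stand in for separate programs):

```python
# Two hypothetical programs that each use a "global" variable X.
PROGRAM_A = """
X = 1
def read_x_a():
    return X
"""

PROGRAM_B = """
X = "toucan"
def read_x_b():
    return X
"""

# Run separately: each program gets its own global namespace,
# so each X is a different variable.
ns_a: dict = {}
ns_b: dict = {}
exec(PROGRAM_A, ns_a)
exec(PROGRAM_B, ns_b)
separate = (ns_a["read_x_a"](), ns_b["read_x_b"]())

# Merged into one program: both functions now share one global
# namespace, so they read the same X (the last value assigned).
merged_ns: dict = {}
exec(PROGRAM_A, merged_ns)
exec(PROGRAM_B, merged_ns)
merged = (merged_ns["read_x_a"](), merged_ns["read_x_b"]())

print(separate)  # (1, 'toucan')
print(merged)    # ('toucan', 'toucan')
```

Only after the merge do the two uses of X coincide, and the
merged result may not be what program A's author had in mind.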

The exact same thing is true of RDF.  A graph on one site
written by one author may use an IRI U, and a graph on another
site written by another author may also use IRI U.  But those
graphs may well be intended to use *different* interpretations
that map U to *different* resources.  *Within* each of those
graphs it is normally reasonable to assume that each occurrence
of U refers to the same resource, and the RDF Semantics rules
treat them that way.  But it would be naive to make that
assumption *across* the two graphs.  We wish we could make that
assumption, and indeed that *is* the goal behind the AWWW, as
stated in section 2.2 where it says: "By design a URI identifies
one resource."  http://www.w3.org/TR/webarch/#p41 But it is
*not* the reality, as the AWWW explicitly acknowledges when it
recommends to "Assign distinct URIs to distinct resources",
http://www.w3.org/TR/webarch/#p46 and it admonishes against
"URI collision": http://www.w3.org/TR/webarch/#URI-collision
And it is not the reality with respect to the RDF Semantics,
which explicitly acknowledges the existence of multiple
interpretations and allows the truth value of a graph to be
determined with respect to *any* interpretation.

Of course, when graphs are *unioned*, the formal semantic
rules make the *assumption* that an IRI U everywhere denotes
the same thing in both graphs -- i.e., only one interpretation
is passed around in the semantic rules.  The rules provide no
context machinery for combining two graphs under different
interpretations.  And if those graphs had in fact been
interpreted under different interpretations then the union
may not work as the user expected.
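
A small sketch of how the union can go wrong, again under a toy
model (ground triples only; the satisfies() helper, the ex: IRIs
and the resource names are hypothetical, and every IRI used is
assumed to be mapped by the interpretation):

```python
def satisfies(denotes, extension, graph):
    """Toy truth check: every triple must hold under ONE IRI mapping."""
    return all(
        (denotes[s], denotes[o]) in extension.get(denotes[p], set())
        for s, p, o in graph
    )


U = "http://example.org/U"  # hypothetical IRI used in both graphs
g1 = {(U, "ex:type", "ex:Bird")}
g2 = {(U, "ex:type", "ex:WebPage")}
union = g1 | g2

shared = {"ex:type": "type", "ex:Bird": "Bird", "ex:WebPage": "WebPage"}

# Interpretation intended by the author of g1: U is a bird.
d1, e1 = {**shared, U: "a_bird"}, {"type": {("a_bird", "Bird")}}
# Interpretation intended by the author of g2: U is a web page.
d2, e2 = {**shared, U: "a_page"}, {"type": {("a_page", "WebPage")}}
# An interpretation that does satisfy the union: U is one thing
# that is both a bird and a web page.
d3 = {**shared, U: "bird_page"}
e3 = {"type": {("bird_page", "Bird"), ("bird_page", "WebPage")}}

print(satisfies(d1, e1, union), satisfies(d2, e2, union))  # False False
print(satisfies(d3, e3, union))                            # True
```

The union is still satisfiable, but only by an interpretation
that neither author intended.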

This is where the analogy with a programming language breaks
down a bit, because in discussing the idea of a global
variable in a programming language, users only *expect* a
global variable to be global within the scope of a particular
program -- not the whole world.  They do not normally expect
a global variable to be global across different programs.

In contrast, RDF was specifically designed to be used on
the World Wide Web, such that applications can easily and
meaningfully *combine* RDF graphs obtained from many sources
around the Web.  Thus, in the RDF context, the unqualified
statement that "IRIs have global scope" is extremely misleading.
It misleads readers into naively thinking that they will be
able to successfully combine arbitrary RDF graphs that are all
individually true, but true under *different* interpretations.

To complete the programming language analogy, just as global
variables really only have global scope **within a given
program**, in RDF IRIs really only have global scope **within
a given graph**.  And in fact, if you look at the formulas
in the RDF Semantics you will see that this is exactly the
constraint that they impose.

Thus, if the text in the RDF Concepts were changed to the
following it would be accurate:

   "Within each RDF graph, IRIs have global scope: Two different
   appearances of an IRI denote the same resource."

Alternatively, if it were changed to the following it would
also be accurate:

   "Within one interpretation, IRIs have global scope: Two
   different appearances of an IRI denote the same resource."

But as is, the text is misleading and false.

 >
 > But in a specification, there is no reason to extend scope
 > outside the borders of a single system (even though the
 > system is distributed and open) or outside the borders of
 > a single perspective, or context.

But what is the natural top-unit of context or perspective
for RDF?  Surely it is the graph, just as it is the program
in the programming languages analogy.

 >
 > The notion of "scope" that David Booth is using (to
 > justify that a single IRI can denote several things) is
 > trans-perspective, or trans-context.  In RDF Concepts,
 > the definitions are given assuming one perspective.

No, they emphatically are *not*.  The RDF Concepts frequently
talks about different graphs, whereas the formulas in the RDF
Semantics typically talk about *a* graph.

 > This is *not* in contradiction with RDF Semantics, even if the
 > formal semantics defines an infinite set of interpretations,
 > with infinite ways of denoting.  The set of interpretations
 > has to be defined because a system does not know what is the
 > one perspective that has to be assumed when processing RDF.
 > But the formalism makes it clear at least what perspectives
 > are plausible and which are impossible. Ideally, the RDF
 > graph is sufficiently detailed that there is only one possible
 > interpretation of the data.
 >
 > Now the situation can be made more complicated by the fact
 > that there are many cases when one wants to reason about
 > several perspectives at the same time.  One may want to reason
 > across contexts.  This is fair enough, but it is out of the
 > scope of RDF Semantics.  It is, however, within the possible
 > scope of RDF Dataset semantics, but this is another story.
 >
 > To continue with the programming comparison, one can write
 > meta-programmes that are analysing programmes and their
 > variables. In this case, for the meta-programme, each global
 > variables from the programmes become local. Similarly,
 > when David Booth assumes IRIs can denote several things,
 > he is thinking at a meta-level

Exactly.

 > that RDF Concepts does not have to describe.

I vehemently disagree, for reasons explained above.
User expectations are completely different for programs
than they are for RDF graphs.  People *expect* to be able to
combine RDF graphs from around the Web, and RDF was designed
with this explicit goal in mind.  But the reality is that
graphs cannot be so easily combined (at least not with the
result the users expect), because different graphs are often
interpreted differently.

 >
 > AZ.
 >
 > Le 24/10/2013 20:28, Pat Hayes a écrit :
 > > (Just to get this on the record.)
 > >
 > > If we find ourselves debating the pros and cons of this
 > > issue, I want us to make a clear distinction between two
 > > distinct theses, which are conflated in the text of the
 > > comment.
 > >
 > > Thesis 1. In actual practice, a given IRI may be used on
 > > the Web to refer to two different things. This can happen in
 > > a variety of ways, including an IRI Collision in the sense
 > > described by http://www.w3.org/TR/webarch/#URI-collision,
 > > but also by IRIs being used in RDF with different intended
 > > meanings.
 > >
 > > Thesis 2. The phenomenon described in Thesis 1 can be
 > > usefully analyzed using the RDF semantics, by saying that
 > > the IRI might refer to different things in different graphs.

That phenomenon cannot be described by a *single* application
of the RDF Semantics.  But at a meta level, that phenomenon
can be described *perfectly* by the RDF Semantics, by applying
different interpretations.

David

 > >
 > > I agree with (1) but not with (2), for reasons which I
 > > can explain to anyone who actually wants to know. If the
 > > WG accepts the truth of (1) it is important that it not do
 > > so in a way which implies that it is accepting (2) as well.
 > >
 > > Pat
Received on Friday, 13 December 2013 19:04:32 UTC