Re: New issue - Meaning of URIs in RDF documents from Tim Berners-Lee on 2003-07-16 (www-tag@w3.org from July 2003)

From: Tim Berners-Lee <timbl@w3.org>
Date: Wed, 16 Jul 2003 18:17:52 -0400
To: pat hayes <phayes@ihmc.us>
Cc: www-tag@w3.org
Message-Id: <57C4F512-B7DB-11D7-94B5-000393914268@w3.org>
On Wednesday, Jul 16, 2003, at 12:56 US/Eastern, pat hayes wrote:

> There are several statements made in the following proposal which 
> should not be made by any WG. To make authoritative assertions of 
> propositions which are clearly or provably false does not make them 
> true; it only destroys public trust in the source making the silly 
> assertions.

I don't know whether these things are easily done in email in www-tag, 
as the misunderstandings are often about the use of words, and so if I 
endeavour to use yours, we may find other people get other 
misunderstandings.  Note I was using English, not model theory.  You 
use MT words, but it may be that MT can't express the English used to 
talk about, for example, the real world.

> 1. " each URI
> identify one thing ("Resource": concept, etc)."
>
> Exactly what is meant by "identify" here is not exactly clear, but if 
> this means something  close to what it usually means then it is simply 
> untenable to claim that all names identify one thing.

I am making the claim only for RDF statements in a global context, in, 
for example, an email sent between two people who don't know each other 
but both have access to the web.  And I am describing, if you like, a 
perfect platonic design, to which we can aspire, though social and 
engineering factors limit our ability to implement it perfectly.  As 
with all technical specs, the fact of imperfect adherence in some cases 
does not detract from the importance of having made the perfect 
idealistic design which has provable properties. One deals with 
deviations from the perfect in a form of perturbation theory.

>  Existing W3C standards already provide counterexamples: what single 
> thing is identified by the URI reference  
> http://www.w3.org/2000/01/rdf-schema#Class? This is supposed to 
> *denote* the class of all RDFS classes; but that is not a single 
> well-defined notion, by the very nature of formal semantics: it varies 
> from interpretation to interpretation.

You say "by the very nature of formal semantics".  The formal systems 
we are using are not capable of defining such a thing?  This is in fact 
true, as the logics like OWL Lite which we like to use can't handle 
classes of classes being classes.  So maybe we have a single thing, 
imperfectly (from your point of view) defined.

And there is the problem that MT systems consider all possible 
interpretations of the data, in any possible worlds.  If you take the 
case of an identifier for pat hayes <phayes@ihmc.us>, for example, the 
non-logician would consider that it identified one person and get on 
with their lives, whereas in MT one would consider all possible 
interpretations of all the data one had in store and consider the 
possible things the identifier might denote in each interpretation. 
And that would be an essential exercise, and could be useful when 
looking at phrases about Pat in quoted material (She thought "Pat Hayes 
was a pirate of the early seventeenth century"), etc.  But these 
exercises blur the usefulness of being able to use a URI to refer to a 
person in our day to day communication.

So, there is no "single, well-defined" thing denoted by rdfs:Class?  I 
didn't say it was well defined, by your definition of well defined. 
However, the goal is to be single, in some sense.  That is, as we add 
information about it, that information should not be inconsistent.  You 
can think of it as denoting different things in different systems, but 
how are those things "different" apart from the fact they are in 
different systems?  We say that every owl:Class is an rdfs:Class. That 
allows us to deduce things about some classes. Suppose we make other 
assertions about rdfs:Class: is it allowable for us to be able to make 
a contradiction? I would say not.  Currently, different logical systems 
can deduce different things, but the important point is that they are 
talking about the same thing when they use the same URI.
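
A minimal sketch of the deduction mentioned above, assuming triples are 
plain Python tuples of prefixed names rather than objects from any real 
RDF library. The term ex:Pirate is hypothetical and only there to have 
something to classify. The single rule applied is the usual subclass 
rule: if x has type C and C is a subclass of D, then x has type D.

```python
# Illustrative only: triples are (subject, predicate, object) tuples of strings.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Apply (x type C), (C subClassOf D) => (x type D) until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect every (subclass, superclass) pair currently known.
        supers = {(s, o) for (s, p, o) in inferred if p == SUBCLASS_OF}
        for (s, p, o) in list(inferred):
            if p != RDF_TYPE:
                continue
            for (sub, sup) in supers:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


graph = {
    ("owl:Class", SUBCLASS_OF, "rdfs:Class"),  # every owl:Class is an rdfs:Class
    ("ex:Pirate", RDF_TYPE, "owl:Class"),      # ex:Pirate is a hypothetical term
}
closure = infer_types(graph)
print(("ex:Pirate", RDF_TYPE, "rdfs:Class") in closure)
```

Two systems that share the URI rdfs:Class may derive different sets of 
triples, but anything either of them adds is added about the same node.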

The OWL Pat Hayes and the OWL Lite Pat Hayes are in different worlds 
from the point of view of logic, but we are using

> Perhaps 'identify' doesn't mean 'denote' or 'refer to'. What does it 
> mean, then? Note that if we were to say that 'identify' means MORE 
> than simply 'denote' or 'refer to' - if, say, it also has a 
> connotation that the URI can be somehow used to retrieve some 
> information about the referent - then the claim would become even more 
> false.
>

Well, you can't be more false than false, can you? :)  Eh, but I 
wouldn't put it past you ;-)

When one retrieves a document, one gets information which its publisher 
asserts, and which one can believe or not.  But using a term does 
(modulo social things such as fraud and engineering things such as 
broken cables) commit you to the term owner's definition of it, and the 
document they publish at its URI is taken by design to be information 
deemed shared by those using the term.  That's the contract. If you 
don't like it, don't use too many terms.

> See 6 below also.
>
> 2."  An RDF statement "S P O" means that a given binary relation
> identified by P holds between two things identified by S and O. (S, P
> and O are URIs)"
>
> This would be true if "identified" meant "denoted"; but then it 
> should, if stated strictly, have the qualification "in a given 
> interpretation", illustrating the reason why the first claim is false.
>
> 3. " The OWL specification is a vocabulary of properties allowing an 
> RDF
> document to say things about RDF Properties"
>
> First, OWL is more than an RDF vocabulary: it is an RDF vocabulary 
> with a particular semantics applied to it.

Like every RDF vocabulary.  What is interesting about OWL is that for 
some of the vocabulary the properties of the Properties can be defined 
in math.  But basically OWL isn't any different from the calendar event 
vocabulary.  The only reason that an RDF calendar event has meaning is 
the semantics of that vocabulary.

(For you, this may seem perturbing, or seem to say that the logic 
itself, the thing you tend to define first, is actually only defined in 
the data language. But it works. You end up where you wanted to be, 
with OWL and its semantics; I'm just explaining to you how, from RDF's 
point of view, you got there.)
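
As a sketch of what "the properties of the Properties can be defined in 
math" can look like in practice, assume again that triples are plain 
tuples of prefixed names. owl:SymmetricProperty is a real OWL term; 
ex:marriedTo, ex:a and ex:b are hypothetical. The vocabulary term is 
ordinary RDF data, and its meaning shows up only in the rule a consumer 
chooses to apply to it.

```python
def symmetric_closure(triples):
    """Add (o, p, s) for every (s, p, o) whose p is declared owl:SymmetricProperty."""
    # The declaration itself is just another triple in the same graph.
    symmetric = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    mirrored = {(o, p, s) for (s, p, o) in triples if p in symmetric}
    return set(triples) | mirrored


graph = {
    ("ex:marriedTo", "rdf:type", "owl:SymmetricProperty"),  # hypothetical property
    ("ex:a", "ex:marriedTo", "ex:b"),
}
result = symmetric_closure(graph)
print(("ex:b", "ex:marriedTo", "ex:a") in result)
```

A calendar vocabulary would be handled the same way: triples in, plus 
whatever rules its owner has published for it.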

> It is the semantics which allows the document (strictly, the RDF 
> graph) to make nontrivial assertions, most of which cannot be made in 
> RDF;

Excuse me, you *are* making them in RDF.
By "make non-trivial assertions" you mean "make assertions which 
actually mean something".  It is indeed true that the assertions only 
have meaning because of the semantics associated with the OWL 
vocabulary.  What is misleading is if you fondly imagine that *other* 
people's vocabularies are different, and have no meaning.  OWL is 
wonderful, but not special. To RDF.

> so it is the OWL document making the assertions, not the RDF document 
> (true, an OWL document can be described as an RDF document with an OWL 
> semantics, but it is misleading to use the syntactic criterion when we 
> use phrases like "say things about".)

In the big wide world, an OWL document is, at base, an RDF document and 
only an RDF document.  It only has its semantics because of the OWL 
semantics, and the OWL semantics only apply because the owners of the 
vocabulary have defined those semantics, and made it clear (by 
publication of specifications, and increasingly with machine-readable 
stuff accessible from the URIs concerned) what those semantics are.

> Second, it is misleading to claim that the assertions made in an OWL 
> document are about RDF properties. They may be: but they may also be 
> about classes or individuals (ie anything); in most cases it will be 
> impossible to say what particular thing they are 'about'. (The use of 
> this phraseology, by the way, is odd: most texts in most languages, 
> formal or informal, cannot be said to be 'about' any one thing.)

For an RDF statement, there is a clear subject, and we say that the 
statement is (loosely) about the subject (or the referent of the 
subject, depending on the way you use "subject", which nitpick we need 
not get into now).

Absolutely true. Many OWL statements are about classes, for example.

> 4.  "The combination of these architectural parts allows information 
> to be
> published so that the recipient of an RDF statement  "S P O" can, by
> dereferencing P, get information about the relation being asserted. "
>
> Wrong. First, there is no particular semantic importance attached to 
> the P part of the triple.

Excuse me.  p is associated with the relation.  I had understood that 
the semantics of s p o were the relation R(s,o) where R is identified 
by p.
Did I get it the wrong way around?


>  Properties have no special status in RDF. Second, the relation is not 
> being asserted: the triple is.

Wordmongery. Fie on you.   The fact that the relation holds, or that a 
statement of the relation is true, is being asserted?  The semantics of 
the triple are consistent only with interpretations in which the 
relationship corresponding to p holds between the resources 
corresponding to s and o.
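
A small sketch of that reading of "s p o" as R(s,o), under stated 
assumptions: an interpretation is modelled as a dictionary from URIs to 
things, and each relation has an extension given as a set of pairs. All 
names (ex:tim, ex:pat, ex:repliesTo) are hypothetical, and this is not 
the RDF model theory itself, only a toy instance of the idea.

```python
def satisfies(interpretation, extension, triple):
    """True if, under this interpretation, the relation for p holds between s and o."""
    s, p, o = triple
    relation = extension[interpretation[p]]  # the set of pairs for which R holds
    return (interpretation[s], interpretation[o]) in relation


# One possible interpretation: URIs mapped to things in some world.
I = {"ex:pat": "Pat", "ex:tim": "Tim", "ex:repliesTo": "replies-to"}
# The extension of each relation in that world.
EXT = {"replies-to": {("Tim", "Pat")}}

print(satisfies(I, EXT, ("ex:tim", "ex:repliesTo", "ex:pat")))
print(satisfies(I, EXT, ("ex:pat", "ex:repliesTo", "ex:tim")))
```

Asserting the triple rules out every interpretation for which the 
function returns False; the predicate position is the one that picks 
out which relation is checked.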

> Third, there is no particular reason why dereferencing P will get you 
> to the information you might require in order to draw the appropriate 
> conclusions; and indeed most RDF applications would not work if this 
> were an architectural requirement.

No one said "will"  and no one said "require".
(You are arguing by exaggeration of claim.)
IF a representation of a document is retrieved then there is an argument 
that you MAY use it.
There is no argument that you MUST.
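
The MAY-versus-MUST distinction can be sketched as follows. Everything 
here is an assumption made for illustration: `describe` is a 
hypothetical helper, the `fetch` callable stands in for whatever 
retrieval mechanism an agent has, and the example.org URI and its text 
are invented. No network access is performed.

```python
def describe(term, fetch=None):
    """Return published information about a term if the caller chooses to look it up.

    Dereferencing is optional: with no fetch function, or when nothing can be
    retrieved, the caller simply carries on without the information.
    """
    if fetch is None:
        return None  # the agent chose not to dereference
    try:
        return fetch(term)
    except LookupError:
        return None  # nothing retrievable; not an error for the caller


# A stand-in for documents published at URIs (hypothetical content).
published = {
    "http://example.org/vocab#knows": "Relates a person to a person they know.",
}
fetch = published.__getitem__  # raises KeyError (a LookupError) for unknown URIs

print(describe("http://example.org/vocab#knows", fetch))
print(describe("http://example.org/vocab#unknown", fetch))
print(describe("http://example.org/vocab#knows"))
```

An application that never calls `fetch` still works on the triples it 
has; one that does call it has more to go on.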

> Finally, this conclusion does not follow from the architectural points 
> made previously.

It obviously does.
;-)

> 5. "This information, directly or indirectly acquired, may be
> human-readable and/or machine readable, the latter including for
> example ontological statements in OWL, or rules, or other logical
> expressions."
>
> This is an extremely contentious and potentially confusing claim. It 
> is *impossible* for software agents to respond to or utilize 
> "information" which is only human-readable: it must be 
> machine-readable.

However, it is possible for human agents to use that information.  
Classically what happens is that a human looks up the document, gets 
the hang of what is happening, and writes software which processes 
information correctly.  This is the way most OWL software is written:  
by humans reading the spec.

Machine-readable information about vocabularies is fairly limited. With 
RDFS, there is not much one can say, though what one can say is useful. 
With OWL, there is more.


> So to lump these categories under the single heading of 'information' 
> is an architectural disaster, if 'recipient' in the previous sentence 
> is supposed to refer to an architectural element (such as an agent of 
> some kind). This point is not new, of course: it has been made already 
> in many intense discussions and debates, many of them archived.

There are indeed as many people (like Tim Bray) arguing that the most 
important thing is human-readable information as there are people (like 
you) who really can't see the point of it and only think of automated 
agents.  But lumping it together is in fact useful, particularly as 
much information can be in both camps, such as when I read an OWL 
ontology myself.

> 6. "-  the architecture is that a single meaning is given to each URI "
> and
> "- the architecture does not permit the meaning of a URI to be changed
> by consistent misuse by others"
>
> These are IMPOSSIBLE architectural requirements

Just like the architectural requirement that for any link with href=x 
there exists a document web(x).

> There are no precise theories of meaning which make such statements 
> other than fatuous (except when we are talking about programming 
> languages: but the universe as whole does not satisfy the second 
> recursion theorem.)

We will always be a challenge for those of you who make these precise 
theories.

>  There isn't a 'single meaning' for the addition sign, or the 
> multiplication sign;

Exactly not.  These are NOT URIs. They are overloaded rather short 
symbols.

It is maybe from working with these, and with the well-known and quite 
non-URI-like properties of natural language words, that you may have 
become blind to the advantages of an architecture where we say "This 
system is different from natural language: we design it such that each 
URI identifies (denotes?) one and only one concrete thing in the real 
world or one and only one globally shared concept".

> and the 'not changed by others' condition is either clearly false (in 
> some views of 'social meaning') or completely irrelevant (on 
> referential theories of meaning); either way, it isnt much use 
> insisting on it as an architectural condition.

Pat, we are not analyzing a world, we are building it.  We are not 
experimental philosophers, we are philosophical engineers.  We declare 
"this is the protocol". When people break the protocol, we lament, sue, 
and so on. But they tend to stick to it because we show that the system 
has very interesting and useful properties.

Example: The Internet mail "From:" field is defined to assert the 
relationship between the message and the email mailbox of the sender of 
the message. Spammers start to screw the system up by using it to put 
in whatever will get through filters. The system slowly reacts socially 
by putting them in jail for identity theft.  They said that the "From" 
field meant whatever they said it meant, as senders of the message. The 
Internet spec said it meant a particular thing. The courts sided with 
the spec.


> (FWIW, this seems to me to confuse meaning with intended meaning. 
> Intended meanings, however, are not the kind of thing that one can 
> impose *architectural* conditions on; they are more a matter for 
> courts and priests to decide.)

The architecture, if you like, defines an "authoritative" or 
"definitive" meaning, to which "meaning" in the Wittgensteinian sense 
and "intended meaning" in an ethical or legal sense generally approach 
as closely as they can, and close enough for the system to work and be 
unbelievably useful to millions of people.

> Having a definitive ontology does not provide a unique meaning, by the 
> way.
>
> 7. "The community needs
> 1) A concise statement of the above architectural elements from
> different specs in one place, written in terms which the ontology
> community will understand, with pointers to the relevant 
> specifications."
>
> Maybe, if I could make the suggestion without seeming to commit 
> lese-majesty, it would be a good strategy for the W3C, rather than 
> trying to render nonsense "in terms that the ontology community will 
> understand", to ask if it might possibly learn something from actually 
> *listening* to the ontology community; or at any rate, to anyone with 
> a grasp of basic 20th-century results in linguistic semantics.

We are building a new system.  We can design it differently from 
existing linguistic systems.  Toto, we are not in Kansas any more.

One of the things which previous forays into this area have 
demonstrated is that listening is necessary on both sides. Another is 
that when it happens, we can actually see eye to eye. And another is 
that you are one of the better people at being able to listen, figure 
out what it is we mean, and close the gap.

Tim


> Pat Hayes
>
> --------------
>
> Resent-From: tag@w3.org
> From: Tim Berners-Lee <timbl@w3.org>
> Date: Mon Jun 30, 2003  15:05:02 US/Eastern
> To: tag@w3.org
> Subject: New issue - Meaning of URIs in RDF documents
>
>
> The Semantic Web Coordination group, at its meeting of 2003-06-30,
> passed on to the TAG the issue which had been loosely described in RDF
> circles as "social meaning".  As background,
> - The URI specification defines URI syntax and explains that each URI
> identify one thing ("Resource": concept, etc).
> - RDF documents use URIs as identifiers for things including for
> relations. An RDF statement "S P O" means that a given binary relation
> identified by P holds between two things identified by S and O. (S, P
> and O are URIs)
> - The HTTP specification provides for a set of URIs which have (a)
> delegated ownership (b) publication and retrieval of information
> resources.
> - The OWL specification is a vocabulary of properties allowing an RDF
> document to say things about RDF Properties
> - The TAG has written on the desirability of using dereferencable URIs,
> and of actually publishing relevant and useful information.
> The combination of these architectural parts allows information to be
> published so that the recipient of an RDF statement  "S P O" can, by
> dereferencing P, get information about the relation being asserted.
> This information, directly or indirectly acquired, may be
> human-readable and/or machine readable, the latter including for
> example ontological statements in OWL, or rules, or other logical
> expressions.
> The community needs
>
> 1) A concise statement of the above architectural elements from
> different specs in one place, written in terms which the ontology
> community will understand, with pointers to the relevant 
> specifications.
>
>
> 2) Some outline guidance on specific questions brought up in email
> questions
> - Is a given inference engine expected to take into account a given
> document under given circumstances?
> - how does one avoid having to commit to things one does not trust?
> - etc etc etc
>
> 3) There may be some need to clarify frequent misunderstandings by
> making some things clear.
> -  the architecture is that a single meaning is given to each URI (such
> as P), that the URI ownership system gives statements by owners
> authoritative weight, despite what other documents may say.
> - the architecture does not permit the meaning of a URI to be changed
> by consistent misuse by others.
> - that use of a URI in RDF implies a commitment to its ontology, and if
> there is doubt as to what ontology that is, the web may be used to
> resolve it.
> - that the web is not the final arbiter of meaning, because URI
> ownership is primary, and the lookup system of HTTP is, though
> important, secondary. (That is, if you hack a web server's ontology
> files, you do not change what the URI means, you just break a machine
> for a while)
> - etc etc.
>
> The proposal is that a draft finding be written which pulls this
> together, with elaborations pointing into the various specs. Members of
> the SWCG have volunteered and some members of the SWCG have
> been volunteered to read early versions.
> tim bl
>
> -- 
>
> ---------------------------------------------------------------------
> IHMC      (850)434 8903 or (650)494 3973   home
> 40 South Alcaniz St.      (850)202 4416   office
> Pensacola              (850)202 4440   fax
> FL 32501                  (850)291 0667    cell
> phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 16 July 2003 18:31:07 UTC