- From: Tim Berners-Lee <timbl@w3.org>
- Date: Thu, 23 May 2002 14:32:12 -0400
- To: patrick hayes <phayes@ai.uwf.edu>
- Cc: Jeff Heflin <heflin@cse.lehigh.edu>, www-rdf-logic@w3.org
On Thursday, May 23, 2002, at 12:10 PM, patrick hayes wrote:

> Tim Berners-Lee <timbl@w3.org> wrote:
>>
>> On Wednesday, May 8, 2002, at 02:34 PM, Jeff Heflin wrote:
>>
>>> Pat Hayes wrote:
>>>>
>>>>> Jeff Heflin wrote:
>>>>>> ... Since I was the initial proponent of daml:imports on the
>>>>>> Joint Committee, let me address this issue. You are absolutely
>>>>>> correct that the imports statement must be used. Simply
>>>>>> referring to a namespace does not include any ontology
>>>>>> definitions. You must make the imports statement explicit.
>>>>>> Period. ...
>>>>>
>>
>> Neither half of this is correct by itself, IMHO. Each is too sweeping.
>>
>> When you use a term (eg Property) in a namespace, its meaning is
>> defined by the definer of that namespace. You are (unlike in English)
>> bound to use a term according to its creator's definition.
>
> I don't understand what this is supposed to mean. If we are talking
> about assertional languages (which includes RDF, RDFS and DAML+OIL)
> then there is no such thing as a 'definition' of a term. There are
> only assertions involving it. If A publishes a web page using the
> term T and says a lot of stuff there about it, and then B uses that
> same term T, what exactly is B committed to? Does B, simply by
> *using* the term, assent to everything that A says about it?
>

It is really important that these languages are defined using URIs
opaquely, and so leverage all the interesting properties of URIs
indirectly. So we should be clear that I am talking about the
properties of URIs rather than properties of DAML, but the semantic
web is what you get when they work together.

The deal with http: URIs is, socially, that one can get to own a space
of them, and within that space define what it is they identify. So for
documents, <http://www.w3.org/TR/XHTML> is the XHTML spec, and W3C as
publisher defines what it corresponds to. Similarly,
<http://www.w3.org/2000/10/swap/string#endsWith> is something
published by W3C. It is defined to be a DAML property meaning that one
string is a suffix of the other. Anyone using that term commits to
using it to mean that, or there is a bug (or maybe fraud). That
definition is stated in English, as we don't have a way of stating it
in DAML. We can, however, publish a document
<http://www.w3.org/2000/10/swap/string> which defines <#endsWith> as
being an rdf:Property with range and domain string. That's all we can
do. What is said in the namespace document is just a set of statements
from the point of view of DAML. But socially, they are statements you
had better be prepared to accept if you use the term. So yes, by using
it, you can be held to the things said about it in that document.
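For concreteness, the sort of thing such a namespace document can say
looks roughly like this in N3 (a sketch of the idea, not a quote of
the published file; the comment wording and the choice of xsd:string
for domain and range are mine):

    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix xsd:  <http://www.w3.org/2000/10/XMLSchema#> .

    <#endsWith> a rdf:Property;
        rdfs:comment "The subject string has the object string as a suffix.";
        rdfs:domain xsd:string;
        rdfs:range  xsd:string .

Anything more than that (the suffix relationship itself, stated in
English above) is outside what we can currently write down, which is
exactly the point.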
> Also, what is that 'bound to'? I see no way that any kind of web
> publication by A can *enforce* any behavior or assumptions on B. The
> best that one could do would be to have some public 'rules of
> inference behavior' which might for example say that if B publishes
> some content using a term 'belonging' to A, then C is entitled to use
> anything that A says using the term when drawing conclusions from
> what B says in what B publishes, and B is just as responsible for any
> such consequences as B would be had the consequences been drawn
> solely from what B has published. This is a kind of cut principle for
> inferential responsibility.

Yes, more or less. I wouldn't call them rules of inference behavior;
that sounds a bit like telling people what inference steps to take.

> It might be worth trying to draw up some of these rules. We are going
> to need them, once the SWeb gets going in the real world.
>

You bet. We need them now in the real world. The Technical Architecture
Group is trying to answer "what does a document mean" in general, and
this is a crucial part of it. Fortunately, in practice this is well
understood by web users.

(An edge case lots of people are dealing with is how you bootstrap
anyone as having asserted anything. There are plenty of documents on
the web which are not asserted by their publishers (check
www.archive.org), but there is an assumption that if someone asserts
something on a root page of a server of a domain they own, or on
anything it links to, then they are asserting it.)

>> The specifications of HTTP are defined such that if you dereference
>> a term, then the information you get, modulo forgery, is a published
>> statement about the term by its owner.
>
> ?? About the term?? Surely not. When I access http://www.google.com/,
> what I get isn't about "http://www.google.com/" . It might be about
> Google, but that's a different claim altogether.

I can only make the architecture work at all when we use foo#bar for
abstract concepts. When you dereference foo, you get an RDF file which
tells you some stuff about foo#bar. You can't use
http://www.google.com/ to identify an abstract concept; it has already
been used to identify a document.
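So the pattern is roughly this (example.org and the term Widget are
made up purely for illustration): dereferencing the document URI gets
you a file of statements about the hash URI, which names the concept.

    # Contents retrieved by dereferencing <http://example.org/vocab>,
    # which identifies a document. The statements in it are about
    # <http://example.org/vocab#Widget>, which identifies a concept.
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    <#Widget> a rdfs:Class;
        rdfs:comment "The fragment names the concept; the document describes it." .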
>> The information which anyone may get in this way may be useful.
>> When you use such a term, you can be held to the implications of
>> your statements according to the specifications.
>
> You can? By whom, and under what circumstances? What does 'being held
> to' amount to? I think that until questions like this are answered
> clearly, this whole discussion is meaningless.

We have to draw a line. It behooves us to define what the meaning of a
document is. That is, the meaning of B's document above is its meaning
with A's terms interpreted according to A. Then the lawyers take over
to say who gets sued, and to determine whether there was a meeting of
minds at the creation of the contract, and so on. We had better not go
down that path, as we are providing a language for the social system
to use, rather than designing the social system. But we must be as
crisp as we possibly can about the meaning of a document on the
[semantic] web.

[..]

>
>>> This isn't too bad when you can't express a contradiction, but once
>>> you include additional semantic primitives (as DAML+OIL does) and
>>> scale for use in distributed environments you are bound to get
>>> contradictions if you simply merge the information sources. Some
>>> have suggested that in such situations you just select one of the
>>> contradictory axioms and throw it out. However, if different
>>> systems choose different things to throw out, then they no longer
>>> agree on the semantics of terms used. The only way around this
>>> problem would be to have a consistent, world-wide, deterministic
>>> method for resolving these contradictions. Even if it was possible
>>> to design such a thing, it would be highly impractical because
>>> people often cannot agree on things (some of the worst cases of
>>> this lead to wars, etc.) We can never expect the Semantic Web to be
>>> a single consistent knowledge base!
>>>
>> You are absolutely right in that. RDF was never expected to imply
>> that, and in fact a whole lot of energy has been expended from time
>> to time explaining that to people.
>
> No doubt the inventors of RDF did not intend to impose global
> consistency; such an ambition is obviously slightly insane. However,
> as a matter of fact, this insane assumption is built into the very
> architecture of RDF as it exists. RDF provides no way to make any
> other assumptions. It provides no way to agree or disagree, no way to
> define, no way to negotiate, no way even to question, any content. It
> only provides a simple way to make elementary assertions using a
> global vocabulary. It is obvious that this will not work unless there
> is some global coherence to the assertions made using the global
> vocabulary, unless there is some way to negotiate content and handle
> contradictions.
>

You imagine a world made up only of basic RDF. If we can only get
unhooked from the rat-holes and design more stuff on top of RDF which
allows precisely these things, then we can solve the problem. You
complain of the Lego brick that it has no windows. Well, stop
complaining about the Lego brick and we can make compatible windows.

>> RDF provided only reification as a very crude tool for doing fancier
>> things. I found that allowing an RDF set of statements (graph,
>> "model" for some, or better "formula") to be itself a first-class
>> object gets you out of that hole, and allows you to handle a
>> heterogeneous world.
>
> I don't see how. It makes RDF more logically expressive, but the
> globality assumption is built into the restriction on RDF that it use
> only urirefs as identifiers. Those are globally unique names, supposed
> to have a scope that covers the whole planet and transcends (in an
> ideal world) all future changes. That already embodies the insanity
> of the global-consistency picture.
>

Once you include quotation, then you are not bound in a system to take
all statements as equivalent. Globally unique identifiers are powerful
and useful (witness the WWW). There are practical limitations to what
we can achieve (like Error 404). The real engineering of URIs in fact
addresses some of the issues of things changing, things being fixed,
and so on.

The real engineering of real semantic web applications will use the
URI infrastructure and also need more things, such as vocabularies for
addressing, versions, changes, and so on. There is the whole trust
business. There is persistence. And so on. But we can make fairly
simple building blocks to allow us to build systems to address that.
All on top of RDF.

Maybe it is necessary to have nested formulae to be able to accept
that RDF can work. But in fact 99% of the stuff out there is data. So
the more we can solidify the simpler stuff, and the sooner we get
applications like XMP rolling, the better.

> Pat
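To make the quotation point concrete: in N3 (the swap notation) a set
of statements can be written in braces and used as a term itself, so
you can talk about what a document says without asserting it yourself.
The report document, :Alice and :age below are made up purely for
illustration; log:semantics is the swap log vocabulary's property
relating a document to the formula it contains.

    @prefix log: <http://www.w3.org/2000/10/swap/log#> .
    @prefix :    <http://example.org/people#> .

    # We quote what the document says, without asserting it ourselves.
    <http://example.org/report> log:semantics { :Alice :age "35" } .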
Received on Thursday, 23 May 2002 14:32:21 UTC