Re: rdf inclusion from pat hayes on 2002-05-23 (www-rdf-logic@w3.org from May 2002)

From: pat hayes <phayes@ai.uwf.edu>
Date: Thu, 23 May 2002 17:11:26 -0500
To: Tim Berners-Lee <timbl@w3.org>
Cc: Jeff Heflin <heflin@cse.lehigh.edu>, www-rdf-logic@w3.org
Message-Id: <p05111705b91306226ef0@[65.217.30.61]>
>On Thursday, May 23, 2002, at 12:10 PM, patrick hayes wrote:
>
>>Tim Berners-Lee <timbl@w3.org> wrote:
>>>
>>>On Wednesday, May 8, 2002, at 02:34 PM, Jeff Heflin wrote:
>>>
>>>>Pat Hayes wrote:
>>>>>
>>>>>>Jeff Heflin wrote:
>>>>>>>  ... Since I was the initial proponent of daml:imports on the Joint
>>>>>>>  Committee, let me address this issue. You are absolutely correct that
>>>>>>>  the imports statement must be used. Simply referring to a namespace does
>>>>>>>  not include any ontology definitions. You must make the imports
>>>>>>>  statement explicit. Period. ...
>>>>>>
>>>
>>>Neither half of this is correct by itself, IMHO.  Each is too sweeping.
>>>
>>>When you use a term (eg Property) in a namespace, its meaning is defined
>>>by the definer of that namespace.  You are (unlike in english)
>>>bound to use a term according to its creator's definition.
>>
>>I don't understand what this is supposed to mean. If we are talking 
>>about assertional languages (which includes RDF, RDFS and DAML+OIL) 
>>then there is no such thing as a 'definition' of a term. There are 
>>only assertions involving it. If A publishes a web page using the 
>>term T and says a lot of stuff there about it, and then B uses that 
>>same term T, what exactly is B committed to? Does B, simply by 
>>*using* the term, assent to everything that A says about it?
>>
>
>It is really important that these languages are defined using URIs 
>opaquely, and so leveraging all the interesting properties of URIs 
>indirectly.  So we should be clear that I am talking about the 
>properties of URIs rather than properties of DAML, but the semantic 
>web is what you get when they work together.
>
>The deal with http: URIs is socially that one can get to own a space 
>of them, and within that space define what it is they identify.

Well, I think I see what you mean, but that word "define" is a 
minefield. There is (quite literally) *no way* in RDF(S) or DAML+OIL 
to 'define' anything at all. One can only make assertions, not make 
definitions. Were it not so, extensibility would be much more 
problematic.
OK, so maybe this is a slightly pedantic point (though important, eg 
when discussing paradoxes versus contradictions); but there is 
another related but broader point, which is *how much* of what is 
found at a namespace site is considered to be part of the 
'definition'? Eg see my recent message to Dan Brickley concerning an 
rdf:type link to a daml: -defined term. Or what if the website lists 
a whole lot of RDF vocabulary but gives a commentary in English about 
what it is supposed to mean? If I now use a name from that namespace, 
am I committing to the English, or just to the RDF?

>  So for documents, <http://www.w3.org/TR/XHTML> is the HTML spec 
>and W3C as publisher defines what it corresponds to.  Similarly, 
><http://www.w3.org/2000/10/swap/string#endsWith> is something 
>published by W3C. It is defined to be a daml property meaning that 
>one string
>is a suffix of the other.   Anyone using that term commits to using 
>it to mean that, or there is a bug (or maybe fraud).  That 
>definition is stated in english, as we don't have a way of stating 
>it in DAML.

So you seem to be saying that if I use that name in some RDF, then I 
am committed to its meaning string-suffix *even though there is 
nothing machine-readable anywhere that specifies that meaning*. 
Really?? I find this amazing. On this view, then, we could achieve 
the SW trivially just by adding little 'definitions' to our webpages 
like this:

EnglishWord:beautiful a dpo:Class;
     EnglishWord:label  "the meaning of beauty";
     dpo:subClassOf EnglishWord:thing.

>We can, however, publish a document 
><http://www.w3.org/2000/10/swap/string> which defines <#endsWith> as 
>being an rdf:Property with range and domain string.  That's all we 
>can do.

Indeed, and so it seems to me that that is all that it means, which 
amounts to virtually nothing. Which is fine for a kind of promissory 
note to do something later, but isn't enough to support any kind of 
useful SW activity.
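
To make "virtually nothing" concrete, here is a minimal Python sketch. It 
is not CWM or any real RDF library. The triples in NAMESPACE_DOC are an 
assumed rendering of what the string document could say about endsWith; 
the str: and xsd:string spellings and the helper entailed_types are 
hypothetical, and strings are treated as opaque tokens. Only the RDFS 
domain and range rules are applied.

```python
# Assumed machine-readable content of the namespace document (illustrative only).
NAMESPACE_DOC = {
    ("str:endsWith", "rdf:type", "rdf:Property"),
    ("str:endsWith", "rdfs:domain", "xsd:string"),
    ("str:endsWith", "rdfs:range", "xsd:string"),
}


def entailed_types(data, schema):
    """Apply the RDFS domain and range rules to `data` using `schema`."""
    domains = {(s, o) for (s, p, o) in schema if p == "rdfs:domain"}
    ranges = {(s, o) for (s, p, o) in schema if p == "rdfs:range"}
    out = set()
    for (s, p, o) in data:
        for (prop, cls) in domains:
            if prop == p:
                out.add((s, "rdf:type", cls))  # subject falls in the domain
        for (prop, cls) in ranges:
            if prop == p:
                out.add((o, "rdf:type", cls))  # object falls in the range
    return out


# A statement that is false under the English reading of endsWith.
user_data = {('"hello"', "str:endsWith", '"xyz"')}
inferred = entailed_types(user_data, NAMESPACE_DOC)
```

All that comes out is that both tokens are strings. Nothing in the result 
distinguishes a true suffix claim from a false one; the suffix meaning 
lives only in the English.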

>What is said in the namespace document is just a set of statements 
>from the point of view of DAML.  But socially, they are statements 
>you better be prepared to accept if you use the term.

Is this a society that speaks DAML or that speaks English? That is a 
serious question. Speaking DAML, agreeing to your definitions would 
be no hardship since they have virtually no content. But if the 
society is supposed to speak English, it would be very dangerous 
indeed to ever use any identifier from another namespace since it 
might be defined there to mean anything, and there would be no way 
for me (speaking now as a DAML-savvy web agent) to find out what that 
could be.

>So yes, by using it, you can be held to the things said about it in 
>that document.

Even when they are said in a language that I have no way to 
understand or even to parse, and when their own syntactic 
declarations asserted that they were in a different language? Hardly 
seems fair to me.

>
>>Also, what is that 'bound to'? I see no way that any kind of web 
>>publication by A can *enforce* any behavior or assumptions on B. 
>>The best that one could do would be to have some public 'rules of 
>>inference behavior' which might for example say that if B publishes 
>>some content using a term 'belonging' to A, then C is entitled to 
>>use anything that A says using the term when drawing conclusions 
>>from what B says in what B publishes, and B is just as responsible 
>>for any such consequences as B would be had the consequences been 
>>drawn solely from what B has published. This is a kind of cut 
>>principle for inferential responsibility.
>
>Yes, more or less.  I wouldn't call them rules of inference 
>behavior. sounds a bit like telling people what inference steps to 
>take.

Right, that isn't what I meant; bad term. It's more like a kind of 
publicly agreed calculus for deciding what agents are committing 
themselves to when they publish content for use by other software 
agents.
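
One possible reading of such a commitment calculus, as a rough Python 
sketch under stated assumptions: documents are sets of triples, the 
prefix of a term names its owner, and only one step of borrowing is 
followed. The PUBLISHED table and the functions owner and commitments are 
hypothetical names for illustration, not part of any spec.

```python
# Hypothetical published documents: publisher -> set of (subject, predicate, object).
PUBLISHED = {
    "A": {("a:Widget", "rdfs:subClassOf", "a:Product")},
    "B": {("b:item1", "rdf:type", "a:Widget")},
}


def owner(term):
    """Assumed convention: the prefix before ':' names the namespace owner."""
    return term.split(":", 1)[0].upper()


def commitments(publisher):
    """Statements C may use when drawing conclusions from `publisher`'s document."""
    own = PUBLISHED[publisher]
    used_owners = {owner(term) for triple in own for term in triple}
    borrowed = set()
    for name in used_owners:
        # Owners with no published document here (e.g. RDF, RDFS) add nothing.
        if name != publisher and name in PUBLISHED:
            borrowed |= PUBLISHED[name]
    return own | borrowed
```

Here commitments("B") contains A's subclass statement as well as B's own 
triple, so B is answerable for the conclusion that b:item1 is an 
a:Product. Whether the borrowing should be transitive is exactly the kind 
of thing the rules would have to settle.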

>>It might be worth trying to draw up some of these rules. We are 
>>going to need them, once the SWeb gets going in the real world.
>>
>
>You bet.  We need them now in the real world.  The technical 
>architecture group is trying to answer "what does a document mean" 
>in general,

I would suggest that is a *very* bad way to start, like trying to 
make a balloon to fly to the moon. It might be better to try for 
something less grandiose and more likely to actually be of use, such 
as some commitment-to-entailments rules. You don't need to define 
meaning in general, even if y'all could do it. All you need to do is 
to give enough of a pragmatic constraint on meaning to allow an agent 
to proceed in a way that all agree is in some sense appropriate or 
suitable. The 'real' meaning of almost any document is almost 
certainly unknowable to anyone but its author, and maybe not even 
then. Many documents may not even have a 'real' meaning.

>and this is a crucial part of it.  Fortunately, in practice this is 
>well understood by web users.

The central issue for the SW is surely how it is going to be 
understood by Web software agents. And in any case, I don't think it 
is well understood by human web users; there is certainly a lot of 
disagreement visible on the discussion lists, at any rate.

>(An edge case lots of people are dealing with is how you bootstrap 
>anyone as having asserted anything.  There are plenty of documents 
>on the web which are not asserted by their publishers (check 
>www.archive.org) but there is an assumption that if someone asserts 
>something on a root page of a server of a domain they own, or 
>anything that links to,  then they are asserting it.)

Is there a single coherent notion of ownership that can be applied 
here? I might not own the server of my domain, for example. In fact I 
don't: it belongs to the State of Florida.

>
>>>The specifications of HTTP are defined such that if you dereference
>>>a term, then the information you get, modulo forgery, is  a published
>>>statement about the term by its owner.
>>
>>?? About the term?? Surely not. When I access 
>>http://www.google.com/, what I get isn't about 
>>"http://www.google.com/" . It might be about Google, but that's a 
>>different claim altogether.
>
>I can only make the architecture work at all when we use foo#bar for 
>abstract concepts. When you dereference foo you get an RDF file 
>which tells you some stuff about foo#bar.
>You can't use http://www.google.com/  to identify an abstract 
>concept, it has already been used to identify a document.

Well, it's still not about 'foo#bar', more about whatever that term 
denotes. But OK, I'm happy with that (though it has not been 
incorporated into the RDF MT, since the '#' is still controversial, 
apparently.) That is what Jos DeRoo's engine assumes, and it seems to 
work quite smoothly.
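
The foo#bar convention can be sketched in a few lines of Python using the 
standard library's urldefrag. The helper name split_term is hypothetical; 
this only splits the uriref and does not fetch anything.

```python
from urllib.parse import urldefrag


def split_term(uriref):
    """Split a uriref into (document to dereference, local name within it)."""
    doc, fragment = urldefrag(uriref)
    return doc, fragment


# The term names something described in the document; the document is what you fetch.
doc, name = split_term("http://www.w3.org/2000/10/swap/string#endsWith")
```

A uriref with no fragment, such as http://www.google.com/, comes back 
with an empty local name, which matches the point that it identifies the 
document itself.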

>
>>>The information which anyone may get in this way may be useful,
>>>When you use such a term, you can be held to the implications of
>>>your statements according to the specifications.
>>
>>You can? By whom, and under what circumstances? What does 'being 
>>held to' amount to? I think that until questions like this are 
>>answered clearly, this whole discussion is meaningless.
>
>We have to draw a line.  It behooves us to define what the meaning 
>of a document is. That is, the meaning of B's document above was its 
>meaning with A's terms interpreted according to A.
>
>Then the lawyers take over to say who gets sued and to determine 
>whether there was a meeting of minds at the creation of the contract 
>and so on. We had better not go down that path, as we are providing 
>a language for social systems to use, rather than designing the 
>social system.

OK, I agree. However, it's going to get complicated when software 
starts putting together content from many sources and drawing 
conclusions from it without human intervention. Some of this has 
already surfaced in expert systems technology, eg consider a 
diagnosis system incorporating human expertise from many human 
experts, and who is to blame when it makes a misdiagnosis. Right now, 
the lawyers tend to insist that a human makes the final decision 
after consulting the machine, largely to ensure that there is 
somebody to sue. But the SW can't be run on that basis, seems to me.

>But we must be as crisp as we possibly can about the meaning of a 
>document on the [semantic] web.

Well, the way to be very crisp is to insist that all meanings are 
provided as assertions in a specified formal language. Ironically, it 
seems to be largely the W3C folk who are most uncomfortable with this 
kind of crispness; for example, it would rule out English 
commentaries added to CWM :-).

>[..]
>
>>
>>>>This
>>>>isn't too bad when you can't express a contradiction, but once you
>>>>include additional semantic primitives (as DAML+OIL does) and scale for
>>>>use in distributed environments you are bound to get contradictions if
>>>>you simply merge the information sources. Some have suggested that in
>>>>such situations that you just select one of the contradictory axioms and
>>>>throw it out. However, if different systems choose different things to
>>>>throw out, then they no longer agree on the semantics of terms used. The
>>>>only way around this problem would be to have a consistent, world-wide,
>>>>deterministic method for resolving these contradictions. Even if it was
>>>>possible to design such a thing, it would be highly impractical because
>>>>people often cannot agree on things (some of the worst cases of this
>>>>lead to wars, etc.) We can never expect the Semantic Web to be a single
>>>>consistent knowledge base!
>>>>
>>>You are absolutely right in that.  RDF was never expected to imply 
>>>that, and in fact a whole lot of energy has been expended from 
>>>time to time explaining that to people.
>>
>>No doubt the inventors of RDF did not intend to impose global 
>>consistency; such an ambition is obviously slightly insane. 
>>However, as a matter of fact, this insane assumption is built into 
>>the very architecture of RDF as it exists. RDF provides no way to 
>>make any other assumptions. It provides no way to agree or 
>>disagree, no way to define, no way to negotiate, no way even to 
>>question, any content. It only provides a simple way to make 
>>elementary assertions using a global vocabulary. It is obvious that 
>>this will not work unless there is some global coherence to the 
>>assertions made using the global vocabulary, unless there is some 
>>way to negotiate content and handle contradictions.
>>
>
>You imagine a world made up only of basic RDF.

It's not me that imagines it: it is imposed on the WebOnt group by 
its very charter, and regularly insisted upon by Dan C and others. 
Everything must be expressible in (basic) RDF syntax and compatible 
with (basic) RDF semantics. There is no conceivable technical or 
semantic justification for this idea other than the naive global 
database idea that you just rejected, so why are we still stuck with 
it?

BTW, that phrase 'basic RDF' seems to suggest that there is another, 
less basic, RDF somewhere.  I wish we could get past this 
terminological confusion. "RDF" means what the RDF Core WG is working 
on. Other, less basic, languages are not RDF.

>  If only we can get unhooked from the rat-holes and design more 
>stuff on top of RDF which allows precisely these things, then we can 
>solve the problem. You complain of the lego brick that it has no 
>windows. Well, stop complaining about the lego brick and we can make 
>compatible windows.

We can't make the windows if they have to be made *out of* the 
bricks. That is the central problem in WebOnt right now: we are being 
dragged back to square one over and over again by the silly 
requirement that everything be built 'on top of' RDF. We need to be 
able to get beyond this highly restrictive little language. It's not 
as though we don't know how. Better languages have been available for 
years. CLASSIC is a much more expressive language than RDF, for 
example. This isn't saying we should abandon RDF, only that we have 
to be able to use some other kinds of brick to do some of the other 
things.

>>>RDF provided only reification as a very crude tool for doing 
>>>fancier things.  I found that allowing an RDF set of statements 
>>>(graph, "model" for some, or better "formula") to be itself gets 
>>>you out of that hole, and allows you to handle a heterogeneous 
>>>world
>>
>>I don't see how. It makes RDF more logically expressive, but the 
>>globality assumption is built into the restriction on RDF that it 
>>use only urirefs as identifiers. Those are globally unique names, 
>>supposed to have a scope that covers the whole planet and 
>>transcends (in an ideal world) all future changes. That already 
>>embodies the insanity of the global-consistency picture.
>>
>
>Once you include quotation, then you are not bound in a system to 
>take all statements as equivalent.

Quoting a statement simply takes it out of the assertional picture 
altogether; it makes it into something like wallpaper. De-quoting, on 
the other hand, is dangerous.
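
A small Python sketch of the quoting point, under assumptions: asserted 
triples are plain tuples, a quoted formula is a frozenset sitting in the 
object position, and the ex: terms and the functions is_asserted and 
dequote are hypothetical names used only for illustration.

```python
# A quoted formula is a value (a frozenset of triples), not an assertion.
quoted = frozenset({("ex:moon", "ex:madeOf", "ex:cheese")})

graph = {
    ("ex:bob", "ex:says", quoted),
    ("ex:moon", "rdf:type", "ex:Satellite"),
}


def is_asserted(g, triple):
    """True only if `triple` appears at the top level of `g`."""
    return triple in g


def dequote(g):
    """Unsafe step: treat the contents of every quoted formula as asserted."""
    out = set(g)
    for (_, _, obj) in g:
        if isinstance(obj, frozenset):
            out |= obj
    return out


claim = ("ex:moon", "ex:madeOf", "ex:cheese")
```

While quoted, the claim is not asserted by the graph at all. After 
dequote it is asserted exactly as if the publisher had said it, which is 
where the danger lies.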

>Globally unique identifiers are powerful and useful (witness www). 
>There are practical limitations to what we can achieve  (like Error 
>404).  The real engineering of URIs in fact addresses some of the 
>issues of things changing,  things being fixed, and so on.  The 
>real engineering of real semantic web applications will use the URI 
>infrastructure and also need more things  such as vocabularies for 
>addressing, versions, and changes and so on.

OK, but the actual web doesn't really *deal* with this stuff, in 
fact: it just breaks and changes in manageable ways which *people* 
are able to handle. But for the SW we need to specify what software 
agents are going to do, and that requires at the very least some more 
attention to detail.

>There is the whole trust business. There is persistency. And so on.. 
>but we can make fairly simple building blocks to allow us to build 
>systems to address that.  
>All on top of RDF.

What a pity you added those last five words. I guess it depends on 
what you mean by 'on top of'. Of course RDF has its uses, I don't 
want to deny that. But for many people 'on top of' here seems to mean 
something much stronger, and that stronger (and more restrictive) 
vision is just mistaken, and the consequences of that mistaken idea 
are currently hampering progress to an alarming degree.

>Maybe it is necessary to have nested formulae to be able to accept 
>that RDF can work.

RDF doesn't ALLOW nested formulas. That is one of the things that is 
wrong with it.

>But in fact 99% of the stuff out there is data.  So the more we can 
>solidify of the simpler stuff, and the more applications like XMP 
>we get rolling, the better.

Amen to that. Like I said, RDF is fine in its place.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Thursday, 23 May 2002 18:11:00 UTC