- From: Pat Hayes <phayes@ihmc.us>
- Date: Tue, 14 Aug 2007 14:04:06 -0500
- To: "Valentin Zacharias" <Zacharias@fzi.de>
- Cc: "John F. Sowa" <sowa@bestweb.net>, "[ontolog-forum]" <ontolog-forum@ontolog.cim3.net>, "Ivan Herman" <ivan@w3.org>, "Juan Sequeda" <juanfederico@gmail.com>, "SW-forum list" <semantic-web@w3.org>
>Hi !
>
>please excuse the late answer. Thank you for your interesting comments
>(some of which I'll have to read up on). Below I've only included the
>three points where I still see substantial disagreement.
>
>[Valentin]
>>> Now I understood your reply as meaning that
>>> for you the OWL semantics are a
>>> kind of minimalistic, basic semantics;
>>> that people are free to use other
>>> kinds of assumptions and reasoning on top, kind of at
>>> their own risk.
>
>[Pat]
>> Well, say RDF for the minimal, basic, then yes.
>> And the 'own risk' needs to be clarified. People
>> are, and always will be, free to take information
>> on the Web and use it in any way they please,
>> drawing conclusions at their own risk. The issue
>> comes when they publish those conclusions for
>> others to use, and whether that 'risk' is then as
>> it were transmitted to others without their
>> knowing about it. That seems unreasonable.

I didn't express this very well. See below for a clearer explanation.

>
>(Virtually?) all information published on the web is the result of some
>reasoning process not accessible to whoever is using this information -
>just consider the information processing involved before a (surely
>false) statement like "china population_total 1321851888"[1] gets
>created. I don't see why information created using Semantic Web data &
>information should be held to higher standards.
>
>At the end of the day it comes down to trust - having a checkable proof
>that shows how some information was derived from trusted sources is one
>way to increase my trust in some statements - but it's not the only one
>(I may have blind trust in the information processing agent, may ask for
>second opinions, may check for "plausibility", i.e. whether it conflicts
>with anything I know ...)

All this is true. But you are making a very large, broad point, and I am
making a smaller, technical, point. We cannot legislate, by imposing mere
standards, that people will all tell the truth or all agree, of course.
But what we can do is to ensure that simple technical problems do not
impede trust or communication artificially. One step in this direction is
to require that all communications of content on the Web use a formalism
with an agreed-on semantics that supports a monotonic form of reasoning.
This still requires me to trust you when I read some RDF or OWL from your
website and decide to accept it, and to trust Harry when I read some RDF
or OWL from Harry's website and decide to accept it; but at least it
allows me, having decided to trust you both, to then have equal trust in
any conclusions I might draw from the information you both give me, using
the globally sanctioned semantics of the language we all use to
communicate. And this does not require you and Harry, or the three of us,
to be in any special collusion, or members of a private club or society
which has agreed-on but unspoken conventions of meaning, such as an
unstated UNA; or indeed even to be aware of one another: only to be all
users of the global semantic web.
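
To make that concrete, here is a minimal sketch (Python with rdflib; the
example.org URIs and data are hypothetical, echoing the population triple
quoted above) of what a shared monotonic semantics buys when merging RDF
from two independent publishers:

    # Minimal sketch: monotonic merging of RDF from two independent sources.
    # (Hypothetical example.org data; any RDF toolkit would do as well.)
    from rdflib import Graph, URIRef

    # Data as two sites that know nothing of each other might publish it.
    site_a = Graph().parse(data="""
        @prefix ex: <http://example.org/> .
        ex:china ex:population_total 1321851888 .
    """, format="turtle")

    site_b = Graph().parse(data="""
        @prefix ex: <http://example.org/> .
        ex:china ex:capital ex:beijing .
    """, format="turtle")

    # An RDF merge is just the union of the triples: adding site_b never
    # retracts or invalidates anything entailed by site_a alone
    # (monotonicity), so trust in each source carries over to trust in
    # conclusions drawn from the merge -- with no private conventions
    # shared between the two publishers.
    merged = site_a + site_b

    pattern = (URIRef("http://example.org/china"),
               URIRef("http://example.org/population_total"),
               None)
    assert list(site_a.triples(pattern))   # conclusion from site_a alone...
    assert list(merged.triples(pattern))   # ...still holds after the merge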

>And even if some agent is able to provide a checkable proof showing how
>a statement can be derived from trusted sources, I still need a bit of
>trust in this agent; I still need to believe that it has not
>'intentionally' excluded other sources that state conflicting
>(inconsistent) information and hence would have invalidated any
>conclusions (like the rumoured real-life behavior of some pharmaceutical
>companies to only selectively publish reports from clinical trials that
>worked well).

Actually I disagree here. If the agent supplies me with a checkable
proof, then I can detect any places where that agent has excluded some
source *that is relevant to the proof*. Of course I cannot distinguish
between deliberate ignoring of contradictory evidence and mere ignorance,
but a demonstration of a failure to find contradictory evidence, and a
proof, are two different things. I wouldn't expect most SWeb agents to
supply me with all arguments pro and con a given query, from anywhere on
the Web (though I expect such services will in fact arise and will be
used.)

>
>[...]
>
>[Pat]
>>>>I think you are muddling the chaotic state of the
>>>>Web with the idea that information on the Web
>>>>must be somehow faulty or inconsistent, [...]
>
>[Valentin]
>>>It is indeed my conviction that information on the
>>>web will always be
>>>faulty and inconsistent
>
>[Pat]
>> It will be *globally* inconsistent, yes. So what?
>> Nobody plans to download every Web ontology into
>> one gigantic database. And the key issue for the
>> global semantics is that it allow agents to
>> reliably detect such inconsistencies, which is
>> another argument for a classical model theory.
>
>I chose the examples of Wikipedia and Cyc (later in the email)
>intentionally because they do not equal "every ontology on the web" ...
>but in the end only time will tell how important inconsistency will
>become. However, "faulty" is an entirely local property, and to me an
>"assumption of correctness" seems every bit as "dangerous" as that of
>unique names or a closed world.
>
>(Should we then assume that everything is false and give up? I think we
>need to accept that everything is fraught with uncertainty, 'dangerous'.
>Then we can see the 'dangers' of things like UNA and NAF in context and
>at the same time start to search for formalisms & algorithms with
>uncertainty at their core.)

I want to be more careful about that 'core' idea. We need ways to reason
about conflicts and uncertainty and trust, I agree. But making the medium
of interchange itself into, say, a fuzzy or probabilistic logic achieves
nothing towards this goal. One can lie about a probability just as easily
as about a binary truth-value: all that happens when the medium of
information exchange is made more complex is that the problems of
determining trust themselves become (much) more complicated.
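
As a toy illustration of the "reliably detect such inconsistencies" point
quoted above (again Python with rdflib; the data is invented, and only
one direct clash pattern is checked where a real agent would run a full
OWL reasoner): two sources disagree about whether two names denote the
same individual, and under OWL's classical semantics the merged graph is
detectably inconsistent rather than silently wrong. No unstated UNA is
involved; the distinctness claim is explicit.

    # Toy illustration: detecting an inconsistency in merged RDF/OWL data.
    from rdflib import Graph
    from rdflib.namespace import OWL

    source_1 = Graph().parse(data="""
        @prefix ex:  <http://example.org/> .
        @prefix owl: <http://www.w3.org/2002/07/owl#> .
        ex:evening_star owl:sameAs ex:morning_star .
    """, format="turtle")

    source_2 = Graph().parse(data="""
        @prefix ex:  <http://example.org/> .
        @prefix owl: <http://www.w3.org/2002/07/owl#> .
        ex:evening_star owl:differentFrom ex:morning_star .
    """, format="turtle")

    merged = source_1 + source_2

    # Under OWL's classical semantics, x owl:sameAs y together with
    # x owl:differentFrom y has no model: the merge is inconsistent, and
    # an agent can detect that fact instead of quietly drawing
    # conclusions from contradictory data.
    clashes = [
        (s, o)
        for s, _, o in merged.triples((None, OWL.sameAs, None))
        if (s, OWL.differentFrom, o) in merged
        or (o, OWL.differentFrom, s) in merged
    ]

    print("inconsistent merge detected:" if clashes
          else "no clash found:", clashes)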

>> Nobody plans to download every Web ontology into
>> one gigantic database.
>
>Considering that Google already has a cached copy of the shallow web in
>one giant file system - why not? Not reason with everything at the same
>time

That was my point.

>- but have it indexed and at hand. I'd expect that.

Sure. So what?

Pat

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Tuesday, 14 August 2007 19:04:31 UTC