RE: imports and commitment - troubled by today's call... from pat hayes on 2003-09-30 (public-sw-meaning@w3.org from September 2003)

From: pat hayes <phayes@ihmc.us>
Date: Mon, 29 Sep 2003 20:20:32 -0500
To: "John Black" <JohnBlack@deltek.com>
Cc: public-sw-meaning@w3.org
Message-Id: <p06001f38bb9e19bd920a@[10.0.100.25]>
>> -----Original Message-----
>>  From: Jim Hendler [mailto:hendler@cs.umd.edu]
>>  Sent: Friday, September 26, 2003 5:00 PM
>>  To: public-sw-meaning@w3.org
>>  Subject: imports and commitment - troubled by today's call...
>>
>>
>>
>>  Sorry I couldn't stay to the end of the phone call today -- there were a couple
>>  of things being discussed I'd like to have followed up on -- one in
>>  particular is bothering me
>>
>>  It has to do with imports vs. commitment to what something claims.
>>  Tim said that he viewed owl:imports as more or less a "#include"
>>  mechanism, and I agree.  However, if referring to a URI on another
>>  page is also like a "#include" then I think we break the Semantic Web
>>  -- that is, the "imports closure" of a SW document could conceivably
>>  end up being a major portion of the whole semantic web if we are
>>  successful and end up with lots of things pointing at each other
>>  (which is certainly my vision of the SW, and I think also Tim's)
>>
>
>I think this closure of context is essential.  I don't think it will
>break the Semantic Web, I think it will *make* the semantic web. 
>Without this closure of context this project could become a YARS,
>yet-another-reasoning-system.  One thing that distinguishes the
>current effort is that statements are embedded in the context of the
>web.  Now if I had to constantly compute this closure on my laptop,
>that would be impossible.

I think you just made Jim's point. If imports 
really does mean 'include' then your laptop would 
be trying to compute the closure.
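
To make the 'include' reading concrete, here is a minimal sketch 
of what computing an imports closure involves. It is only an 
illustration: the document names and the IMPORTS table are 
hypothetical, and a real client would have to fetch and parse each 
document over the network rather than look it up in a dictionary.

```python
from collections import deque

# Hypothetical import graph: document name -> documents it imports.
# (Illustrative names only; nothing here comes from a real ontology.)
IMPORTS = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": [],
}

def imports_closure(start, imports):
    """Return every document reachable from `start` by following imports."""
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for target in imports.get(doc, []):
            if target not in seen:  # visit each document only once
                seen.add(target)
                queue.append(target)
    return seen

closure = imports_closure("A", IMPORTS)
print(sorted(closure))
```

Under the 'include' reading, everything in that set has to be 
loaded before reasoning can start, which is exactly the cost the 
laptop cannot bear once the graph is Web-sized.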

>  Similarly Pat warns in
>http://lists.w3.org/Archives/Public/public-sw-meaning/2003Sep/0139.html,
>"...*any* adult human being brings to bear an incredibly rich
>context of meaning and interpretation to any linguistic act. There
>is no hope of having software do anything remotely like this in the
>foreseeable future;..."  I disagree. I think we will use huge engines
>running at places like Google to compute these closures.

Different topic. Google doesn't do any meaning 
analysis. (Well, that is not quite fair: but it 
doesn't do what I was talking about.)

>I just
>searched the Web for the phrase, "semantic network".  According to
>this search application, I reviewed about 3,307,998,701 pages.  Of
>these, this search engine located about 26,400 pages in 0.12 seconds. 
>Frankly, this does seem like software doing something remotely like
>the kind of ontological closure you're saying will break the semantic
>web and Pat says won't happen in the foreseeable future.

I was talking about something else entirely. 
(Now, to be fair, you can make out a case - not 
one that I find persuasive, but there are those who 
do - along the lines that the sheer size of the 
database represented by the Web makes all this 
contextual sophistication irrelevant, and the 
human-style smarts can and should be replaced by 
raw statistical techniques.  But even if this 
were true, it would require a Google on every 
laptop.)

>
>>  I don't have a solution in mind yet, but I really want to be able to
>>  tease apart a few different situations:
>>
>>  1 - I think the NCI ontology (17000 classes) is great, and I want to
>>  let people coming to my documents know that my document concurs with
>>  it
>>      this can be handled by my saying my document owl:imports NCI
>>  (although that might cause me to have to read in a whole lot of
>>  classes)

Surely the key point is that to say it's great and 
to tell people that you agree with it and that 
they ought to take it seriously when performing 
inferences from what you say, etc. - call this 
*endorsing* it - is one thing, and copying it all 
out into your address space is something else 
entirely. And to use the latter to do the former 
is like trying to move someone's house next to 
your finger instead of pointing to it (or writing 
their address on a letter). The SW really does 
need an endorsing technique, to be sure; in fact 
it will pretty soon need a whole range of 
endorsing, qualifying, explicitly rejecting and 
generally being able to comment on other SW 
content sources. None of this has got much to do 
with COPYING, however.

>>
>>  2 - I look at the NCI ontology and examine a small portion of it.  I
>>  think that part is good (the part on oncogenes), but I'm not sure
>>  about the whole document (which contains stuff about lifestyles,
>>  about fast food restaurants, and lots of other things) -- I might like
>>  my document to say that I use certain terms from that document, but
>>  am not willing to "commit" to the others (I don't say I disagree with
>>  the others, just that I'm not willing to buy in)
>>      I haven't seen any mechanism to do this

Why do you need one? Just USE THOSE TERMS in your 
document. That's all you need to do. Then 
anyone/thing reading your document will find them, 
and they will direct him/her/it to NCI, where it 
will find some stuff which it can use in drawing 
conclusions.  If you don't use any of the fast 
food vocabulary and there isn't any inference path 
from what you have used to it, then it won't get 
used (or at any rate, not because of anything you 
have said.)
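
A minimal sketch of that point, under loudly stated assumptions: 
the term names and the DEFINED_VIA table below are invented for 
illustration (they are not taken from the real NCI ontology), and 
'inference path' is approximated crudely as 'the definition of one 
term mentions another'.

```python
# Hypothetical term-level graph: term -> terms mentioned in its definition.
# All names are made up for illustration.
DEFINED_VIA = {
    "nci:Oncogene": ["nci:Gene"],
    "nci:Gene": [],
    "nci:FastFoodRestaurant": ["nci:Lifestyle"],
    "nci:Lifestyle": [],
}

def terms_reached(used, defined_via):
    """Return the terms a reader is led to, starting only from the terms used."""
    reached = set()
    stack = list(used)
    while stack:
        term = stack.pop()
        if term in reached:
            continue
        reached.add(term)
        stack.extend(defined_via.get(term, []))
    return reached

# A document that uses only the oncogene vocabulary.
reached = terms_reached(["nci:Oncogene"], DEFINED_VIA)
print(sorted(reached))
```

The fast food terms sit in the same source document but are never 
reached, because nothing the author actually used leads to them.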

>>  although at one point
>>  Bijan suggested a mechanism in which the owl:ontology statement could
>>  include a set of URIs from that or other documents and give them a
>>  name together.  This was roundly rejected by Peter and Ian, among
>>  others, but I still think it had merit (esp in light of the
>>  discussion on this list)
>
>If constant, aggressive, far-reaching closure of context is essential,
>as I claim above, then I agree, some sort of sifting or selection of
>the relevant portions of an ontology would be a good idea.
>
>>  3 - I'm looking for a way to mark up some instance data, and I have a
>>  database of information about genetic loci - I see that the NCI
>>  ontology has a list of these loci (MYC, PVT, etc) so in my document I
>>  define some properties of the nci:locus class and assert my
>>  information -- this seems valuable to me because I figure other
>>  people will decide if they like the NCI ontology, and if they do
>>  maybe they'll find my data and properties useful.  (This is a real
>>  situation we're trying to encourage some large genetic DB providers
>>  to buy into) - the user also may find some other cancer ontologies
>>  and define some properties on the terms from that as well..
>>       Difference in this case from 2 is that this user is
>>  trying to add
>>  their own information to be used with some ontology, and doesn't
>>  really care what is in the parent ontology other than some particular
>>  class they want to use - perhaps the same mechanism could be used as
>>  in 2, but might be a lot of extra work over just using a URI
>>  reference   (This is my personal favorite for what a URI reference
>>  without an imports statement should do)

Well, it can't ensure that others will use your 
stuff, presumably: but I agree this seems 
sensible and again, you don't need to do anything 
SW-special: if others start using your URIs then 
there is an access path back from their use to 
your subclass assertion to the NCI origin for the 
superclass.

All of this seems to be what might be called 
normal-usage on the SW. Ontologies use terms 
originating in other ontologies, using the normal 
Web linking to provide the traces back for users, 
so that inference engines can hopefully find 
relevant content. To the extent that the links 
aren't broken and the SW-markup composers have 
their act together (failure in both of which will 
be rapidly detectable), this will probably work 
reasonably well with the current 'design'  or at 
any rate the emerging best-practice. I take it 
that our current goal here is to articulate this 
best-practice vision as concisely and 
'reasonably' as possible without treading on 
anyone's methodological toes.

>>
>>
>>  In essence, I like Tim's idea of a protocol, and that somehow it is
>>  between the user and the definer of the URI, but I'm worried that if
>>  it becomes transitive (i.e. protocol gets B to understand A, gets C
>>  to understand B, gets D to understand C, ...) we cannot distinguish
>>  the cases above, or worse, we end up with an everything imports
>>  everything type situation (I recently created a version of part of
>>  the NCI ontology that includes a reference to something in CYC and to
>>  something in WordNet -- my document contains about 20 lines, but if
>>  you have to bring in all those things to "understand" it, you get
>>  well over a million triples -- this strikes me as a problem)
>
>As I say, this is the very thing that distinguishes the current efforts
>from dozens or hundreds of reasoning systems that have been built before
>it.  So I agree, but I would qualify and say it is a *research*
>problem.

I think we can do better than that. Imports 
needn't be understood as 'now you must copy the 
transitive closure here before proceeding'. A 
much better way to interpret it would be 
something like 'this ontology is intended to be 
read while supposing that ontology is true', i.e. 
that this ontology gives you, the reader, an 
explicit licence to draw any conclusions from the 
imported ontology as well as, and together with, 
it, the importing ontology. Not, of course, that 
you actually *need* that licence, since you are 
free to draw any conclusions that you feel like 
drawing from whatever sources you, in your 
wisdom, decide to trust for whatever purposes 
suit you best.  But (a) the importing ontology is 
trying to be helpful and (b) the endorsement 
links may add up to a kind of SW-googlizable subWeb 
if there are enough of them, which I guess 
everyone hopes is going to happen some day, and 
(c) if your inference screws up, your lawyers may 
be able to put some of the blame on me even if 
the cause of the screwup is in the imported 
ontology; so I have, by taking this risk, 
exhibited some reason why if you trust me, you 
maybe should also trust that.
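
One way to picture the difference between endorsing and copying 
is the rough sketch below. Everything in it is an assumption made 
for illustration: the Ontology class, the premises function, the 
example.org URIs and the statement strings are all hypothetical, 
and it looks only one endorsement step deep rather than following 
links transitively.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    uri: str
    statements: list = field(default_factory=list)
    # Endorsements are recorded as URIs only; nothing is copied in.
    endorses: list = field(default_factory=list)

def premises(doc, registry, trusted):
    """Statements a reader chooses to reason with: the document's own,
    plus those of endorsed ontologies the reader has decided to trust."""
    result = list(doc.statements)
    for uri in doc.endorses:
        if uri in trusted and uri in registry:
            result.extend(registry[uri].statements)
    return result

# Hypothetical documents and a stand-in for fetching them from the Web.
nci = Ontology("http://example.org/nci", ["Oncogene subClassOf Gene"])
mine = Ontology("http://example.org/mine", ["MYC type Oncogene"],
                endorses=[nci.uri])
registry = {nci.uri: nci}

with_trust = premises(mine, registry, trusted={nci.uri})
without_trust = premises(mine, registry, trusted=set())
print(with_trust)
print(without_trust)
```

The importing document stays 20 lines long either way; the 
endorsement is a licence the reader may take up or ignore, not 
a payload.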

Pat Hayes
-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 29 September 2003 21:20:37 UTC