RE: imports and commitment - troubled by today's call... from John Black on 2003-09-27 (public-sw-meaning@w3.org from September 2003)

From: John Black <JohnBlack@deltek.com>
Date: Sat, 27 Sep 2003 07:44:12 -0400
To: "Jim Hendler" <hendler@cs.umd.edu>, <public-sw-meaning@w3.org>
Message-ID: <D3C8F903E7CC024C9DA6D900A60725D90258D455@DLTKVMX1.ads.deltek.com>
> -----Original Message-----
> From: Jim Hendler [mailto:hendler@cs.umd.edu]
> Sent: Friday, September 26, 2003 5:00 PM
> To: public-sw-meaning@w3.org
> Subject: imports and commitment - troubled by today's call...
> 
> 
> 
> Sorry I couldn't stay to the end of the phone call today -- there were 
> a couple of things being discussed I'd like to have followed up on -- 
> one in particular is bothering me.
> 
> It has to do with imports vs. commitment to what something claims. 
> Tim said that he viewed owl:imports as more or less a "#include" 
> mechanism, and I agree.  However, if referring to a URI on another 
> page is also like a "#include" then I think we break the Semantic Web 
> -- that is, the "imports closure" of a SW document could conceivably 
> end up being a major portion of the whole semantic web if we are 
> successful and end up with lots of things pointing at each other 
> (which is certainly my vision of the SW, and I think also Tim's)
> 

I think this closure of context is essential.  I don't think it will
break the Semantic Web; I think it will *make* the semantic web.  
Without this closure of context, this project could become a YARS, 
yet-another-reasoning-system.  One thing that distinguishes the 
current effort is that statements are embedded in the context of the 
web.  Now if I had to constantly compute this closure on my laptop, 
that would be impossible.  Similarly Pat warns in 
http://lists.w3.org/Archives/Public/public-sw-meaning/2003Sep/0139.html,
"...*any* adult human being brings to bear an incredibly rich 
context of meaning and interpretation to any linguistic act. There 
is no hope of having software do anything remotely like this in the 
forseeable future;..."  I disagree. I think we will use huge engines 
running at places like Google to compute these closures.  I just 
searched the Web for the phrase "semantic network".  According to 
the search application, it covered about 3,307,998,701 pages.  Of 
these, it located about 26,400 matching pages in 0.12 seconds.  
Frankly, this does seem like software doing something remotely like 
the kind of ontological closure you're saying will break the semantic 
web and Pat says won't happen in the foreseeable future.
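
To make the term concrete: an "imports closure" is just the set of 
documents reachable by following import links transitively.  Below is a 
minimal sketch, assuming a toy in-memory table of links.  The document 
names and the IMPORTS table are hypothetical stand-ins, not real 
ontologies and not any existing tool's API.

```python
from collections import deque

# Hypothetical import graph: each document maps to the documents it imports.
IMPORTS = {
    "my-doc": ["nci"],
    "nci": ["cyc", "wordnet"],
    "cyc": ["wordnet"],
    "wordnet": [],
}

def imports_closure(start, imports):
    """Return every document reachable from `start` by following imports."""
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for target in imports.get(doc, []):
            if target not in seen:  # visit each document once, so cycles terminate
                seen.add(target)
                queue.append(target)
    return seen

print(sorted(imports_closure("my-doc", IMPORTS)))
```

The walk itself is cheap per document; the cost in question is how many 
documents end up in the result once everything points at everything else.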

> I don't have a solution in mind yet, but I really want to be able to 
> tease apart a few different situations:
> 
> 1 - I think the NCI ontology (17000 classes) is great, and I want to 
> let people coming to my documents know that my document concurs with 
> it
>     this can be handled by my saying my document owl:imports NCI 
> (although that might cause me to have to read in a whole lot of 
> classes)
> 
> 2 - I look at the NCI ontology and examine a small portion of it.  I 
> think that part is good (the part on oncogenes), but I'm not sure 
> about the whole document (which contains stuff about lifestyles, 
> about fast food restaurants, and lots of other things) -- I might like 
> my document to say that I use certain terms from that document, but 
> am not willing to "commit" to the others (I don't say I disagree with 
> the others, just that I'm not willing to buy in)
>     I haven't seen any mechanism to do this, although at one point 
> Bijan suggested a mechanism in which the owl:ontology statement could 
> include a set of URIs from that or other documents and give them a 
> name together.  This was roundly rejected by Peter and Ian, among 
> others, but I still think it had merit (esp in light of the 
> discussion on this list)
 
If constant, aggressive, far-reaching closure of context is essential, 
as I claim above, then I agree, some sort of sifting or selection of 
the relevant portions of an ontology would be a good idea.
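
One very simple form such sifting could take is sketched below: keep 
only the statements whose subject is a term the author has chosen to 
commit to.  This is an illustration under stated assumptions, not the 
mechanism Bijan proposed and not anything defined by OWL; the triples, 
the "nci:" term names, and the subject-only filter are all hypothetical.

```python
# Hypothetical (subject, predicate, object) statements from a large ontology.
TRIPLES = [
    ("nci:Oncogene", "rdfs:subClassOf", "nci:Gene"),
    ("nci:MYC", "rdf:type", "nci:Oncogene"),
    ("nci:FastFoodRestaurant", "rdfs:subClassOf", "nci:Restaurant"),
]

def select_committed(triples, committed_terms):
    """Keep only the statements whose subject is a committed term."""
    return [t for t in triples if t[0] in committed_terms]

# The author commits to the oncogene terms and stays silent on the rest.
subset = select_committed(TRIPLES, {"nci:Oncogene", "nci:MYC"})
for statement in subset:
    print(statement)
```

Statements left out are neither endorsed nor denied, which matches the 
"not willing to buy in" reading of case 2.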

> 3 - I'm looking for a way to mark up some instance data, and I have a 
> database of information about genetic loci - I see that the NCI 
> ontology has a list of these loci (MYC, PVT, etc) so in my document I 
> define some properties of the nci:locus class and assert my 
> information -- this seems valuable to me because I figure other 
> people will decide if they like the NCI ontology, and if they do 
> maybe they'll find my data and properties useful.  (This is a real 
> situation we're trying to encourage some large genetic DB providers 
> to buy into) - the user also may find some other cancer ontologies 
> and define some properties on the terms from that as well..
>      Difference in this case from 2 is that this user is trying to add 
> their own information to be used with some ontology, and doesn't 
> really care what is in the parent ontology other than some particular 
> class they want to use - perhaps the same mechanism could be used as 
> in 2, but might be a lot of extra work over just using a URI 
> reference   (This is my personal favorite for what a URI reference 
> without an imports statement should do)
> 
> 
> In essence, I like Tim's idea of a protocol, and that somehow it is 
> between the user and the definer of the URI, but I'm worried that if 
> it becomes transitive (i.e. protocol gets B to understand A, gets C 
> to understand B, gets D to understand C, ...) we cannot distinguish 
> the cases above, or worse, we end up with an everything imports 
> everything type situation (I recently created a version of part of 
> the NCI ontology that includes a reference to something in CYC and to 
> something in WordNet -- my document contains about 20 lines, but if 
> you have to bring in all those things to "understand" it, you get 
> well over a million triples -- this strikes me as a problem)
 
As I say, this is the very thing that distinguishes the current efforts 
from dozens or hundreds of reasoning systems that have been built before 
it.  So I agree, but I would qualify and say it is a *research* 
problem.
 
> -- 
> Professor James Hendler                         hendler@cs.umd.edu
> Director, Semantic Web and Agent Technologies   301-405-2696
> Maryland Information and Network Dynamics Lab.  301-405-6707 (Fax)
> Univ of Maryland, College Park, MD 20742        *** 240-277-3388 (Cell)
> http://www.cs.umd.edu/users/hendler      *** NOTE CHANGED CELL NUMBER ***
> 
> 
John Black
Senior Software Architect
Deltek Systems, Inc.
13880 Dulles Corner Lane
Herndon, VA 20171
JohnBlack@deltek.com
703-885-9646 - Office (Tues,Wed,Thur)
434-964-1936 - Home Office (Mon,Fri)
434-825-3765 - Mobile (Anytime)
Received on Saturday, 27 September 2003 07:48:35 UTC