- From: Jim Hendler <hendler@cs.umd.edu>
- Date: Mon, 30 Sep 2002 12:26:54 -0400
- To: Jeff Heflin <heflin@cse.lehigh.edu>
- Cc: WebOnt <www-webont-wg@w3.org>
At 10:46 AM -0400 9/30/02, Jeff Heflin wrote:
>Jim,
>
>Thanks for the arguments in favor of and against proposal 2. I think it
>is important that all the pros and cons be identified and that we have a
>debate on this, so that the WG can truly make the best decision, whether
>that be in favor of proposal 1, 2, or something as yet undetermined.
>
>That said, I'd like to discuss your points:
>
>Proposal #1 requires a new MIME type
>-------------------------------------
>I find this an interesting point. Does the W3C have any documentation
>that says when a new MIME type is required or recommended? On one hand,
>I don't see why we need a new one, because we are just using our own XML
>schema to describe the Ontology, imports, etc. tags. Thus, it would seem
>we could just use the XML MIME type. Certainly, the W3C doesn't require
>a new MIME type for each schema? However, on the other hand, our
>language does have special semantics that most XML schemas don't have,
>and perhaps the MIME type is used to indicate to applications that they
>should process it in a different way. This makes sense, but then it
>seems to me OWL should have its own MIME type regardless. After all, we
>have a different semantics from RDF (even if it is just additive). So,
>it seems to me either both proposals or neither requires the new MIME
>type, and I'm leaning toward both of them needing one.

I'm not the expert on this stuff -- I hope Dan Connolly or Massimo will
correct me if I'm wrong -- but I think going to a separate MIME type
would require much more motivation than this. If you insist OWL can only
be used through an XML schema, then I will point out that this disagrees
with the f2f decisions taken by the WG. If you say no, we want to be RDF
parsable, then we need to go by RDF rules. I think the location of our
metadata is not such an important issue that it is worth reopening those
decisions (that's my opinion).

>Proposal 1 would require RDF tools to read a document twice to get
>import information
>--------------------------------------------------------------------------
>Any tools that care about import information would use an XML parser to
>extract it, and then pass the RDF subtree of the document to an RDF
>parser. There's absolutely no reason to read the document twice. If the
>application is a plain-old RDF application that doesn't realize this,
>then it will never have heard of imports in the first place and won't
>care about imports information.

Exactly -- so maybe I'm now in favor of your solution, because it means
most people won't care about imports.

>I can see that there is a slight cost in tools, because now all of your
>RDF tools need an extra 10 lines of code to write out proper OWL, but I
>think that cost is negligible, because the OWL tools that users will
>find easiest are those that have some built-in support for OWL. That is,
>ontologies will be a central aspect (as opposed to just another class),
>the "parseType:Collection" ugliness will be handled automatically for
>you, etc. In other words, in order for OWL to succeed, there will have
>to be OWL specific tools anyway.

Well, many current DAML tools, including all of my group's open source
stuff, are RDF tools to which some OWL has been added -- the goal is to
get to at least OWL Lite for all of them. So far we have been able to do
this easily -- changing the initial parsing, rather than the
interpretation of the graph as we process it, would be a bigger change
than you imply.
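To be concrete about what that change to the initial parsing involves,
the two-stage read under proposal #1 would look roughly like this -- a
sketch only, since the proposal's concrete syntax isn't fixed; the
wrapper element names, namespace, and attribute naming below are
hypothetical:

    # Sketch of the two-stage read: one XML parse pulls the ontology
    # header (imports etc.) out of the wrapper element, then the embedded
    # rdf:RDF subtree is handed off to an ordinary RDF parser.
    # Everything marked "hypothetical" is illustrative, not from the
    # proposal text.
    import xml.etree.ElementTree as ET

    ONT = "{http://example.org/owl-shell#}"                  # hypothetical
    RDFNS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"  # real RDF ns

    def read_owl_document(path):
        root = ET.parse(path).getroot()
        # Step 1: collect imports from the non-RDF ontology header.
        imports = [e.get("resource") for e in root.iter(ONT + "imports")]
        # Step 2: hand the RDF subtree to a real RDF parser; we just
        # re-serialize it here as a stand-in for rdf_parser.parse(...).
        rdf_subtree = root.find(RDFNS + "RDF")
        graph = ET.tostring(rdf_subtree) if rdf_subtree is not None else b""
        return imports, graph

One pass over the document, as Jeff says -- but note the OWL-aware layer
now sits in front of, not behind, the RDF parser.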
I agree there will be some OWL specific tools, but if all OWL use
requires OWL specific tools, then I fear it will never catch on. In
fact, the great success of your SHOE system in gaining acceptance came
from your recognition that it was important to keep it interoperable
with HTML tools -- otherwise it would have just been some KR language on
the web, and gotten much the same attention as several others that never
got to the starting gate. OWL will gain immediate penetration from
playing nice with RSS, RDF-XMP (Adobe's metadata system), and other
existing RDF tools, and I fear that the import thing, being only
somewhat defined as it is, is not worth breaking this over.

>If people think it is important for RDF to process imports information,
>then I suggest we ask RDF Core to consider extending RDF to handle it
>correctly. This could be done by first allowing RDF to make statements
>about graphs (as opposed to about a resource that we pretend represents
>a document or a graph), and then adding an official imports property
>that has a new thing called Graph as its domain and range. We would
>also need a way to give an identifier to a graph, which could probably
>be done by adding an ID attribute in the <rdf:RDF> tag.

I suspect we could discuss with RDF Core the idea of there being
something in the <rdf:RDF> tag to help -- perhaps an "RDF-Profile" or a
pointer to an (optional) "RDF Header" graph. If it appears we need such
mechanisms, I'll be happy to bring that up in the SWCG -- but I'm not yet
convinced we couldn't do this with RDF as is.

>Proposal #1 can't have instances and classes in the same document
>------------------------------------------------------------------
>Not necessarily. Although my proposal said "class and property
>definitions go here," there is no reason why instance statements
>couldn't go in the same place, particularly if the instances were
>important to the ontology. I don't follow your argument about having to
>import the instances if you import the ontology. Why wouldn't you want
>to? If someone decides the instances are part of the ontology, then when
>you import it, you should import the whole ontology. Note that in
>proposal #2 the same thing is true, because there an import means you
>import the whole document. Thus, if you had classes and instances in the
>document, you would import both as well.

OK, so how do I have instances in a separate document? Can it do an
import? If not, how do I know what semantics to impart? If yes, are you
saying they must be in an ontology definition? That would definitely
break a lot of tools that output RDF as triples -- not as XML documents.

>Proposal #2 will make it easier to convert DAML+OIL to OWL
>------------------------------------------------------------
>This might be true to some extent, because I believe that as it stands
>now, the conversion is simply a series of find-and-replaces, so you
>could do it all with a simple Unix script. However, I do not believe
>that proposal #1 would require you to save a temporary file in order to
>do the conversion. In the worst case, you'd have to do two reads of the
>DAML+OIL document: one to collect up the ontology and imports
>information, and one to create the OWL document with it in the right
>place. However, since the DAML+OIL convention is to put the ontology
>stuff at the top, I think a one-pass program would work in most if not
>all cases. Even so, since conversion tools only need to be used once per
>document, the two-pass algorithm isn't that expensive.

Well, we use a simple Perl script to import all sorts of things into
OWL, and we also have a Python front end for RIC so we can read N3, and
an RDF triples to Parka converter, and a few other things that would
have to be rewritten either to create documents or to do explicit
imports -- but all those could be rewritten, or we could ignore
imports...
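For what it's worth, the find-and-replace conversion we're both
describing is roughly this -- a sketch, showing only a sample of the
actual term renamings (the real mapping covers many more terms, and a
pure textual replace can misfire on literals that happen to contain
these strings):

    # Minimal sketch of the find-and-replace style DAML+OIL -> OWL
    # conversion: swap the namespace, then the prefixed names.
    DAML = "http://www.daml.org/2001/03/daml+oil#"
    OWLNS = "http://www.w3.org/2002/07/owl#"

    RENAMES = {
        DAML: OWLNS,                          # move the namespace wholesale
        "daml:Class": "owl:Class",            # then fix the prefixed names
        "daml:ObjectProperty": "owl:ObjectProperty",
        "daml:Restriction": "owl:Restriction",
    }

    def daml_to_owl(text):
        for old, new in RENAMES.items():
            text = text.replace(old, new)
        return text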
>I look forward to your counter-arguments. I think this is a very useful
>and important discussion.

The real problem is that I think you've missed the key argument: not
having the imports statement in the graph means we cannot have
non-document-based tools for handling OWL, unless they ignore imports
(which is actually okay with me). If we say the owl:Ontology statements
do go in the graph, then we can put them there. If we say they don't,
then we lose interoperability. Your approach cannot have it both ways --
you think you can because you're starting from the assumption that
everything lives in documents -- but that isn't true. Once my crawler
grabs stuff and pulls it into ParkaSW, for example, all we keep around
are the triples (including an extra one with a pointer back to the
original document, if we started from one). Mike Dean does the same with
his DAML crawler [1] (5,000,000 DAML assertions found so far on 20k+
pages) -- the assertions go into his DAML DB, and thus you could not
query for imports statements once things were in the graph -- unless he
puts them there, in which case why don't we just do it in the first
place?

[1] http://www.daml.org/crawler/
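To put that point in code: once everything lives in a triple store, an
import is findable only if it is itself a triple. A toy illustration --
the store and its interface are made up, standing in for Parka or any
other triple DB:

    # Toy triple store. Once a document has been reduced to triples, a
    # query can only see what is asserted in the graph -- so imports must
    # be a triple to be queryable at all.
    OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"

    store = set()

    def crawl(doc_uri, triples):
        store.update(triples)  # the document itself is gone after this

    def find_imports():
        return [(s, o) for (s, p, o) in store if p == OWL_IMPORTS]

    # If the crawler asserted (ontA, owl:imports, ontB) into the graph,
    # the query finds it; if imports lived only in the XML wrapper, it
    # is invisible here.
    crawl("http://example.org/ontA", {
        ("http://example.org/ontA", OWL_IMPORTS, "http://example.org/ontB"),
    })
    assert find_imports() == [("http://example.org/ontA",
                               "http://example.org/ontB")]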
-- 
Professor James Hendler                          hendler@cs.umd.edu
Director, Semantic Web and Agent Technologies    301-405-2696
Maryland Information and Network Dynamics Lab.   301-405-6707 (Fax)
Univ of Maryland, College Park, MD 20742         240-731-3822 (Cell)
http://www.cs.umd.edu/users/hendler

Received on Monday, 30 September 2002 12:27:08 UTC