- From: Jim Hendler <hendler@cs.umd.edu>
- Date: Wed, 2 Oct 2002 12:50:29 -0400
- To: webont <www-webont-wg@w3.org>
>Sender: heflin@EECS.Lehigh.EDU
>Date: Wed, 02 Oct 2002 12:06:00 -0400
>From: Jeff Heflin <heflin@cse.lehigh.edu>
>Organization: Lehigh University
>X-Accept-Language: en
>To: Jim Hendler <hendler@cs.umd.edu>
>Subject: Re: LANG: owl:import - Two Proposals
>
>Jim,
>
>I think you sent this only to me, but I imagine you meant to send it to
>the mailing list. If so, please do and I will respond there.
>
>Jeff

thanks Jeff - all - here's a message I should have sent to you all - especially the call at the end for other people to weigh in on this issue!

>Jim Hendler wrote:
>>
>> >Please see my responses inline...
>> >
>> >Jim Hendler wrote:
>> >>
>> >> At 10:46 AM -0400 9/30/02, Jeff Heflin wrote:
>> >> >Jim,
>> >> >
>> >> >Thanks for the arguments in favor of and against proposal 2. I think it
>> >> >is important that all the pros and cons be identified and that we have a
>> >> >debate on this, so that the WG can truly make the best decision, whether
>> >> >that be in favor of proposal 1, proposal 2, or something as yet undetermined.
>> >> >
>> >> >That said, I'd like to discuss your points:
>> >> >
>> >> >Proposal #1 requires a new MIME type
>> >> >-------------------------------------
>> >> >I find this an interesting point. Does the W3C have any documentation
>> >> >that says when a new MIME type is required or recommended? On one hand, I
>> >> >don't see why we need a new one, because we are just using our own XML
>> >> >schema to describe the Ontology, imports, etc. tags. Thus, it would seem
>> >> >we could just use the XML MIME type. Certainly the W3C doesn't require
>> >> >a new MIME type for each schema? However, on the other hand, our
>> >> >language does have special semantics that most XML schemas don't have,
>> >> >and perhaps the MIME type is used to indicate to applications that they
>> >> >should process it in a different way. This makes sense, but then it
>> >> >seems to me OWL should have its own MIME type regardless.
>> >> >After all, we
>> >> >have a different semantics from RDF (even if it is just additive). So,
>> >> >it seems to me either both proposals or neither require the new MIME
>> >> >type, and I'm leaning toward both of them needing one.
>> >>
>> >> I'm not the expert on this stuff, I hope Dan Connolly or Massimo will
>> >> correct me if I'm wrong -- but I think going to a separate MIME type
>> >> would require much more motivation than this. If you insist OWL can
>> >> only be used through an XML schema, then I will point out this
>> >> disagrees with the f2f decisions taken by the WG. If you say no, we
>> >> want to be RDF parsable, then we need to go by RDF rules. I think
>> >> the location of our metadata is not such an important issue that it
>> >> is worth reopening the decisions (that's my opinion).
>> >
>> >I will await Dan's or Massimo's response on the issue of MIME types.
>> >
>> >As for proposal #1, it does play by the RDF rules. As we've discussed,
>> >even the RDF syntax documents say it is perfectly okay for RDF to be
>> >embedded in another XML document. Thus, I do not see this as going
>> >against the WG's prior decision regarding RDF. I also do not think this
>> >is just an issue of "where the metadata goes"; I think it is a critical
>> >issue about what can and cannot be expressed in RDF.
>>
>> Jeff - it may seem like I'm quibbling - but it is important. If we put
>> the RDF in an XML document, then it is no longer an "RDF/XML" document
>> (by definition) - it is an XML document. This violates the decision
>> made by the WG earlier that our transfer protocol will be RDF/XML,
>> and that XML will be non-normative. I think reopening that issue is
>> problematic at this late date, and the above would require the WG to
>> rethink earlier decisions, which is a problem to me given we have an
>> alternative.
>> >> >Proposal 1 would require RDF tools to read a document twice to get
>> >> >import information
>> >> >--------------------------------------------------------------------
>> >> >Any tools that care about import information would use an XML parser to
>> >> >extract it, and then pass the RDF subtree of the document to an RDF
>> >> >parser. There's absolutely no reason to read the document twice. If the
>> >> >application is a plain-old RDF application that doesn't realize this,
>> >> >then it will never have heard of imports in the first place and won't
>> >> >care about imports information.
>> >>
>> >> exactly - so maybe I'm now in favor of your solution, because it means
>> >> most people won't care about imports.
>> >
>> >That's a risk I'd be willing to take. I think what the OWL specs say and
>> >what the OWL tools do will be more important to users than what some old
>> >RDF tools do. So can we just settle this and go with Proposal #1? ;-) I
>> >guess not, huh.
>>
>> how well Jeff knows me :->
>>
>> >> >I can see that there is a slight cost in tools, because now all of your
>> >> >RDF tools need an extra 10 lines of code to write out proper OWL, but I
>> >> >think that cost is negligible, because the OWL tools that users will
>> >> >find easiest are those that have some built-in support for OWL. That is,
>> >> >ontologies will be a central aspect (as opposed to just another class),
>> >> >the "parseType:Collection" ugliness will be handled automatically for
>> >> >you, etc.
>> >> >In other words, in order for OWL to succeed, there will have to be
>> >> >OWL-specific tools anyway.
>> >>
>> >> well, many current DAML tools, including all of my group's open
>> >> source stuff, are RDF tools to which some OWL has been added - the goal
>> >> is to get to at least OWL-Lite for all of them.
>> >> So far we have been
>> >> able to do this easily - changing the initial parsing, rather than
>> >> the interpretation of the graph as we process it, would be a bigger
>> >> change than you imply. I agree there will be some OWL-specific
>> >> tools, but if all OWL use requires OWL-specific tools, then I fear it
>> >> will never catch on -- in fact, the great success of your SHOE system
>> >> in gaining acceptance was your recognition that it was important to
>> >> keep it so HTML tools interoperated with it -- otherwise it would
>> >> have just been some KR language on the web and gotten much the
>> >> same attention as several others that didn't get to the starting
>> >> gate. OWL will gain immediate penetration from playing nice with
>> >> RSS, RDF-XMP (Adobe's metadata system), and other existing RDF tools,
>> >> and I fear that the import thing, being only somewhat defined as it
>> >> is, is not worth breaking this over.
>> >
>> >Look, if RDF had the penetration of XML, then I might agree with you. In
>> >that case, maximum compatibility with existing RDF tools would be a
>> >critical issue. But the fact is, when compared to XML, RDF is barely
>> >even on the radar screen. If RDF really takes off, it will be because
>> >people want to use OWL, and not the other way around.
>> >
>> >Still, I will grant you that there is some RDF data out there that it
>> >might be nice to bring into the OWL world, and this could be seen as a
>> >minor con against proposal #1. However, if we went with that proposal,
>> >we could develop an appendix about how to work with plain-old RDF data.
>> >For example, we might say that you assume that any schema is an ontology
>> >and that all RDF documents import the schemas whose namespaces they use.
>> >
>> >BTW, your SHOE comparison doesn't actually work. Sure, HTML tools could
>> >read SHOE pages without being adversely affected, and the same would go
>> >for RDF tools reading OWL pages under proposal #1.
>> >In SHOE, I had to
>> >create a whole suite of tools to do anything useful with the language,
>> >and I certainly didn't get anything out of HTML pages that weren't
>> >already marked up with SHOE.
>>
>> okay, the analogy wasn't great - but I think you underestimate RDF use -
>> however, that's an external issue. I think the important thing to me
>> is that the ontologies and instances we create in OWL get pulled into
>> a graph that can be manipulated separately from using a reasoner on
>> the ontology per se. We have LOTS of examples of this sort of OWL
>> tool growing - and thus whether we say "RDF" per se, or RDF-like
>> graphs, is indeed quibbling - the idea of saying you have to
>> reprocess graphs to use them in OWL makes me nervous indeed.
>>
>> >> >If people think it is important for RDF to process imports information,
>> >> >then I suggest we ask RDF Core to consider extending RDF to handle it
>> >> >correctly. This could be done by first allowing RDF to make statements
>> >> >about graphs (as opposed to about a resource that we pretend represents
>> >> >a document or a graph), and then adding an official imports property
>> >> >that has a new thing called Graph as its domain and range. We would
>> >> >also need a way to give an identifier to a graph, which could probably
>> >> >be done by adding an ID attribute to the <rdf:RDF> tag.
>> >>
>> >> I suspect we could discuss with RDF Core the idea of there being
>> >> something in the <rdf:RDF> tag to help - perhaps an "RDF-Profile" or a
>> >> pointer to an (optional) "RDF Header" graph - if it appears we need
>> >> such mechanisms, I'll be happy to bring that up in the SWCG - I'm not
>> >> yet convinced we couldn't do this with RDF as is.
>> >
>> >Such an approach may alleviate many of my concerns with proposal #2. If
>> >the group is leaning strongly that way, we may wish to investigate this
>> >further.
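A rough sketch of what such an extension might look like - note that both the `ID` attribute on `<rdf:RDF>` and an `rdf:imports` property are hypothetical here, not part of RDF as it stands:

```xml
<!-- Hypothetical syntax: neither an ID attribute on rdf:RDF nor an
     official rdf:imports property exists in current RDF; this only
     illustrates the "statements about graphs" idea discussed above. -->
<rdf:RDF ID="myGraph"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!-- a statement about the graph itself: this graph imports another -->
  <rdf:Description rdf:about="#myGraph">
    <rdf:imports rdf:resource="http://example.org/other-ontology"/>
  </rdf:Description>
</rdf:RDF>
```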
>>
>> We can discuss this at the f2f - I'm hoping Dave Beckett will join us
>> for at least the social events, and we (either you and I, or the WG)
>> might wish to discuss this with him.
>>
>> >> >Proposal #1 can't have instances and classes in the same document
>> >> >------------------------------------------------------------------
>> >> >Not necessarily. Although my proposal said "class and property
>> >> >definitions go here," there is no reason why instance statements
>> >> >couldn't go in the same place, particularly if the instances were
>> >> >important to the ontology. I don't follow your argument about having to
>> >> >import the instances if you import the ontology. Why wouldn't you want
>> >> >to? If someone decides the instances are part of the ontology, then when
>> >> >you import it, you should import the whole ontology. Note that in
>> >> >proposal #2 the same thing is true, because there an import means you
>> >> >import the whole document. Thus, if you had classes and instances in
>> >> >the document, you would import both as well.
>> >>
>> >> OK, so how do I have instances in a separate document? Can it do an
>> >> import? If not, how do I know what semantics to impart? If yes, are
>> >> you saying they must be in an ontology definition? That would
>> >> definitely break a lot of tools that output RDF as triples - not as
>> >> XML documents.
>> >
>> >If you look at Proposal #1, it does require an additional <owl:Data> tag
>> >around RDF instances so that you can specify the imports information. So
>> >you can either put the instances in an ontology (if they belong there)
>> >or in a separate document. Now, this does mean that existing RDF
>> >documents would not be valid OWL, which I admitted above is an argument
>> >against proposal #1, but as I said there, I think this can be mitigated
>> >by saying how RDF data could be used in OWL.
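For concreteness, here is roughly the shape Proposal #1 seems to describe - the exact element names and namespace are assumptions drawn from the discussion, not settled syntax:

```xml
<!-- Assumed Proposal #1 shape: imports are declared in an XML wrapper,
     outside the RDF graph itself; element names are illustrative. -->
<owl:Data xmlns:owl="http://www.w3.org/2002/07/owl#"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <owl:imports>http://example.org/base-ontology</owl:imports>
  <rdf:RDF>
    <!-- ordinary RDF instance data; any RDF parser can read this subtree -->
    <rdf:Description rdf:about="http://example.org/data#JoeSmith"/>
  </rdf:RDF>
</owl:Data>
```

Under this shape, a plain RDF tool handed the whole document sees only XML, which is the crux of Jim's objection; an OWL-aware tool would peel off the wrapper and pass the `rdf:RDF` subtree to an RDF parser.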
>>
>> yes, but again I think this is moving backwards against previous WG
>> decisions.
>>
>> >As for tools outputting RDF as triples, I can't really speak to that
>> >without knowing what tools you're talking about. I admit I haven't used
>> >many RDF tools, but I think most RDF parsers have something like an
>> >RdfGraph object or data structure that contains the triples. It should
>> >be easy enough to subclass this with (or embed it in) an OwlGraph
>> >object/data structure which has methods for retrieving imports
>> >information.
>>
>> Jeff - we take RDF stuff (RDF, RDFS, OWL) and run it through the W3C
>> validator to turn it into N-Triples, which are read into PARKA and
>> accessible over the web (with Parka inferencing) - I don't see any way
>> to do this, given the above, without having to build new tools or build
>> complex processes. By the way, one of the things we are playing with
>> grabbing this way is the results of an Expose-based crawler, so we
>> want to work on large numbers of triples with no preprocessing.
>>
>> >> >Proposal #2 will make it easier to convert DAML+OIL to OWL
>> >> >------------------------------------------------------------
>> >> >This might be true to some extent, because I believe that as it stands
>> >> >now, the conversion is simply a series of find-and-replaces, so you
>> >> >could do it all with a simple Unix script. However, I do not believe
>> >> >that proposal #1 would require you to save a temporary file in order
>> >> >to do the conversion. In the worst case, you'd have to do two reads of
>> >> >the DAML+OIL document: one to collect up the ontology and imports
>> >> >information, and one to create the OWL document with it in the right
>> >> >place. However, since the DAML+OIL convention is to put the ontology
>> >> >stuff at the top, I think a one-pass program would work in most if not
>> >> >all cases.
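To make the find-and-replace idea concrete, here is a toy sketch - the mapping table is a small illustrative subset, nowhere near a complete DAML+OIL-to-OWL converter:

```python
# Toy illustration of a find-and-replace DAML+OIL -> OWL conversion.
# The table below is an illustrative subset, not an authoritative mapping.
DAML_TO_OWL = {
    "http://www.daml.org/2001/03/daml+oil#": "http://www.w3.org/2002/07/owl#",
    "daml:Class": "owl:Class",
    "daml:ObjectProperty": "owl:ObjectProperty",
}

def convert(document: str) -> str:
    """Apply each textual substitution in turn to the whole document."""
    for old, new in DAML_TO_OWL.items():
        document = document.replace(old, new)
    return document

print(convert('<daml:Class rdf:ID="Person"/>'))
# -> <owl:Class rdf:ID="Person"/>
```

A real converter would also have to relocate the ontology/imports block, which is exactly where the one-pass versus two-pass question above comes in.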
>> >> >Even so, since conversion tools only need to be used once per
>> >> >document, the two-pass algorithm isn't that expensive.
>> >>
>> >> well, we use a simple Perl script to import all sorts of things into
>> >> OWL, and we also have a Python front end for RIC so we can read N3, and
>> >> an RDF-triples-to-Parka converter and a few other things that would
>> >> have to be rewritten to either create documents or to do explicit
>> >> imports -- but all those could be rewritten, or we could ignore
>> >> imports...
>> >
>> >I don't see what the problem is. OWL is an evolving language. Once it is
>> >set in stone, we'll all have to modify our tools to work with it,
>> >regardless of whether or not it has imports.
>>
>> I agree that this was a minor issue - it's why I had it in a "p.s."
>> instead of in the main body of my first response - I agree to ignore
>> this as an important issue about OWL for now...
>>
>> >> >I look forward to your counter-arguments. I think this is a very useful
>> >> >and important discussion.
>> >>
>> >> The real problem is I think you've missed the key argument - not
>> >> having the imports statement in the graph means we cannot have
>> >> non-document-based tools for handling OWL unless they ignore imports
>> >> (which is actually okay with me). If we say the owl:ontology
>> >> statements do go in the graph, then we can put them there. If we say
>> >> they don't, then we lose interoperability. Your approach cannot have
>> >> it both ways -- you think you can because you're starting from the
>> >> assumption that everything lives in documents -- but that isn't true
>> >> - once my crawler grabs stuff and pulls it into ParkaSW, for example,
>> >> all we keep around are the triples (including an extra one with a
>> >> pointer back to an original document if we started from one).
>> >> Mike Dean does the same with his DAML crawler [1]
>> >> (5,000,000 DAML assertions found so far on 20k+ pages) -- the
>> >> assertions go into his DAML DB, and thus you could not query for
>> >> imports statements once things were in the graph -- unless he puts
>> >> them there, in which case why don't we just do it in the first place?
>> >
>> >No, I don't think I missed the argument. I just have a different idea of
>> >what it means to parse an OWL document. You think the only result is a
>> >set of RDF triples. I think the result is a data structure which
>> >consists mostly of RDF triples, but there might be a few extra
>> >components as well (such as imports information or your pointer back to
>> >the original document). If you are storing this in a database (or Parka,
>> >for that matter), then you might have one or more tables for storing the
>> >RDF triples and then another "meta" table for the imports information
>> >(this is basically what I did with SHOE in Parka). If you then want to
>> >exchange this information with another application, then you should
>> >either use the OWL presentation syntax or define custom data structures
>> >and/or file formats that preserve all relevant aspects of the language,
>> >whatever they may be.
>>
>> i.e. we should lose interoperability, defeating the whole purpose of
>> moving towards standards
>>
>> >Look, I'm really not trying to be difficult here. I'm not on some
>> >anti-RDF crusade, although I'll admit I'm not a big fan of the language.
>> >In [1], I listed what I consider to be a number of problems with
>> >proposal #2. I haven't heard anyone address any of those concerns. If
>> >these were addressed satisfactorily (perhaps by an alternate proposal),
>> >then I would be happy to endorse that approach.
>>
>> fair enough, but I thought those were valid disadvantages, just not
>> as important as the disadvantages of proposal 1.
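The parse-result data structure Jeff describes might be sketched like this - the class names `RdfGraph` and `OwlGraph` are illustrative, not taken from any actual RDF toolkit:

```python
# Illustrative sketch only: RdfGraph/OwlGraph are hypothetical names,
# not classes from any particular RDF library.

class RdfGraph:
    """A bare-bones triple store: a set of (subject, predicate, object)."""
    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))


class OwlGraph(RdfGraph):
    """An RDF graph plus the extra 'meta' components kept outside the
    triples: imports info and a pointer back to the source document."""
    def __init__(self, source_document=None):
        super().__init__()
        self.imports = []              # ontology URIs this graph imports
        self.source_document = source_document

    def add_import(self, uri):
        self.imports.append(uri)


g = OwlGraph(source_document="http://example.org/doc.owl")
g.add("ex:JoeSmith", "rdf:type", "ex:Person")
g.add_import("http://example.org/base-ontology")
```

The point of contention above is visible in the sketch: `g.imports` lives beside the triples, not in them, so a tool that exchanges only `g.triples` (as a crawler feeding a triple store would) loses the imports information.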
>>
>> >Jeff
>> >
>> >[1] http://lists.w3.org/Archives/Public/www-webont-wg/2002Sep/0473.html
>>
>> --
>> Professor James Hendler                           hendler@cs.umd.edu
>> Director, Semantic Web and Agent Technologies     301-405-2696
>> Maryland Information and Network Dynamics Lab.    301-405-6707 (Fax)
>> Univ of Maryland, College Park, MD 20742          240-731-3822 (Cell)
>> http://www.cs.umd.edu/users/hendler

--
Professor James Hendler                           hendler@cs.umd.edu
Director, Semantic Web and Agent Technologies     301-405-2696
Maryland Information and Network Dynamics Lab.    301-405-6707 (Fax)
Univ of Maryland, College Park, MD 20742          240-731-3822 (Cell)
http://www.cs.umd.edu/users/hendler
Received on Wednesday, 2 October 2002 12:51:01 UTC