- From: ben syverson <w3@likn.org>
- Date: Sat, 5 Mar 2005 00:23:36 -0600
- To: semantic-web@w3.org
- Cc: w3@likn.org
Hello,

The last time I worked with metadata on the web was with MCF files almost ten years ago, but now I'm very anxious to dive back in. To that end, I've developed a nice juicy project to work on to help me sort through and understand all the issues involved. The project is named "likn," which is a sort of head-on collision between "liken" and "link."

The project itself is a wiki-ish system which is syndicated in a zillion different ways, and which collects and maintains a boatload of metadata with an associated dynamic ontology. More specifically, it will be an open-source mod_perl application which supports "solo" posts and public wiki-like documents, and an associated chatbot which asks and answers questions about nodes and their relationships. The system outputs XHTML, RSS, RDF and OWL descriptions of the data and relationships contained in it. Every node is syndicated, so if a node is a Class, its RSS will reflect any new sub-classes or instances (e.g., if you subscribe to "citrus," you'll get a notification any time the node is edited or replied to, as well as when someone adds "mandarin orange" and classifies it as a type of citrus).

Because likn will be generating vast amounts of metadata and building ontological information on the fly, I want to make sure it will have a very positive ecological impact in terms of the SW. In that vein, there are several things that I have immediate questions about. Please bear with me if my questions are naive...

1) Each installation of the software will be building its own ontology as more information is added. The chatbot recognizes and happily digests statements such as "a person can only have one mother." Thus, the site's ontology is not fixed and carefully crafted, but public, not fully trustworthy, and ever evolving, which is, in my opinion, The Way It Should Be.
The problem is that I'm concerned that this might violate the spirit of OWL; it's my understanding that OWL ontologies are meant to be stable, versioned and reusable, in the hopes that people will share or merge standard versions of them. It's of course possible to share and merge a dynamic ontology, but it must be done with the understanding that the constraints and statements made are suspect and in flux, and ideally the reasoner should be able to understand how often it should check for new versions (either through something like sy:updateFrequency or through its own cache rules and a "Last-Modified" field). Because eventually, someone is likely to tell likn "a person can have more than one mother, but only one birth mother."

(1.a) One workaround is to describe the constraints and relationship types in plain RDF and not use OWL at all. But then I'm using a non-standard and homebrew method of describing the ontology, when the whole point is to facilitate interchange.

2) Does anyone have any philosophical objections to using OWL Full to liberally allow Classes as Property Values? I read <http://www.w3.org/TR/swbp-classes-as-values/> with great interest, and would like to allow many relationships to form using the model described in Approach 1. I want to be able to preserve the ability to have the following exchange, without resorting to hackery such as intermediary nodes like "LionSubject":

me: Lions: Life in the Pride's subject is Lions.
likn: I assume you mean its subject is 'Lion?'
me: Yup. Now tell me about lion.
likn: Lion is a type of Animal, and is the subject of the book 'Lions: Life in the Pride.'
....

In short, is there any good reason to explicitly separate Classes from Property Values, when it makes so much sense not to?

3) There's the obvious issue of duplication -- one of the most attractive aspects of a shared ontology is that you don't have to repeat someone else's work, but that's exactly what likn asks its users to do.
Someone may have developed a beautiful ontology to describe food, but because a likn installation may serve a community with its own definitions of the same terms and their relationships, we can't directly use other ontologies. Within an installation, likn is an open, free-linking system, but to the outside world, it's a "Push" provider of data. You can utilize a likn ontology outside of likn, but it would only really be useful for examining data from that particular likn colony -- you wouldn't want your own application to rely on its description of "star wars," for example, for fear that its definition could change from the movie to the Reagan proposal. So at first blush, publishing likn ontologies seems useless to anyone -- but then I can imagine a third party developing (for example) a really amazing OWL-based search engine, which could be very useful for finding things in likn colonies.

4) One possibility is to allow the recognition/merging of other ontologies, but qualify their use within likn. For instance:

me: tell me about dog
likn: 'dog' is a type of animal, but according to AnimalNet, dog is a type of 'mammal.'

Which is all well and good, but what if you want to create equivalences? If you want to say that our 'dog' is equal to AnimalNet's 'dog,' now anyone asking about dog gets something like:

likn: 'dog' is a type of animal and a type of mammal.
me: what's a mammal?
likn: I don't know, but according to AnimalNet, mammal is a type of animal.

Now we have two rival definitions of 'animal'.
Although likn could be smart enough to ignore redundant statements (given the two statements "ben is an instance of programmer" and "ben is an instance of person," likn will favor the more specific type of person), it can't (or shouldn't) automatically infer that AnimalNet's 'animal' is equivalent to our 'animal,' because our likn colony could in fact be a Muppet fansite, and 'Animal' could talk very specifically about the character of the same name (although in that case, no one would assert that 'dog' is a type of animal). So things get very confusing and messy. Is there a good/established/proposed way of handling this? Possibly through reification?

5) One aspect of the app is that users can vote on assertions. So if three people agree that "ben is an instance of person" and one person disagrees, likn is 75% sure that ben is a person. Is it best to just do this via a reified statement such as the following?

<rdf:Description>
  <rdf:subject rdf:resource="http://likn.org/dog" />
  <rdf:predicate rdf:resource="http://likn.org/footType" />
  <rdf:object rdf:resource="http://likn.org/paw" />
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
  <likn:confidence>75</likn:confidence>
</rdf:Description>

6) Does anyone have any input, guidance or problems with my general approach, or specific aspects?

Anyway, thanks in advance -- and hello!

- ben syverson
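
P.S. To make question 5 concrete, here is a minimal sketch of how a vote tally might be turned into a reified statement like the one above. It is in Python purely for illustration (likn itself is planned as a mod_perl application), and the function names, the `likn:` namespace URI and the rounding rule are all assumptions, not a settled design.

```python
from xml.sax.saxutils import quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LIKN_NS = "http://likn.org/"  # assumed namespace URI for likn:confidence


def confidence(agree: int, disagree: int) -> int:
    """Share of voters who agree, as a percentage (Python's built-in round())."""
    total = agree + disagree
    if total == 0:
        raise ValueError("no votes cast on this assertion")
    return round(100 * agree / total)


def reified_statement(subject: str, predicate: str, obj: str,
                      agree: int, disagree: int) -> str:
    """Serialize one voted-on assertion as a reified RDF/XML statement."""
    pct = confidence(agree, disagree)
    return "\n".join([
        f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:likn="{LIKN_NS}">',
        "  <rdf:Description>",
        f"    <rdf:subject rdf:resource={quoteattr(subject)} />",
        f"    <rdf:predicate rdf:resource={quoteattr(predicate)} />",
        f"    <rdf:object rdf:resource={quoteattr(obj)} />",
        f'    <rdf:type rdf:resource="{RDF_NS}Statement" />',
        f"    <likn:confidence>{pct}</likn:confidence>",
        "  </rdf:Description>",
        "</rdf:RDF>",
    ])


if __name__ == "__main__":
    # Three people agree, one disagrees: confidence is 75.
    print(reified_statement(
        "http://likn.org/dog",
        "http://likn.org/footType",
        "http://likn.org/paw",
        agree=3, disagree=1,
    ))
```

A bare percentage throws away how many people voted, so a real version would probably want to publish the raw counts as well.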
Received on Saturday, 5 March 2005 08:14:05 UTC