- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Sat, 5 Mar 2005 14:01:51 +0100
- To: ben syverson <w3@likn.org>
- Cc: semantic-web@w3.org
On Sat, 5 Mar 2005 00:23:36 -0600, ben syverson <w3@likn.org> wrote:

> The project itself is sort of a wiki-ish system which is syndicated in a zillion different ways, and which collects and maintains a boatload of metadata with an associated dynamic ontology. More specifically, it will be an open-source mod_perl application which supports "solo" posts and public wiki-like documents, and an associated chatbot which asks and answers questions about nodes and their relationships.

Sounds great! There are a few bots around #swig that may be of interest, like julie. (Incidentally the sembot thing is another resident lurker on my own to-do list).

> The system outputs XHTML, RSS, RDF and OWL descriptions of the data and relationships contained in it. Every node is syndicated, so if a node is a Class, its RSS will reflect any new sub-classes or instances (eg, if you subscribe to "citrus," you'll get notification any time the node is edited or replied to, as well as when someone adds "mandarin orange" and classifies it as a type of citrus).

Like Phil, I'll be interested to hear how you intend to do this.

> Because likn will be generating vast amounts of metadata and building ontological information on the fly, I want to make sure it will have a very positive ecological impact in terms of the SW. In that vein, there are several things that I have immediate questions about. Please bear with me if my questions are naive...
>
> 1) Each installation of the software will be building its own ontology as more information is added. The chatbot recognizes and happily digests statements such as "a person can only have one mother."

Nice.

> Thus, the site's ontology is not fixed and carefully crafted, but public, not fully trustworthy, and ever evolving, which is, in my opinion, The Way It Should Be.

For an application like this, agreed 100%. The ecosystem should support a whole spectrum of trustworthiness, including bits related to ontologies.
> The problem is that I'm concerned that this might violate the spirit of OWL; it's my understanding that OWL ontologies are meant to be stable, versioned and reusable, in the hopes that people will share or merge standard versions of them.

Hmm, I think there are at least two angles to that - if something's reasonably stable per-version then it's going to be properly reusable. But that doesn't mean it has to be totally rigid over time; in particular, more statements relating to the ontology may be added in just the same way that statements relating to instance data may be added.

Again I think there's a bit of app-specificity in that: for some purposes rigidity is desirable to maintain some kind of correctness (like avoiding the apparently common confusion between the terms "traveller" and "terrorist"); at other times a mistake may just misplace something in someone's blog index (and not land anyone in jail). Whatever, it's probably worth considering whether some kind of proof mechanism can be used, so if you do wind up with unexpected conclusions you can backtrack to their source.

I'm beginning to get the feeling that SemWeb development is a little hampered by assumptions from other languages, especially those around XML & relational DB schema. Much of the power of RDF/OWL comes from the flexibility, and we shouldn't be frightened of (for example) creating many and/or huge ontologies and only using a tiny fraction of the available terms. Assuming the modelling could go either way, putting something in the Class system rather than using instances (or even literals) means there's more potential for reasoning. Classes and properties are cheap!

In other words, if I'm not entirely happy with http://purl.org/stuff/pets#Cat then I shouldn't hesitate to define a new term, say http://dannyayers.com/2005/05/pets#Cat. I could maintain desirable semantics by making the new term a subclass of the old one, or whatever.
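To make the subclassing idea concrete, here's a minimal sketch in plain Python. It assumes triples are held as simple (subject, predicate, object) tuples rather than in any particular RDF toolkit, and the "tiddles" instance URI and the function names are made up purely for illustration. It only walks rdfs:subClassOf links, so it's nowhere near a real reasoner.

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library assumed.
SUBCLASS_OF = "rdfs:subClassOf"
RDF_TYPE = "rdf:type"

OLD_CAT = "http://purl.org/stuff/pets#Cat"
NEW_CAT = "http://dannyayers.com/2005/05/pets#Cat"
TIDDLES = "http://example.org/cats#tiddles"  # hypothetical instance URI

triples = {
    (NEW_CAT, SUBCLASS_OF, OLD_CAT),  # the new term hangs off the old one
    (TIDDLES, RDF_TYPE, NEW_CAT),     # data is described with the new term only
}


def superclasses(cls, graph):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(instance, graph):
    """Return the asserted types of instance plus all their superclasses."""
    direct = {o for s, p, o in graph if s == instance and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return inferred
```

Calling types_of(TIDDLES, triples) gives back both Cat terms, so anything that only knows the purl.org vocabulary can still find the instance, at the cost of one extra inference step.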
If I don't need to make any modifications, ok, I've got a duplicated term. But this is still reusing the existing ontology, and the cost shouldn't be too great. It may take more convoluted inference to get answers, but I reckon that flexibility to facilitate model "fitness" is more important than religious direct reuse and/or bending things for the sake of performance - that seems like premature optimisation.

Until fairly recently I would have encouraged the direct-reuse approach to vocabularies, but there are definitely circumstances where this doesn't go down too well and might even be counter-productive. (Case in point being Atom - overall there's been a great desire to create things from scratch, even if that meant a lot of wheel reinvention. But there's no real net loss, as relationships with other vocabularies can be identified later, independently).

> It's of course possible to share and merge a dynamic ontology, but it must be done with the understanding that the constraints and statements made are suspect and in-flux, and ideally the reasoner should be able to understand how often it should check for new versions (either through something like sy:updateFrequency or through its own cache rules and a "Last-Modified" field). Because eventually, someone is likely to tell likn "a person can have more than one mother, but only one birth mother."
>
> (1.a) One workaround is to describe the constraints and relationship types in plain RDF and not use OWL at all. But then I'm using a non-standard and homebrew method of describing the ontology, when the whole point is to facilitate interchange.

In the context of RDF/OWL, I don't think homebrew and interchange are mutually exclusive - if anything, hopefully they'll be complementary (my homebrew can connect to your homebrew, thus connecting the interwiki upper ontology you reference to the standard furry quadruped vocabulary I use...).
> 2) Does anyone have any philosophical objections to using OWL Full to liberally allow Classes as Property Values? I read <http://www.w3.org/TR/swbp-classes-as-values/> with great interest, and would like to allow many relationships to form using the model described in Approach 1. I want to be able to preserve the ability to have the following exchange, without resorting to hackery such as intermediary nodes like "LionSubject":
>
> me: Lions: Life in the Pride's subject is Lions.
> likn: I assume you mean its subject is 'Lion?'
> me: Yup. Now tell me about lion.
> likn: Lion is a type of Animal, and is the subject of the book 'Lions: Life in the Pride.'
> ....
>
> In short, is there any good reason to explicitly separate Classes from Property Values, when it makes so much sense not to?

Wow, I got deja vu on that one, I must have asked the same question myself in the recent past. It's not very explicit in that doc, but all things being equal it's not so much a philosophical question as a computational one. If you start treating classes as individuals then it makes inference that much more complex. This may not be a problem if you're just using the graph model aspect of RDF, or wiring up your own app-specific reasoner, but if you're wanting to plug in an off-the-shelf DL engine then you'll have to abide by the language's constraints.

> 3) There's the obvious issue of duplication -- one of the most attractive aspects of a shared ontology is that you don't have to repeat someone else's work, but that's exactly what likn asks its users to do. Someone may have developed a beautiful ontology to describe food, but because a likn installation may service a community with its own definitions of the same terms and their relationships, we can't directly use other ontologies. Within an installation, likn is an open, free-linking system, but to the outside world, it's a "Push" provider of data.
> You can utilize a likn ontology outside of likn, but it would only really be useful for examining data from that particular likn colony -- you wouldn't want to rely in your own application on its description of "star wars," for example, for fear that its definition could change from the movie to the Reagan proposal. So at first blush, publishing likn ontologies seems useless to anyone -- but then I can imagine a third party developing (for example) a really amazing OWL-based search engine, which could be very useful for finding things in likn colonies.

This all sounds reasonable, but I would suggest that in a situation like this a little indirection is probably desirable. The approach I've ended up using with similar stuff is to partition up the vocab/ontology space. So a certain vocabulary may contain more generally shared definitions (e.g. WordNet) but then I might have corresponding terms in the vocabulary I'm using in the context of a particular project, or even in the context of my personal blog. If each of these is maintained in a separate namespace, then there's much more flexibility for interconnection.

I may begin by asserting that "Pet" on my blog is an equivalent class to "Pet" in WN. But then my usage may drift, and so I can shift a gear to make raw:Pet more or less general than wn:Pet (i.e. raw:Pet rdfs:subClassOf wn:Pet or wn:Pet rdfs:subClassOf raw:Pet, rather than the bidirectional relationship of equivalence). Ok, this assumes that the Sem Web won't remember the first assertion, but at this point in time that seems a fair pragmatic assumption, and when you're looking at local reasoning that's pretty easy to arrange.

I suppose the ontologies should be versioned and annotated as cleanly as possible, but until you need to hook into other caches/triplestores which remember your earlier assertions on the Web there shouldn't be too many problems (datestamped annotations are probably a good idea).
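As a rough illustration of that gear shift, here's a small plain-Python sketch. It assumes each statement is kept locally as a (subject, predicate, object, date asserted) tuple, treats owl:equivalentClass as subclassing in both directions, and swaps one version of the local graph for another. The date on the second version and the function names are invented for the example.

```python
from datetime import date

EQUIVALENT = "owl:equivalentClass"
SUBCLASS_OF = "rdfs:subClassOf"


def subclass_edges(statements):
    """Expand dated statements into directed (sub, super) pairs."""
    edges = set()
    for s, p, o, _asserted_on in statements:
        if p == SUBCLASS_OF:
            edges.add((s, o))
        elif p == EQUIVALENT:
            # Equivalence behaves like subclassing in both directions.
            edges.add((s, o))
            edges.add((o, s))
    return edges


def is_subclass(sub, sup, statements):
    """True if sup is reachable from sub over the expanded subclass edges."""
    edges = subclass_edges(statements)
    seen, stack = set(), [sub]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return sup in seen


# Version 1 of the local graph: the blog term is equivalent to the WordNet term.
v1 = [("raw:Pet", EQUIVALENT, "wn:Pet", date(2005, 3, 5))]

# Version 2 (hypothetical later date): usage has drifted, raw:Pet is now narrower.
v2 = [("raw:Pet", SUBCLASS_OF, "wn:Pet", date(2005, 6, 1))]
```

With v1, is_subclass works in both directions between raw:Pet and wn:Pet; with v2 it only holds from raw:Pet to wn:Pet. The datestamp plays no part in the inference here, it's just carried along so that a later merge could tell which assertion is the more recent.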
There's additional help when it comes to more, errm, humanish terminology (great for blogs and Wikis) in the form of SKOS, which allows for less tightly-bound relationships between terms without throwing away all the reasoning potential. I reckon MortenF's FOAF Output Plugin for WordPress is a real shining light here, gluing simple tagging "folksonomies" (yerch, horrid contraction) to formal knowledge representation, all without any extra user input beyond the simple install.

Just my €0.02. Anyhow be sure and let us know how you get on.

Cheers,
Danny.
Received on Saturday, 5 March 2005 13:01:56 UTC