- From: ben syverson <w3@likn.org>
- Date: Sat, 5 Mar 2005 00:23:36 -0600
- To: semantic-web@w3.org
- Cc: w3@likn.org
Hello,
The last time I worked with metadata on the web was with MCF files
almost ten years ago, but now I'm very anxious to dive back in. To that
end, I've developed a nice juicy project to work on to help me sort
through and understand all the issues involved. The project is named
"likn," which is a sort of head-on collision between "liken" and
"link."
The project itself is sort of a wiki-ish system which is syndicated in
a zillion different ways, and which collects and maintains a boatload
of metadata with an associated dynamic ontology. More specifically, it
will be an open-source mod_perl application which supports "solo" posts
and public wiki-like documents, and an associated chatbot which asks
and answers questions about nodes and their relationships. The system
outputs XHTML, RSS, RDF and OWL descriptions of the data and
relationships contained in it. Every node is syndicated, so if a node
is a Class, its RSS will reflect any new sub-classes or instances (e.g.,
if you subscribe to "citrus," you'll get a notification any time the node
is edited or replied to, as well as when someone adds "mandarin orange"
and classifies it as a type of citrus).
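As a rough illustration of the per-node syndication described above, here is a small Python sketch (not likn's actual mod_perl code). The event log, the event kinds and the feed_items function are all hypothetical names made up for the example.

```python
# Hypothetical event log: (event kind, node the event is about, detail).
EVENTS = [
    ("edited", "citrus", "description updated"),
    ("replied to", "citrus", "new reply posted"),
    ("subclass_added", "citrus", "mandarin orange"),
    ("edited", "apple", "typo fixed"),
]


def feed_items(node, events):
    """Return the feed entries a subscriber to `node` would see."""
    items = []
    for kind, target, detail in events:
        if target != node:
            continue  # events about other nodes are not in this feed
        if kind == "subclass_added":
            items.append(f"{detail} was classified as a type of {node}")
        else:
            items.append(f"{node} was {kind}: {detail}")
    return items


citrus_feed = feed_items("citrus", EVENTS)
```

The point is only that a Class node's feed mixes ordinary edit/reply events with ontology events such as a new sub-class.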
Because likn will be generating vast amounts of metadata and building
ontological information on the fly, I want to make sure it will have a
very positive ecological impact on the Semantic Web. In that vein, there
are several things that I have immediate questions about. Please bear
with me if my questions are naive...
1) Each installation of the software will be building its own ontology
as more information is added. The chatbot recognizes and happily
digests statements such as "a person can only have one mother." Thus,
the site's ontology is not fixed and carefully crafted, but public, not
fully trustworthy, and ever evolving, which is, in my opinion, The Way
It Should Be. The problem is that I'm concerned that this might violate
the spirit of OWL; it's my understanding that OWL ontologies are meant
to be stable, versioned and reusable, in the hopes that people will
share or merge standard versions of them. It's of course possible to
share and merge a dynamic ontology, but it must be done with the
understanding that the constraints and statements made are suspect and
in flux, and ideally the reasoner should be able to understand how
often it should check for new versions (either through something like
sy:updateFrequency or through its own cache rules and a "Last-Modified"
field). Because eventually, someone is likely to tell likn "a person
can have more than one mother, but only one birth mother."
(1.a) One workaround is to describe the constraints and
relationship types in plain RDF and not use OWL at all. But then I'm
using a non-standard and homebrew method of describing the ontology,
when the whole point is to facilitate interchange.
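To make the re-check idea in point 1 concrete, here is a minimal Python sketch of how a consumer might decide when to fetch a fresh copy of a dynamic ontology. It assumes the ontology advertises the RSS syndication module's sy:updatePeriod and sy:updateFrequency values; the function names next_check and is_stale are hypothetical, and the month/year lengths are approximations.

```python
from datetime import datetime, timedelta

# Period lengths for the sy:updatePeriod values of the RSS syndication module.
PERIODS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),   # approximation
    "yearly": timedelta(days=365),   # approximation
}


def next_check(last_fetch, update_period="daily", update_frequency=1):
    """Earliest time a consumer should re-fetch the ontology.

    update_frequency is the number of updates expected per period,
    so the interval between checks is period / frequency.
    """
    interval = PERIODS[update_period] / update_frequency
    return last_fetch + interval


def is_stale(last_fetch, now, update_period="daily", update_frequency=1):
    """True if the cached copy is due for a refresh at time `now`."""
    return now >= next_check(last_fetch, update_period, update_frequency)
```

For example, with update_period="hourly" and update_frequency=2, a copy fetched at midnight would be considered stale from 00:30 onward. A "Last-Modified" check could be layered on top so that a re-fetch only replaces the cached copy when something actually changed.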
2) Does anyone have any philosophical objections to using OWL Full to
liberally allow Classes as Property Values? I read
<http://www.w3.org/TR/swbp-classes-as-values/> with great interest, and
would like to allow many relationships to form using the model
described in Approach 1. I want to be able to preserve the ability to
have the following exchange, without resorting to hackery such as
intermediary nodes like "LionSubject":
me: Lions: Life in the Pride's subject is Lions.
likn: I assume you mean its subject is 'Lion?'
me: Yup. Now tell me about lion.
likn: Lion is a type of Animal, and is the subject of the book 'Lions:
Life in the Pride.'
....
In short, is there any good reason to explicitly separate Classes from
Property Values, when it makes so much sense not to?
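For what it's worth, the lion exchange in point 2 can be sketched in a few lines of Python with the class used directly as a property value and no intermediary node. This is only an illustration: the triples are plain tuples rather than real RDF, and the describe function and the choice of dc:subject as the predicate are assumptions made for the example.

```python
# Triples as plain (subject, predicate, object) tuples; names are illustrative.
TRIPLES = [
    ("Lion", "rdfs:subClassOf", "Animal"),
    ("Lions: Life in the Pride", "rdf:type", "Book"),
    # The class Lion is used directly as the value of dc:subject.
    ("Lions: Life in the Pride", "dc:subject", "Lion"),
]


def describe(node, triples):
    """Answer 'tell me about <node>' from both of its roles."""
    parents = [o for s, p, o in triples
               if s == node and p == "rdfs:subClassOf"]
    subject_of = [s for s, p, o in triples
                  if o == node and p == "dc:subject"]
    parts = []
    if parents:
        parts.append("is a type of " + " and ".join(parents))
    if subject_of:
        quoted = [f"'{s}'" for s in subject_of]
        parts.append("is the subject of " + " and ".join(quoted))
    if not parts:
        return f"I don't know about {node}."
    return f"{node} " + ", and ".join(parts) + "."
```

Because "Lion" is a single node, the same lookup finds it both as a class (via rdfs:subClassOf) and as a property value (via dc:subject).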
3) There's the obvious issue of duplication -- one of the most
attractive aspects of a shared ontology is that you don't have to
repeat someone else's work, but that's exactly what likn asks its users
to do. Someone may have developed a beautiful ontology to describe
food, but because a likn installation may service a community with its
own definitions of the same terms and their relationships, we can't
directly use other ontologies. Within an installation, likn is an open,
free-linking system, but to the outside world, it's a "Push" provider
of data. You can utilize a likn ontology outside of likn, but it would
only really be useful for examining data from that particular likn
colony -- you wouldn't want to rely in your own application on its
description of "star wars," for example, for fear that its definition
could change from the movie to the Reagan proposal. So at first blush,
publishing likn ontologies seems useless to anyone -- but then I can
imagine a third party developing (for example) a really amazing
OWL-based search engine, which could be very useful for finding things
in likn colonies.
4) One possibility is to allow the recognition/merging of other
ontologies, but qualify their use within likn. For instance:
me: tell me about dog
likn: 'dog' is a type of animal, but according to AnimalNet, dog is a
type of 'mammal.'
Which is all well and good, but what if you want to create
equivalences? If you want to say that our 'dog' is equal to AnimalNet's
'dog,' now anyone asking about dog gets something like:
likn: 'dog' is a type of animal and a type of mammal.
me: what's a mammal?
likn: I don't know, but according to AnimalNet, mammal is a type of
animal.
Now we have two rival definitions of 'animal'. Although likn could be smart
enough to ignore redundant statements (given the two statements "ben is
an instance of programmer" and "ben is an instance of person," likn
will favor the more specific type, programmer), it can't (or shouldn't)
automatically infer that AnimalNet's 'animal' is equivalent to our
'animal,' because our likn colony could in fact be a Muppet fansite,
and 'Animal' could refer very specifically to the character of the
same name (although in that case, no one would assert that 'dog' is a
type of animal). So things get very confusing and messy. Is there a
good/established/proposed way of handling this? Possibly through
reification?
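One way to picture the "qualified" use of outside ontologies in point 4 is to store a source alongside every statement, so that foreign claims are always attributed and never silently merged. The Python sketch below is just that picture, not an established answer to the question; the quad layout, the "type_of" predicate and the describe_with_sources function are hypothetical.

```python
LOCAL = "likn"

# (subject, predicate, object, source): the fourth slot records who said it.
QUADS = [
    ("dog", "type_of", "animal", LOCAL),
    ("dog", "type_of", "mammal", "AnimalNet"),
    ("mammal", "type_of", "animal", "AnimalNet"),
]


def describe_with_sources(term, quads):
    """Describe `term`, attributing every non-local statement to its source."""
    own = [o for s, p, o, src in quads
           if s == term and p == "type_of" and src == LOCAL]
    others = [(o, src) for s, p, o, src in quads
              if s == term and p == "type_of" and src != LOCAL]
    if own:
        clauses = [f"'{term}' is a type of " + " and ".join(own)]
    else:
        clauses = ["I don't know"]
    for obj, src in others:
        clauses.append(f"according to {src}, {term} is a type of '{obj}'")
    return ", but ".join(clauses) + "."
```

With this layout, AnimalNet's 'animal' and the colony's 'animal' stay distinct unless someone explicitly adds an equivalence statement, which would itself carry a source.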
5) One aspect of the app is that users can vote on assertions. So if
three people agree that "ben is an instance of person" and one person
disagrees, likn is 75% sure that ben is a person. Is it best to just
do this via a reified statement such as the following?
<rdf:Description>
  <rdf:subject rdf:resource="http://likn.org/dog" />
  <rdf:predicate rdf:resource="http://likn.org/footType" />
  <rdf:object rdf:resource="http://likn.org/paw" />
  <rdf:type
    rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
  <likn:confidence>75</likn:confidence>
</rdf:Description>
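The confidence value itself is simple arithmetic over the votes. A minimal Python sketch, assuming one vote per user and a plain percentage (the function name, the vote layout and the user names are hypothetical):

```python
def confidence(agree, disagree):
    """Percentage of voters who agree, or None if nobody has voted."""
    total = agree + disagree
    if total == 0:
        return None  # no votes yet, so no confidence can be stated
    return round(100 * agree / total)


# Hypothetical votes on "ben is an instance of person": user -> agrees?
votes = {"alice": True, "bob": True, "carol": True, "dave": False}
agree = sum(1 for v in votes.values() if v)
disagree = len(votes) - agree
score = confidence(agree, disagree)  # 75, matching the example above
```

The resulting number is what would go into the likn:confidence element of the reified statement.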
6) Does anyone have any input, guidance or problems with my general
approach, or specific aspects?
Anyway, thanks in advance -- and hello!
- ben syverson
Received on Saturday, 5 March 2005 08:14:05 UTC