Re: A long but hopefully interesting introduction

Hi Ben,

This sounds like an interesting project, if a bit ambitious!

ben syverson writes:
 > 
 > Hello,
 > 
 > The last time I worked with metadata on the web was with MCF files 
 > almost ten years ago, but now I'm very anxious to dive back in. To that 
 > end, I've developed a nice juicy project to work on to help me sort 
 > through and understand all the issues involved. The project is named 
 > "likn," which is a sort of head-on collision between "liken" and 
 > "link."
 > 
 > The project itself is sort of a wiki-ish system which is syndicated in 
 > a zillion different ways, and which collects and maintains a boatload 
 > of metadata with an associated dynamic ontology. More specifically, it 
 > will be an open-source mod_perl application which supports "solo" posts 
 > and public wiki-like documents, and an associated chatbot which asks 
 > and answers questions about nodes and their relationships. The system 
 > outputs XHTML, RSS, RDF and OWL descriptions of the data and 
 > relationships contained in it. Every node is syndicated, so if a node 
 > is a Class, its RSS will reflect any new sub-classes or instances (eg, 
 > if you subscribe to "citrus," you'll get notification any time the node 
 > is edited or replied to, as well as when someone adds "mandarin orange" 
 > and classifies it as a type of citrus).
 > 

Out of interest, how do you intend to map the terms people use in the
posts to URIs in RDF?

I ask because this is a problem I've been grappling with for a
while. It's easy to build a mapping tool (I wrote an editor which
finds possible URI matches as you type), but difficult to build one
that is trivial for users to use and understand.

E.g. how do you generate URIs? How do you disambiguate words with the
same spelling (sleeper, sleeper and sleeper)? How do you disambiguate
senses of the same word (myserver1a the server, myserver1a the DNS name)?
RDF solves this problem by requiring that the author generate
separate URIs for each sense/meaning, but this doesn't map well to a
user experience.
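
To make the "one URI per sense" idea concrete, here is a rough Python
sketch. Everything in it (the SenseIndex class, the example.org base
URI, the glosses for "sleeper") is made up for illustration; it is not
how likn or any particular RDF tool does it.

```python
from collections import defaultdict


class SenseIndex:
    """Hypothetical registry that mints one URI per sense of a label."""

    def __init__(self, base="http://example.org/term/"):
        self.base = base
        self._senses = defaultdict(list)  # label -> [(uri, gloss), ...]

    def add_sense(self, label, gloss):
        """Register a new sense of `label` and return its freshly minted URI."""
        key = label.lower()
        number = len(self._senses[key]) + 1
        uri = f"{self.base}{key}/{number}"
        self._senses[key].append((uri, gloss))
        return uri

    def candidates(self, label):
        """Return every (uri, gloss) pair the user could mean by `label`."""
        return list(self._senses.get(label.lower(), []))


index = SenseIndex()
# Illustrative glosses only; the three senses are my own guesses.
index.add_sense("sleeper", "railway sleeping car")
index.add_sense("sleeper", "timber supporting railway track")
index.add_sense("sleeper", "dormant spy")

for uri, gloss in index.candidates("sleeper"):
    print(uri, "-", gloss)
```

The hard part is not the lookup but the moment where the user has to
pick one of the candidates, which is exactly where the user experience
gets awkward.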

 > 
 > 5) One aspect of the app is that users can vote on assertions. So if 
 > three people agree that "ben is an instance of person" and one person 
 > disagrees, likn is 75% sure that ben is a person. Is it best to just 
 > do this via a reified statement such as the following?
 > <rdf:Description>
 >       <rdf:subject rdf:resource="http://likn.org/dog" />
 >       <rdf:predicate rdf:resource="http://likn.org/footType" />
 >       <rdf:object rdf:resource="http://likn.org/paw" />
 >       <rdf:type 
 > rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
 >       <likn:confidence>75</likn:confidence>
 > </rdf:Description>
 > 

To be honest, this sounds like something you might get away with keeping
internal. I'd start by building your internal data structures to
support the application, and then worry about mapping to RDF
later. (RDF is very clumsy for certain things; reification and
ordered collections are two of them.)
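
As a minimal sketch of what such an internal structure might look like
(the Assertion class and its field names are my own invention, and I'm
assuming one vote per user with the latest vote winning):

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    """Hypothetical internal record for a votable statement."""
    subject: str
    predicate: str
    obj: str
    agree: set = field(default_factory=set)
    disagree: set = field(default_factory=set)

    def vote(self, user, agrees):
        """Record a vote; a user's latest vote replaces any earlier one."""
        self.agree.discard(user)
        self.disagree.discard(user)
        (self.agree if agrees else self.disagree).add(user)

    def confidence(self):
        """Percentage of voters who agree, or None if nobody has voted."""
        total = len(self.agree) + len(self.disagree)
        if total == 0:
            return None
        return 100.0 * len(self.agree) / total


a = Assertion("ben", "instanceOf", "person")
for user in ("u1", "u2", "u3"):
    a.vote(user, True)
a.vote("u4", False)
print(a.confidence())  # 75.0
```

A record like this could be serialised as a reified statement with a
confidence property later on, without the application itself having to
work with reification.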

 > 6) Does anyone have any input, guidance or problems with my general 
 > approach, or specific aspects?
 > 

If it's of any interest to you, I'm currently experimenting with an
RDF-like model without the URIs (using tags instead of URIs). It gains
simplicity at the cost of increased ambiguity. I'm experimenting with UI and
statistical methods for disambiguation.

http://www.phildawes.net/blog/category/semantic-web/tagtriples/
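
To give a flavour of the idea, here is a toy Python sketch of triples
built from plain tags. It is only an illustration under my own
assumptions (the data, the match and property_counts helpers are all
hypothetical) and is not the actual tagtriples code.

```python
from collections import Counter

# Hypothetical data: plain-string tags stand in for URIs.
triples = [
    ("myserver1a", "type", "server"),
    ("myserver1a", "type", "dnsname"),
    ("myserver1a", "resolves_to", "10.0.0.5"),
    ("myserver1a", "rack", "r12"),
]


def match(store, s=None, p=None, o=None):
    """Return triples matching the given fields; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def property_counts(store, tag):
    """Count which properties are used with `tag` as the subject."""
    return Counter(p for s, p, _ in store if s == tag)


# The same tag carries two senses, so the ambiguity is visible in the data.
senses = [o for _, _, o in match(triples, s="myserver1a", p="type")]
print(senses)  # ['server', 'dnsname']
print(property_counts(triples, "myserver1a"))
```

Counts like these are the sort of raw material a statistical
disambiguation step could work from, e.g. guessing that a statement
about "rack" concerns the server sense rather than the DNS name.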

There's some software to play with here:
http://phildawes.net/temporary/tagtriples/tag/FrenchHorn
although it's a bit out of date - I'll upload some new code soon.
(If it 500s, just refresh - there's a problem with the Python binary
on the box.)

Cheers,

Phil

Received on Saturday, 5 March 2005 10:59:51 UTC