RE: WordNet in RDF/XML: 50,000+ RDF class vocabulary...

This seems to have stopped working, though it worked great before. Any chance
of getting it going for a little while longer?

Reference to undeclared namespace prefix: 'r'. Line 14, Position 51 
<Class r:about="http://xmlns.com/wordnet/1.6/cat
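
A namespace-aware XML parser reports this kind of error when an element or
attribute uses a prefix that no enclosing element declares. Below is a minimal
Python sketch of the failure and the likely fix. It assumes the served
document means 'r' to be the RDF namespace and is simply missing the xmlns:r
declaration. The two sample documents and the example.org URI are made up for
illustration and are not copied from the server.

```python
import xml.etree.ElementTree as ET

# The RDF syntax namespace; that the server intends 'r' to mean this is an assumption.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample: uses the 'r' prefix without declaring it.
BROKEN = '<Class r:about="http://example.org/wordnet/cat"/>'

# Same hypothetical sample with the prefix declared on the element.
FIXED = '<Class xmlns:r="%s" r:about="http://example.org/wordnet/cat"/>' % RDF_NS


def is_well_formed(xml_text):
    """Return True if a namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # An undeclared prefix surfaces here as a parse error.
        return False


if __name__ == "__main__":
    print("broken parses:", is_well_formed(BROKEN))
    print("fixed parses: ", is_well_formed(FIXED))
```

If that assumption holds, adding the declaration to the root element of the
generated output should be enough to make the document parse again.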

Regards,
Richard Humpleman.

>>> -----Original Message-----
>>> From: Dan Brickley [mailto:danbri@w3.org]
>>> Sent: Thursday, December 02, 1999 5:21 PM
>>> To: www-rdf-interest@w3.org
>>> Cc: wordnet@princeton.edu
>>> Subject: WordNet in RDF/XML: 50,000+ RDF class vocabulary...
>>> 
>>> RDF IG,
>>> 
>>> I've been trying and failing to find time to write up my WordNet/RDF
>>> experiments. Instead, I thought I'd post as-is what I currently have
>>> working. Code to follow after minor cleanup.
>>> 
>>> Context:
>>> WordNet is a large lexical database, consisting of tens of thousands of
>>> commonsense English concepts. The WordNet site contains a wealth of further
>>> information, including links to WordNet's use in the information
>>> retrieval and digital library communities, as well as to spin-offs like
>>> EuroWordNet, which maps the WordNet vocabulary to non-English languages.
>>> See http://cogsci.princeton.edu/~wn/
>>> 
>>> 
>>> I'm very interested in the potential of WordNet for 'semantic web'
>>> applications, not least because the data is available unencumbered for
>>> commercial and noncommercial use. So, I spent a little time thinking about
>>> how WordNet can be mapped into RDF. There appears to be a trivial mapping
>>> from the 'noun' portion of the WordNet database to a hierarchy of RDF
>>> classes. I've not investigated models for representing the other aspects
>>> of WordNet yet.
>>> 
>>> Here's an example of the output from the command-line version:
>>> 
>>> 
>>> 	[danbri]% wn tree -hypen|more
>>> 	
>>> 	Synonyms/Hypernyms (Ordered by Frequency) of noun tree
>>> 
>>> 	2 senses of tree
>>> 
>>> 	Sense 1
>>> 	tree
>>> 	       => woody plant, ligneous plant
>>> 	           => vascular plant, tracheophyte
>>> 	               => plant, flora, plant life
>>> 	                   => life form, organism, being, living thing
>>> 	                       => entity, something
>>> 
>>> 	Sense 2
>>> 	tree, tree diagram
>>> 	       => plane figure, two-dimensional figure
>>> 	           => figure
>>> 	               => shape, form
>>> 	                   => attribute
>>> 	                       => abstraction
>>> 	
>>> 
>>> 
>>> Each 'word sense' in WordNet's collection of nouns can, I believe,
>>> simply be mapped into RDF's notion of a class. For example, 'tree' in sense
>>> one above would be the class of all trees (i.e. a subset of all the woody
>>> plants).
>>> 
>>> If we give URIs to these classes, eg.
>>> 
>>> 	http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/tree~1
>>> 
>>> we can use them as an RDF vocabulary, and represent the WordNet hierarchy
>>> as sub-class relationships.
>>> 
>>> I've rigged up a simple prototype (a tiny Perl CGI script) which wraps
>>> WordNet in a WWW interface such that, given a term and a sense number
>>> (e.g. 'tree' sense '1'), it returns an RDF description of that part of the
>>> WordNet type hierarchy. The particular strategy I adopted (which you can
>>> see if you look at
>>> http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/woody_plant~1
>>> or other URIs on my test server) is for a class URI to dereference to a
>>> sparse description of the superclasses and a verbose description of the
>>> immediate subclasses. I suspect this is back to front.
>>> 
>>> Anyway, comments welcomed. See the official WordNet site for a
>>> human-oriented HTML forms interface to the dataset, or simply guess URLs
>>> for my server (if you guess a word not in the database, you get an empty
>>> RDF graph).
>>> 
>>> More examples:
>>> http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/cat~1
>>> http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/cat~2 (i.e. sense 2 of cat)
>>> http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/geek~1
>>> 
>>> If there were an agreed URI for WordNet, instance data could look like
>>> this...
>>> 
>>> <!-- using rdf, dublin core and wordnet namespaces -->
>>> <rdf:Description>
>>> 	<WordNet:bitmap~1 rdf:about="">  
>>> 		<dc:subject>
>>> 		<WordNet:geek~1 rdf:about="http://purl.org/people/danbri"/>
>>> 		</dc:subject>	
>>> 	</WordNet:bitmap~1>
>>> </rdf:Description>
>>> 
>>> This says, 'this object is a member of the class of bitmaps; it has as its
>>> subject another object of type 'geek', whose URI is (etc...)'. So we might
>>> immediately think about using WordNet inside multimedia content,
>>> PNG/JPEG/GIF etc. to improve accessibility and searchability of the
>>> content.
>>> 
>>> The RDF type hierarchy I exposed tells us in RDF that bitmaps are a kind
>>> of picture, which are a kind of representation etc., and gives simple
>>> definitions for each (e.g. "an image represented as a two dimensional array
>>> of brightness values for pixels"). Similarly for geeks being kinds of
>>> persons etc...
>>> 
>>> I think there are a few glitches in my online demo, but it should be
>>> enough to give a flavour of the possibilities.
>>> 
>>> 
>>> Comments, suggestions etc welcomed,
>>> 
>>> Dan
>>> 
>>> 
>>> --
>>> danbri@w3.org
>>> 
>>> 
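
The mapping described in the quoted message (one class URI per noun sense,
with the hypernym hierarchy expressed as sub-class relationships) can be
sketched in Python roughly as follows. This is an illustration only, not the
Perl CGI script mentioned above. The function names are hypothetical, the
subClassOf URI is today's RDF Schema namespace rather than whatever the 1999
server emitted, and the sense numbers for every hypernym other than 'woody
plant' are assumed to be 1 because the wn output does not show them.

```python
# URI prefix taken from the example URIs in the quoted message.
BASE = "http://snowball.ilrt.bris.ac.uk/xmlns/wordnet/noun/"

# Assumption: the current RDF Schema subClassOf property URI.
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def class_uri(term, sense):
    """Build a class URI such as .../noun/woody_plant~1 from a term and sense."""
    return BASE + term.replace(" ", "_") + "~" + str(sense)


def parse_class_uri(uri):
    """Split a class URI back into (term, sense number)."""
    local = uri[len(BASE):]
    term, _, sense = local.rpartition("~")
    return term.replace("_", " "), int(sense)


def subclass_triples(chain):
    """Turn a hypernym chain (most specific first) into subClassOf triples."""
    uris = [class_uri(term, sense) for term, sense in chain]
    return [(sub, RDFS_SUBCLASS_OF, sup) for sub, sup in zip(uris, uris[1:])]


# Sense 1 of 'tree' from the wn output above, using the first synonym on each
# line; sense numbers other than woody plant~1 are assumptions.
TREE_CHAIN = [
    ("tree", 1),
    ("woody plant", 1),
    ("vascular plant", 1),
    ("plant", 1),
    ("life form", 1),
    ("entity", 1),
]

if __name__ == "__main__":
    for subject, predicate, obj in subclass_triples(TREE_CHAIN):
        print(subject, predicate, obj)
```

Each pair of adjacent entries in the chain yields one triple, so the six
classes in the 'tree' example give five sub-class statements.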

Received on Thursday, 9 December 1999 18:57:20 UTC