- From: Dan Connolly <connolly@w3.org>
- Date: Thu, 23 Feb 2006 09:46:31 -0600
- To: Mark van Assem <mark@cs.vu.nl>
- Cc: public-swbp-wg@w3.org
On Wed, 2006-02-22 at 15:20 +0100, Mark van Assem wrote:
> Hi Dan,
>
> > But many of the URIs in the wn-conversion document don't work. I get
> > 404 @ http://wordnet.princeton.edu/wn20/bank-noun-1/
>
> The URIs don't work yet because we do not have a place yet to host WN
> RDF/OWL. The idea is that Princeton implements server rewrite rules so
> that HTTP GETs are redirected to a server with the actual data run by
> an institute from our community to take that burden off of Princeton.
> Also because the correct statements to be returned (our proposal is
> the Concise Bounded Description) should be computed. Additionally, it
> allows us to introduce a "latest version URL":
>
> http://wordnet.princeton.edu/wn
>
> that always redirects to the newest version (at another server), much
> like the latest version URLs of W3C documents.
>
> Before we seek assistance to actually implement this we would like
> some feedback on this approach. What is your opinion on this?

Hmm... it seems fairly reasonable; but I don't recommend
computing the CBD on-demand. I expect "baking" will be
more manageable than "frying".
cf http://www.aaronsw.com/weblog/000404

I wonder if the demand will really be so high that a simple
web server with a pile of static files won't be able to handle
it easily.

Rather than putting big iron behind this service, I suggest
you throttle it. Aggressively advocate that tools that know
that they will rely on access to wordnet data in advance
cache the data they need, and if anybody is making, say,
more than 100 requests per hour, start returning
"401 unauthorized; get a cache" and if the server gets
busy, just return "5xx I'm too busy; you might try
the _bittorrent bulk download_".

If you have big iron to throw at the problem, you might as
well do a full SPARQL service, and not just CBDs.
See http://esw.w3.org/topic/DawgShows for several examples.
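
The throttling policy suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Throttle` class, its `status_for` method, the one-hour sliding window, and the choice of 503 for the "5xx" case are all hypothetical and not part of the original message. The 401 status is kept as written in the message; a present-day server might prefer 429 (Too Many Requests).

```python
import time
from collections import defaultdict, deque

LIMIT_PER_HOUR = 100   # "more than 100 requests per hour", per the message
WINDOW_SECONDS = 3600  # assumption: a sliding one-hour window


class Throttle:
    """Hypothetical per-client request throttle (illustrative only)."""

    def __init__(self, limit=LIMIT_PER_HOUR, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        # Timestamps of accepted requests, kept per client.
        self.hits = defaultdict(deque)

    def status_for(self, client, server_busy=False):
        """Return the HTTP status code to send for this request."""
        if server_busy:
            # "5xx I'm too busy"; 503 is an assumed concrete choice.
            return 503
        now = self.clock()
        recent = self.hits[client]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            # "401 unauthorized; get a cache", as in the message.
            return 401
        recent.append(now)
        return 200
```

With the defaults, the first 100 requests from a client within an hour get 200 and the 101st gets 401, until older requests age out of the window.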

In the cwm-related research work, we have been working
on structures for navigating big databases; when you
GET the database resource, the idea is that it comes
back with "this is a summary of the database, not the
whole contents; you can query it with SPARQL at <endpoint-xyz>".
I can't advocate that as a tried-and-true best practice yet,
but I hope to see something like it standardized eventually.

> > and I'm quite surprised to see:
> >
> >   The first step in using this conversion is selecting the
> >   appropriate version to download.
> >
> > Download? Can't I just use it there in the web?
>
> As for the download statement: in that introductory "primer" part of
> the document we would like to describe as straightforward and simple
> as possible how one could start to work with WordNet RDF/OWL (the
> minimum amount of text for people already familiar with WN and
> RDF/OWL). To keep it simple we only tell the story there for offline
> use. We could add something along the lines of "one can also query
> WordNet online..." and provide a reference to the more elaborate
> online/offline section [1]. Would that be satisfactory?

Well, no, not really. The simplest use doesn't involve downloading
anything. I don't think the "best practices" WG should give
the impression that downloading is the simplest case. The
simplest case is that I just dereference the URIs of whatever
terms I'm interested in.

If you're only going to document usage that involves downloading,
please say that it's due to some sort of limitation, a la:

  Ordinary lookup[webarch 3.1] of wordnet terms in the Web
  is in progress but not yet available; for now, we suggest
  you download the data in bulk ...

  [webarch 3.1]
  http://www.w3.org/TR/2004/REC-webarch-20041215/#dereference-uri

>
> Cheers,
> Mark.
>
> [1]http://www.w3.org/2001/sw/BestPractices/WNET/wn-conversion.html#querying
>

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
Received on Thursday, 23 February 2006 15:46:39 UTC