Re: WNet review from Kjetil Kjernsmo on 2006-03-21 (public-swbp-wg@w3.org from March 2006)

From: Kjetil Kjernsmo <kjetilk@opera.com>
Date: Tue, 21 Mar 2006 17:36:42 +0100
To: public-swbp-wg@w3.org
Cc: Mark van Assem <mark@cs.vu.nl>
Message-Id: <200603211736.43271.kjetilk@opera.com>

On Wednesday 01 March 2006 14:24, Mark van Assem wrote:
> Hi Kjetill,

Hi Mark!

Sorry for the extremely delayed response; I've been terribly busy.

> > Hmmm, I don't know. I think many have an intuitive "file system"
> > image, and would decompose the URI
> > http://wordnet.princeton.edu/wn/2.0/bank/noun#1
> > to mean "the word senses of bank", "then just the nouns of bank"
> > and the "first noun sense". That's what I would do.
>
> Maybe I understood incorrectly, but the example you give is not the
> proposal in the draft, e.g.
>
> http://wordnet.princeton.edu/wn/bank-noun-1/

Yep, my example is how I would like it to be. :-)
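
A minimal sketch of the "file system" reading described above, assuming
the layout /wn/<version>/<lemma>/<part-of-speech>#<sense-number>. The
function name and the returned keys are hypothetical; only the example
URI comes from the discussion.

```python
from urllib.parse import urlsplit

def decompose(uri):
    """Read a hierarchical WordNet-style URI as successively narrower parts."""
    parts = urlsplit(uri)
    # Drop empty segments produced by the leading slash.
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: /wn/<version>/<lemma>/<part-of-speech>
    _prefix, version, lemma, pos = segments
    # The fragment, if present, is read as the sense number.
    sense = int(parts.fragment) if parts.fragment else None
    return {"version": version, "lemma": lemma, "pos": pos, "sense": sense}

print(decompose("http://wordnet.princeton.edu/wn/2.0/bank/noun#1"))
```

This only illustrates how a human might read the URI; as noted further
down in the thread, agents should not infer what a GET returns from the
URI alone.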

> which actually has the same sequence of information, it only "chunks"
> the local part together. If other people also find this proposal
> counter-intuitive we can of course go for your proposal (but without
> the hashes I would say).

Well, my argument is not so much about which kind of delimiter people
find more intuitive; I don't think that is very important in this
context either way. I'm more concerned with the "reasonable chunk". :-)

> > However, your argument in the document is that: "The disadvantage
> > of hash URIs is that when a HTTP GET is done [...] the browser will
> > return the whole document", which is a valid argument for not
> > returning the whole wordnet database on that GET. It is not,
> > however, a complete argument against returning a set of senses.
>
> Ok, next version will contain a more elaborate answer (including the
> argument in my last mail). Hope that's ok?

Yup, cool.
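
On the hash URI point quoted above: a client removes the fragment before
issuing the HTTP request, so the size of the "whole document" depends on
what the part before the hash identifies. A small sketch using the
example URI from this thread (the helper name is hypothetical):

```python
from urllib.parse import urldefrag

def document_to_fetch(uri):
    """Return (URL actually requested over HTTP, fragment kept by the client)."""
    url, fragment = urldefrag(uri)
    return url, fragment

first = document_to_fetch("http://wordnet.princeton.edu/wn/2.0/bank/noun#1")
second = document_to_fetch("http://wordnet.princeton.edu/wn/2.0/bank/noun#2")

# Both senses resolve to the same retrievable document: a set of senses,
# not the whole database.
print(first[0] == second[0])
```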

> > I'm much more inclined to focus on returning a "reasonable chunk"
> > for any HTTP GET. Unintended retrievals of the whole WN db should
> > be avoided, but a single WordSense seems like an unreasonably small
> > chunk to me.
>
> You're saying that an advantage of your slash URI format is that
> there are more bindings to differently-sized chunks of WordNet, e.g.
> a URI for all bank-wordsenses? In effect you'd be making possible
> more GETs that return the same RDF as a SPARQL query (as opposed to
> when the current URI proposal is used, where you can only retrieve
> single WordSenses, not sets of WordSenses)?

Well, yes, that's a side effect. Whether or not that's evil is of course 
an open question. 

> Why would returning a single WordSense (its CBD) be unreasonably
> small?

Well, admittedly, my choice of "reasonable chunk" is rather arbitrary 
and not based on scientifically sound evidence or real world experience 
from application implementation. 

However, a single sense is very little data, and RDF/XML (like most XML)
is rather verbose: once you have all the required elements and
namespace declarations, you will probably have on the order of the same
amount of markup as actual data. Some applications will only look for a
certain sense, and so will throw away data when given a larger chunk.
Other applications, however, will have to take the overhead of
retrieving a disproportionate amount of markup and use a larger number
of GETs to achieve the desired result.

I think it would be reasonable to return quite a lot more data than 
markup on average for a given GET request. 
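
A rough way to check the data-to-markup balance for a given response is
sketched below. The RDF/XML document is a made-up single-sense example:
the wn: schema namespace, the property names and the gloss text are
assumptions for illustration, not taken from the draft. "Data" is
counted crudely as element text plus attribute values.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDF/XML for one word sense (schema and gloss are invented).
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:wn="http://example.org/wn-schema#">
  <wn:WordSense rdf:about="http://wordnet.princeton.edu/wn/bank-noun-1/">
    <wn:lemma>bank</wn:lemma>
    <wn:gloss>sloping land beside a body of water</wn:gloss>
  </wn:WordSense>
</rdf:RDF>
"""

def payload_fraction(xml_text):
    """Share of characters that are data rather than markup (crude estimate)."""
    root = ET.fromstring(xml_text)
    data_chars = 0
    for elem in root.iter():
        # Element text, ignoring indentation-only whitespace.
        data_chars += len((elem.text or "").strip())
        # Attribute values such as rdf:about; namespace declarations are
        # not exposed as attributes by ElementTree, so they count as markup.
        data_chars += sum(len(value) for value in elem.attrib.values())
    return data_chars / len(xml_text)

print(f"data fraction: {payload_fraction(DOC):.2f}")
```

Running the same function over a response that carries several senses
shows how the fixed cost of the namespace declarations is spread over
more data.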

> > would require a (SPARQL) query, I don't think agents should infer
> > what will be returned based on the URI itself.
>
> I completely agree, but that doesn't mean we can't create URIs that
> are meaningful to humans, helping them to parse data or write GETs.
> Or do I misunderstand you here?

No, we are completely aligned here. However, I think that the 
"reasonable chunk" is more important than "intuitive delimiters".

Best,

Kjetil
-- 
Kjetil Kjernsmo
Information Systems Developer
Opera Software ASA
Received on Tuesday, 21 March 2006 16:38:16 UTC