
Re: [WN] Fwd: WordNet Namespace

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Tue, 13 Dec 2005 14:13:08 +0000
Message-ID: <439ED6F4.2050109@hpl.hp.com>
To: Jacco van Ossenbruggen <Jacco.van.Ossenbruggen@cwi.nl>
CC: Aldo Gangemi <aldo.gangemi@istc.cnr.it>, public-swbp-wg@w3.org, schreiber@cs.vu.nl, mark@cs.vu.nl, Benjamin.Nguyen@inria.fr

Jacco van Ossenbruggen wrote:
> 
> Jeremy Carroll wrote:
> 
>> How big is the file if you use hash URIs?
> 
> I'm not sure we mean the same thing by hash URIs.  I was talking about 
> hash URIs as in http://wordnet.princeton.edu/rdf#entity versus non-hash 
> URIs as in http://wordnet.princeton.edu/rdf/entity.  

Me too.


> I do not see how 
> the difference relates to file size.  


Let's suppose we have 150MB of data.

If we have http://wordnet.princeton.edu/rdf#entity then we have one file 
of 150MB. If we want to look up this URI, we have to download 150MB from 
http://wordnet.princeton.edu/rdf and then parse it and find the triples 
concerning http://wordnet.princeton.edu/rdf#entity

If we have http://wordnet.princeton.edu/rdf/entity then we may have, say, 
50,000 files of 4KB each (notice that this is somewhat more in total, about 
200MB).

If we want to look up http://wordnet.princeton.edu/rdf/entity then we 
download 4KB.

The latter is much more practical.
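
To make the comparison concrete, here is a minimal sketch in Python. It 
relies on the fact that an HTTP client strips the fragment before making 
the request, so a hash URI always resolves to the whole document. The 
sizes are the rough figures assumed above, not measurements, and the 
helper names (document_to_fetch, bytes_per_lookup) are hypothetical.

```python
from urllib.parse import urldefrag

KB = 1000          # decimal units, matching the rough figures in the text
MB = 1000 * 1000

def document_to_fetch(uri: str) -> str:
    """Return the URI a client actually requests: the fragment is dropped."""
    return urldefrag(uri).url

def bytes_per_lookup(uri: str, sizes: dict) -> int:
    """Bytes downloaded to look up one URI, given assumed document sizes."""
    return sizes[document_to_fetch(uri)]

hash_uri = "http://wordnet.princeton.edu/rdf#entity"
slash_uri = "http://wordnet.princeton.edu/rdf/entity"

# Assumed sizes: one 150MB file for the hash layout,
# one 4KB file per term for the slash layout.
sizes = {
    "http://wordnet.princeton.edu/rdf": 150 * MB,
    "http://wordnet.princeton.edu/rdf/entity": 4 * KB,
}

hash_cost = bytes_per_lookup(hash_uri, sizes)    # whole 150MB document
slash_cost = bytes_per_lookup(slash_uri, sizes)  # just the 4KB document
total_slash = 50_000 * 4 * KB                    # all small files together

print(document_to_fetch(hash_uri))
print(hash_cost // slash_cost)
print(total_slash // MB)
```

Under these assumptions the slash layout stores more data in total but 
transfers far less per lookup, which is the trade-off described above.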

Jeremy




> BTW, the current files are using 
> hash URIs:
> 
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE rdf:RDF [
>    <!ENTITY wn 'http://wordnet.princeton.edu/wn#'>
>    <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
>    <!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
> ]>
> 
> <rdf:RDF  xmlns:wn="&wn;" xmlns:rdf="&rdf;" xmlns:rdfs="&rdfs;" 
> xml:lang="en">
> <wn:Word rdf:about="&wn;entity"
>    wn:lexicalForm="entity">
>  <wn:sense rdf:resource="&wn;entity-n-1"/>
> </wn:Word>
> 
> I do not see how you can save space by changing the URIs.
> 
>> Isn't the size a showstopper for hashes in this application?
>>
>> I had always believed that this was one of the primary examples of why 
>> hash URIs were not the only true way.
> 
> I'm missing something here.  Can you elaborate on this?
> 
> Thanks, Jacco
> 
> 
Received on Tuesday, 13 December 2005 14:14:46 GMT
