Re: [WN] Fwd: WordNet Namespace

Hi Jacco,

> rdf files themselves and I think the sheer size is worth 
> discussing.  The various files now add up to over 150MB uncompressed 
> RDF/XML, when loaded in SWI-Prolog it gives a memory footprint of over 
> 300MB.
> I think it is useful to
> - at least mention the footprint so users are warned
> - compare the footprint to the other conversions, explain the difference 
> and argue what the benefits are

Ok, this can be done.

> - think about the possibility for a lean and mean version.

The "convenience" requirement might be better satisfied by (a) 
removing the inverses, as you and Jan argued before; and (b) splitting 
the files further, e.g. into separate ones for the noun and verb 
hierarchies. Maybe this already gives enough of a reduction?
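
To make option (a) concrete, here is a minimal sketch of what removing 
the inverses amounts to. The property names (ex:hyponymOf, 
ex:hypernymOf) and the toy triples are assumptions for illustration 
only, not the names used in the actual conversion.

```python
# Hypothetical mapping: property to drop -> inverse property to keep.
INVERSE_OF = {"ex:hypernymOf": "ex:hyponymOf"}


def drop_inverses(triples, inverse_of=INVERSE_OF):
    """Rewrite every triple that uses a dropped property into its kept
    inverse direction, then deduplicate the result."""
    kept = set()
    for s, p, o in triples:
        if p in inverse_of:
            # (s, dropped, o) carries the same information as (o, kept, s)
            kept.add((o, inverse_of[p], s))
        else:
            kept.add((s, p, o))
    return sorted(kept)


# Toy data: the same relationship stated in both directions.
triples = [
    ("ex:dog", "ex:hyponymOf", "ex:animal"),
    ("ex:animal", "ex:hypernymOf", "ex:dog"),
]
reduced = drop_inverses(triples)
```

Each relationship that was stated in both directions is kept only 
once, so for such relations the number of triples is roughly halved.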

Something I would like your input on is what the relation between 
size and convenience is. It is not very fair to compare this 
conversion to e.g. one that does not have all hierarchies or does not 
have all relationships. Note that I already put each relation in a 
separate file, so that is configurable and allows for a fairer size 
comparison.

> If most users end up ignoring this version because other versions are 
> so much smaller, this would be in strong conflict with the second "it 
> should be convenient to work with" requirement mentioned in [1].

That is an important problem. But then we need some way of telling 
what is convenient and what is not.

One point where we might gain a lot (i.e. reduce size) is by 
representing words and word senses directly as labels on synsets. But 
then you lose the ability to annotate with WordSenses. So my concrete 
question is: is it desirable to lose this ability in exchange for a 
size reduction?
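
To illustrate the trade-off, here is a rough sketch that counts 
triples under both modellings. The synset ids, the ex: class and 
property names, and the toy data are all made up for the example; the 
real ratio depends on the actual schema and on how many words each 
synset has.

```python
# Hypothetical toy data: synset id -> word forms. Not taken from WordNet.
SYNSETS = {
    "synset-bank-noun-1": ["bank", "depository"],
    "synset-car-noun-1": ["car", "auto", "automobile"],
}


def with_word_senses(synsets):
    """One WordSense resource per (synset, word) pair, so that each
    sense has its own URI and can be annotated."""
    triples = []
    for synset, words in synsets.items():
        triples.append((synset, "rdf:type", "ex:Synset"))
        for i, word in enumerate(words, start=1):
            sense = f"{synset}-sense-{i}"
            triples.append((sense, "rdf:type", "ex:WordSense"))
            triples.append((synset, "ex:containsWordSense", sense))
            triples.append((sense, "rdfs:label", word))
    return triples


def with_labels_only(synsets):
    """Words attached directly to the synset as labels; there is no
    sense resource left to annotate."""
    triples = []
    for synset, words in synsets.items():
        triples.append((synset, "rdf:type", "ex:Synset"))
        for word in words:
            triples.append((synset, "rdfs:label", word))
    return triples


n_senses = len(with_word_senses(SYNSETS))
n_labels = len(with_labels_only(SYNSETS))
```

In this toy modelling every word costs three triples with WordSenses 
and one triple with plain labels, which is where the size reduction 
would come from.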

Cheers,
Mark.

-- 
  Mark F.J. van Assem - Vrije Universiteit Amsterdam
        markREMOVE@cs.vu.nl - http://www.cs.vu.nl/~mark

Received on Tuesday, 13 December 2005 12:15:28 UTC