- From: Mark van Assem <mark@cs.vu.nl>
- Date: Wed, 29 Mar 2006 11:45:30 +0200
- To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
- CC: SWBPD list <public-swbp-wg@w3.org>
Hi David,

Thanks for looking into this.

> may be handy as an easy query mechanism. However, I don't know whether
> applications would already know that they want only noun usages of
> "bank". So perhaps something like
>
> http://wordnet.princeton.edu/wn/wordsense/bank/
>

I see what you mean. There's one problem left: this approach would give a URI clash for all the sub-types of wordsenses:

http://wordnet.princeton.edu/wn/wordsense/noun/

can refer to all NounWordSenses, or to the WordSenses with a Word with the lexical label "noun". Some kind of prefix could be introduced to solve this, e.g. "type-noun".

> would be better. Also, the language may need to be indicated, so
> perhaps something like the following would work better:
>
> http://wordnet.princeton.edu/wn/wordsense/en/bank/

Why is that necessary? We already know that WN is in "en-US". URI clashes with other WN's can be easily avoided by having another base URI.

> 2. Starting from a particular URI that is an RDF node in a triple, look
> up related information. In this case, I don't think the application
> would (or should) know to deconstruct the URI in order to do a broader
> query, so I don't think the above mechanism would be appropriate for
> this usage. (But please correct me if you think I'm wrong.)

Well, I think an application should not rely on this. But in practice it would probably be programmed to do so if it gets the job done. I think this is slippery terrain, but I can't decide either way.

> BTW, one thing I notice in looking over the WordNet document[1] that you
> mentioned: It seems a little odd that there are different lexical
> conventions used for forming the different kinds of URIs that are used.

I am revising this proposal in response to points raised by Kjetil [2].
The current idea is going towards something like

http://wordnet.princeton.edu/wn20/synset/type-noun/107909067-bank/
http://wordnet.princeton.edu/wn20/wordsense/type-noun/bank/1
http://wordnet.princeton.edu/wn20/word/bank/
http://wordnet.princeton.edu/wn20/schema/participleOf

> For example, the document shows the following NounSynset, WordSense and
> Word URIs (respectively):
>
> http://wordnet.princeton.edu/wn20/107909067-bank-n/
> http://wordnet.princeton.edu/wn20/bank-noun-1/
> http://wordnet.princeton.edu/wn20/word-bank/

<snip>

> This seems odd for a few reasons:
>
> 1. If I've understood correctly, in a URI like
>
> http://wordnet.princeton.edu/wn20/107909067-bank-n/
>
> the "-bank-n" that is tacked on after the synset ID is redundant,
> presumably provided as a convenient reminder to a human reader. I think

Correct.

> providing this human convenience is a good idea (very helpful in
> debugging), but I'm also wondering: shouldn't a particular word sense be
> enough to unambiguously identify a particular synset? Is the synset ID
> really needed? Couldn't the above URI be more conveniently constructed
> as:
>
> http://wordnet.princeton.edu/wn20/synset/bank-noun-1/

You mean that here you use the information of one of a synset's wordsenses to unambiguously identify the synset? This is possible because a WordSense belongs to exactly one Synset. However, I'd change it to:

http://wordnet.princeton.edu/wn20/synset/bank/noun/1/

because this allows more flexibility in the possible URIs that can be used as queries, e.g.

http://wordnet.princeton.edu/wn20/synset/bank/noun/

returns all noun synsets with a wordsense that has the word "bank", and:

http://wordnet.princeton.edu/wn20/synset/bank/

returns all synsets with a wordsense that has the word "bank".

> i.e., "synset/bank-noun-1" acts as a unique synset identifier. Would
> that work or did I misunderstand something? It would be nice to get rid
> of the arbitrary synset ID numbers if they are not needed.

Good point.
I'm not sure, though: some users might want the ID to stay because they already use it. Note that such a use would imply parsing the URI and inferring information from it.

> 2. Sometimes "noun", "verb", etc., are abbreviated as "n", "v", etc. and
> sometimes they are spelled out.

Yep, I agree. That bothered me also, so the new proposal always spells them out.

OK, so the new proposal would be:

http://wordnet.princeton.edu/wn20/synset/bank/type-noun/1/
http://wordnet.princeton.edu/wn20/wordsense/type-noun/bank/1/
http://wordnet.princeton.edu/wn20/word/bank/
http://wordnet.princeton.edu/wn20/schema/participleOf/

(the type-prefix is only required for wordsenses, but it is probably more consistent to also use it in the synset URIs).

I will implement this proposal for URIs for RDF nodes in the new draft (should be done Monday before the telecon).

I'd like to leave the other discussion (the mapping between URIs that do not correspond to concrete RDF nodes but rather to sets of them) for later. The reason for this is that I'd like to get a new version out by Monday so that the draft may reach First WG Draft status before the end of the charter. If I am correct, the issue is orthogonal to the main focus of the Note, namely a correct conversion of WN to RDF/OWL. I will describe this as a discussion issue in the new version of the draft [1]. Could you live with that?

With regards,
Mark.

[1] http://www.w3.org/2001/sw/BestPractices/WNET/wn-conversion.html
[2] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Mar/0076

--
Mark F.J. van Assem - Vrije Universiteit Amsterdam
markREMOVE@cs.vu.nl - http://www.cs.vu.nl/~mark
Received on Wednesday, 29 March 2006 09:45:42 UTC
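
The four URI patterns in the final proposal of the message above, and the clash that motivates the "type-" prefix, can be sketched in Python as follows. This is only an illustration, not code from the draft: all function names are hypothetical, and the percent-encoding of path segments is an assumption, since the message does not say how lemmas containing spaces or slashes would be written. The prefix also only helps if no lemma itself starts with "type-", which is assumed here.

```python
from urllib.parse import quote

# Base URI taken from the examples in the message (WordNet 2.0).
BASE = "http://wordnet.princeton.edu/wn20/"


def _seg(text: str) -> str:
    """Percent-encode a single path segment (assumed convention)."""
    return quote(text, safe="")


def word_uri(lemma: str) -> str:
    # Pattern from the message: word/<lemma>/
    return f"{BASE}word/{_seg(lemma)}/"


def wordsense_uri(lemma: str, pos: str, sense: int) -> str:
    # Pattern from the message: wordsense/type-<pos>/<lemma>/<n>/
    return f"{BASE}wordsense/type-{_seg(pos)}/{_seg(lemma)}/{sense}/"


def synset_uri(lemma: str, pos: str, sense: int) -> str:
    # Pattern from the message: synset/<lemma>/type-<pos>/<n>/
    return f"{BASE}synset/{_seg(lemma)}/type-{_seg(pos)}/{sense}/"


def schema_uri(name: str) -> str:
    # Pattern from the message: schema/<name>/
    return f"{BASE}schema/{_seg(name)}/"


def wordsenses_for_word(lemma: str) -> str:
    """Hypothetical set-valued URI: all wordsenses of one word."""
    return f"{BASE}wordsense/{_seg(lemma)}/"


def wordsenses_of_type(pos: str, prefixed: bool = True) -> str:
    """Hypothetical set-valued URI: all wordsenses of one type.

    With prefixed=False this reproduces the clash described in the
    message: the result is indistinguishable from the URI for the
    word whose lexical label happens to be the type name.
    """
    segment = f"type-{pos}" if prefixed else pos
    return f"{BASE}wordsense/{_seg(segment)}/"


if __name__ == "__main__":
    print(synset_uri("bank", "noun", 1))
    print(wordsense_uri("bank", "noun", 1))
    print(word_uri("bank"))
    print(schema_uri("participleOf"))
    # Without the prefix, "all noun wordsenses" and "all senses of the
    # word 'noun'" collapse onto the same URI.
    print(wordsenses_of_type("noun", prefixed=False) == wordsenses_for_word("noun"))
    print(wordsenses_of_type("noun") == wordsenses_for_word("noun"))
```

Keeping each pattern in its own small function mirrors the point made in the message that clients should not have to parse URIs: a consumer would call a constructor like these, or follow links in the RDF, rather than take a URI apart.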