- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Tue, 28 Mar 2006 18:26:40 -0500
- To: "Mark van Assem" <mark@cs.vu.nl>
- Cc: "SWBPD list" <public-swbp-wg@w3.org>
Mark,

> From: Mark van Assem [mailto:mark@cs.vu.nl]
> . . .
> > I don't see it as naming sets of nodes. I see it as naming a Web
> > location where you can get some useful RDF data. And that URI does
> > not happen to also be an RDF node in a triple that you previously
> > encountered. From this point of view, it is no different in
> > principle from most regular Web pages that just serve data.
>
> So in principle there is nothing against this approach?

Correct. I do not see any problem from a Web architectural point of view.

> > I think the question is how users (and particularly software agents)
> > would know that they could use such a URI, given that it is not an
> > RDF node in a published triple. URIs should normally be treated as
> > opaque, so you normally can't assume that you can just chop off the
> > "1/" from the end of the URI in order to get related data.
>
> Right, that is something that an agent would need to know. But such
> an agent would also need to know about certain classes and properties
> to achieve the same result using SPARQL.
>
> What would your advice be: describe this possibility in the coming new
> WN draft [1] if it is something that is/might be appropriate or keep it
> out (either as something "evil" or because it still needs discussion)?

I would assume that the two main ways of using WordNet would be:

1. Starting from a word (not a URI), look up information related to that
   word, which will give you URIs that are RDF nodes in triples. In this
   usage, a URI query like

     http://wordnet.princeton.edu/wn/wordsense/noun/bank/

   may be handy as an easy query mechanism. However, I don't know
   whether applications would already know that they want only noun
   usages of "bank". So perhaps something like

     http://wordnet.princeton.edu/wn/wordsense/bank/

   would be better. Also, the language may need to be indicated, so
   perhaps something like the following would work better:

     http://wordnet.princeton.edu/wn/wordsense/en/bank/

2. Starting from a particular URI that is an RDF node in a triple, look
   up related information. In this case, I don't think the application
   would (or should) know to deconstruct the URI in order to do a
   broader query, so I don't think the above mechanism would be
   appropriate for this usage. (But please correct me if you think I'm
   wrong.)

BTW, one thing I notice in looking over the WordNet document [1] that you
mentioned: it seems a little odd that different lexical conventions are
used for forming the different kinds of URIs. For example, the document
shows the following NounSynset, WordSense and Word URIs (respectively):

  http://wordnet.princeton.edu/wn20/107909067-bank-n/
  http://wordnet.princeton.edu/wn20/bank-noun-1/
  http://wordnet.princeton.edu/wn20/word-bank/

Aside from the http://wordnet.princeton.edu/wn20/ prefix and the trailing
slash, the lexical patterns for the three seem to be (in perl):

  ($synsetID)-($word)-($lexGroupLetter)
  ($word)-($lexGroupName)-($n)
  word-($word)

where

  $synsetID       = synsetID pattern             = [0-9]+
  $word           = word pattern                 = [a-zA-Z_]+
  $lexGroupLetter = lexical group letter pattern = [nvasr]
  $lexGroupName   = lexical group name           = {noun,verb,adjective,...}
  $n              = word sense number            = [0-9]+

This seems odd for a few reasons:

1. If I've understood correctly, in a URI like

     http://wordnet.princeton.edu/wn20/107909067-bank-n/

   the "-bank-n" that is tacked on after the synset ID is redundant,
   presumably provided as a convenient reminder to a human reader. I
   think providing this human convenience is a good idea (very helpful
   in debugging), but I'm also wondering: shouldn't a particular word
   sense be enough to unambiguously identify a particular synset? Is
   the synset ID really needed? Couldn't the above URI be more
   conveniently constructed as:

     http://wordnet.princeton.edu/wn20/synset/bank-noun-1/

   i.e., "synset/bank-noun-1" acts as a unique synset identifier. Would
   that work or did I misunderstand something? It would be nice to get
   rid of the arbitrary synset ID numbers if they are not needed.

2. Sometimes "noun", "verb", etc., are abbreviated as "n", "v", etc.,
   and sometimes they are spelled out.

3. The lexical components are not always in the same place. I would
   have expected something like:

     http://wordnet.princeton.edu/wn20/synset/107909067-bank-n/
       (or http://wordnet.princeton.edu/wn20/synset/bank-noun-1/ )
     http://wordnet.princeton.edu/wn20/wordsense/bank-noun-1/
     http://wordnet.princeton.edu/wn20/word/bank/

Of course, you may have other compelling reasons, which I do not know
about, for constructing the URIs as you have already shown.

David Booth

Reference
> [1] http://www.w3.org/2001/sw/BestPractices/WNET/wn-conversion.html
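For illustration only, the three lexical patterns described in this message can be written out as regular expressions. The following is a minimal sketch in Python rather than Perl; the function name classify, the group names, and the table layout are hypothetical and are not part of the WordNet draft. Because the message leaves the list of lexical group names open ("..."), the sketch assumes any lowercase name rather than guessing the full list.

```python
import re

# Prefix taken from the example URIs in the message.
PREFIX = "http://wordnet.princeton.edu/wn20/"

# Building blocks transcribed from the message.
SYNSET_ID = r"[0-9]+"
WORD = r"[a-zA-Z_]+"
LEX_LETTER = r"[nvasr]"
LEX_NAME = r"[a-z]+"   # assumption: the message elides the full name list
SENSE_NUM = r"[0-9]+"

# One pattern per URI kind, applied to the part between the prefix
# and the trailing slash.
PATTERNS = [
    ("synset", re.compile(
        rf"(?P<synset_id>{SYNSET_ID})-(?P<word>{WORD})-(?P<lex_letter>{LEX_LETTER})")),
    ("wordsense", re.compile(
        rf"(?P<word>{WORD})-(?P<lex_name>{LEX_NAME})-(?P<n>{SENSE_NUM})")),
    ("word", re.compile(rf"word-(?P<word>{WORD})")),
]


def classify(uri):
    """Return (kind, components) for a recognised URI, else None."""
    if not uri.startswith(PREFIX) or not uri.endswith("/"):
        return None
    local = uri[len(PREFIX):-1]  # strip the prefix and the trailing slash
    for kind, pattern in PATTERNS:
        match = pattern.fullmatch(local)
        if match:
            return kind, match.groupdict()
    return None


if __name__ == "__main__":
    for example in (
        "http://wordnet.princeton.edu/wn20/107909067-bank-n/",
        "http://wordnet.princeton.edu/wn20/bank-noun-1/",
        "http://wordnet.princeton.edu/wn20/word-bank/",
    ):
        print(example, "->", classify(example))
```

Note that this kind of parsing is something only the publisher of the URIs (or a debugging tool) would reasonably do; as discussed above, a client application should normally treat the URIs as opaque.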
Received on Tuesday, 28 March 2006 23:29:36 UTC