- From: Mark van Assem <mark@cs.vu.nl>
- Date: Thu, 20 Apr 2006 23:26:22 +0300
- To: "Ralph R. Swick" <swick@w3.org>
- CC: SWBPD list <public-swbp-wg@w3.org>
Hi all,

I propose to use the URI schema below for WordNet. Argumentation follows. I will finish a version of the Draft with this proposal and send it a.s.a.p., but at the latest on Monday morning before the telecon. This is one of the last chances to get this work into First Public Working Draft status, and I personally feel that this issue is not important enough to be the reason *not* to get this Draft out. If this is too narrow a view on my part, please object.

- http://wordnet.princeton.edu/wn20/instances/synset-bank-noun-1
- http://wordnet.princeton.edu/wn20/instances/wordsense-bank-noun-1
- http://wordnet.princeton.edu/wn20/instances/word-bank
- http://wordnet.princeton.edu/wn20/schema/participleOf

The main reason is that it enables separate management/versioning of schema and instances. I do not think having a separate namespace for synsets or words/wordsenses actually helps; see the argumentation later in this email (detailed answers to Ralph).

> It turns out that custom entities can't be done within an HTML document
> so I would consider it a show-stopper to choose between otherwise
> technically similar options the ones that don't fit as QNames [or CURIEs :) ].

Ok, yet another reason to remove the slashes from the XML names.

> This kind of usage is somewhat beyond what you have proposed
> in the current editors' draft. It implies that there is some sort of
> linguistic behavior of the WordNet data such that, e.g., one might
> be able to say that VerbSynset is a subClass of rdf:Property.

I do not want to imply any interpretation. I am just saying that someone wishing to use this interpretation is unable to do so with our former proposal (with slashes in the XML names). And I think there is some research that wants to do this interpretation, or will want to.

> Either way, I agree that making this possibility hard to implement
> by choosing names with syntactic restrictions should be avoided
> if it is not otherwise inconvenient.

Ok :-)

> I think it is a good idea to give separate namespaces to the
> terms used to model the data and the data itself. When we

Ok, so at least two namespaces then.

> get to writing down more best practices for vocabulary
> management, I anticipate that we would find it advantageous
> to be able to separately version the modelling terminology
> and the instance data.

Although I am not sure that one would want to provide a new version of the schema without a revised set of instances, we cannot rule out the possibility. And you seem to be alluding to other possible management issues for which it would be useful to separate schema and instances.

>
>>... Another option is to create property names that definitely do not conflict with words, e.g. by introducing a prefix. Then we can put everything in one namespace. E.g. with URIs
>>
>>- http://wordnet.princeton.edu/wn20/synset-bank-noun-1
>>- http://wordnet.princeton.edu/wn20/wordsense-bank-noun-1
>>- http://wordnet.princeton.edu/wn20/word-bank
>>- http://wordnet.princeton.edu/wn20/schema-participleOf
>
>
> If we collapse everything down to one namespace then all of this
> prefix information has to be repeated in each use of every term in
> WordNet. This doesn't feel convenient to me -- and will upset
> those who care about the size of XML documents they have to
> generate (I'm not particularly one of the latter, but they do tend
> to be vocal.)

I am not sure that I understand this argument. Would it be crucial to have four instead of two namespaces? I am currently going for just two because I don't see the benefit of four except more flexibility at the cost of more management. Splitting the schema and the instances is a frequently made decision, and in all those situations the instances of ALL the classes are mixed in one namespace.

> Making the synset namespace be separate gives us a nice
> easy way to refer to "WordNet Basic" -- it's just the namespace
> name of the synset portion.
> There may be more triples in
> this synset part of the data than the current draft defines for Basic
> but I expect not enough more to really upset users. Avoiding
> a requirement to have separate names for "basic" and "full"
> feels good; the application simply chooses to fetch only the parts
> of the vocabulary that it needs.

Actually this does not work. The synset part would be the FULL part of the data, which is different from the Synset BASIC version. The difference is that the FULL does not contain the senseLabels (the set of labels attached to all the Synset's Words) but only a single rdfs:label.

Of course a simple solution would be to add the senseLabels in the online version. Then developers interested in the Synsets and labels only get what they want, and the Full users get a little duplicate data.

Another solution is serving BASIC and FULL in two separate namespaces. But then the connection between FULL and BASIC is lost, and even more management needs to be done.

But I think that any developer who is keen on saving triples can just download the BASIC version, even cutting off the pieces not needed as s/he sees fit. If a developer wants to use the online version because s/he only needs a few Synsets then querying a bit more to get to the Word-labels connected to the Synsets does not hurt. So I think we should just serve FULL online.

> message. In any case, it is important that we work through the
> details sufficiently to persuade ourselves that we have names
> that work in practice and that have semantics that we will be
> able to explain.

I hope the above and the earlier discussions on the list are enough. To really be able to make the right decision I think we need a discussion in a wider audience and some practical experience with the data to tell us if this is ok. For now I'd like to fix this Draft as a First WD before it evaporates because the SWBP WG's time's up.

Cheers,
Mark.
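
P.S. A minimal sketch of the two-namespace scheme proposed above, in Python. The two namespace URIs and the example names are taken from the proposal; the helper names (instance_uri, schema_uri, is_ncname) are hypothetical, and the name check is a simplified ASCII-only approximation of an XML NCName, so it is only an illustration of why slash-free local names fit as QNames, not part of the Draft.

```python
import re

# Namespace URIs as given in the proposal in this message.
INSTANCES_NS = "http://wordnet.princeton.edu/wn20/instances/"
SCHEMA_NS = "http://wordnet.princeton.edu/wn20/schema/"

# Assumption: simplified ASCII-only stand-in for the XML NCName production.
# The real production also admits many non-ASCII characters.
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(local_name: str) -> bool:
    """Return True if local_name matches the simplified NCName pattern."""
    return _NCNAME.fullmatch(local_name) is not None


def instance_uri(kind: str, *parts: str) -> str:
    """Build an instance URI, e.g. .../instances/synset-bank-noun-1."""
    local_name = "-".join((kind, *parts))
    if not is_ncname(local_name):
        raise ValueError(f"not usable as a QName local part: {local_name!r}")
    return INSTANCES_NS + local_name


def schema_uri(term: str) -> str:
    """Build a schema URI, e.g. .../schema/participleOf."""
    if not is_ncname(term):
        raise ValueError(f"not usable as a QName local part: {term!r}")
    return SCHEMA_NS + term


if __name__ == "__main__":
    print(instance_uri("synset", "bank", "noun", "1"))
    print(instance_uri("wordsense", "bank", "noun", "1"))
    print(instance_uri("word", "bank"))
    print(schema_uri("participleOf"))
    # A local name with slashes fails the check, which is the QName problem
    # with the former proposal.
    print(is_ncname("synset/bank/noun/1"))
```

Since every instance shares one namespace and every modelling term shares the other, a document needs only two prefix declarations, and the schema and the instances can be versioned independently by changing one namespace URI at a time.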
Received on Thursday, 20 April 2006 20:26:24 UTC