- From: Dan Brickley <danbri@w3.org>
- Date: Sat, 26 Nov 2005 14:21:16 +0000
- To: Aldo Gangemi <aldo.gangemi@istc.cnr.it>
- Cc: Jacco van Ossenbruggen <Jacco.van.Ossenbruggen@cwi.nl>, public-swbp-wg@w3.org
Aldo Gangemi wrote:

>
> Hi Jacco, some comments inside
>
> At 10:03 +0100 26-11-2005, Jacco van Ossenbruggen wrote:
>
>> Review of http://www.cs.vu.nl/~mark/wn/wn-conversion.html
>>
>> I agree with the comments posted previously by Jeremy (see below).
>> In addition, as a reader I was a bit confused about the many open
>> issues. What makes things worse is that the possible solutions to
>> many of the open issues are insufficiently documented for me, as the
>> reader, to form an opinion about them.
>> Minor remarks:
>> -Section 3, explains the prolog format of
>> s(100003009,1,"living_thing",n,1,1):
>> Please also explain the last three arguments, or state that they
>> are explained in Appendix A
>>
>> -Section 4, do not forget to resolve [WHY DOES WORD NOT HAVE THESE
>> SUBCLASSES?].
>> -Figure caption "The clas hierarchy of WordNet:", fix typo in class,
>> remove ending colon
>> -You do not use subClassOf a la Brickley. Maybe an example of how to
>> get the same semantics using
>> RDF meta modeling is in place?
>
>
> The same semantics cannot be got.

Ah, big discussion.

What we're doing here is representing Wordnet as a lexical database.
That's fine, worthy and important (and also a bridge to SKOS, where we
describe conceptual entities and terms associated with them, but don't
model natural language so explicitly).

What I did was build a simple-minded ontology FROM the structures
captured by Wordnet hypernyms. I think the semantics are in there. The
data is bad, scruffy, sure. But the *meaning* of wordnet "hypernym" as
defined does carry a semantic that can be captured in rdfs:subClassOf.
HOWEVER this doesn't mean that all RDF representations of wordnet should
do this: it is useful, but so is the lexical view. A machine-friendly
relationship between the two approaches (wordnet-as-words vs
wordnet-noun-hierarchies-as-a-model-of-the-world) would be an
interesting addition, btw.
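A rough sketch of what the two views look like side by side, using plain Python tuples as triples. Everything named here is an assumption for illustration: the "wn:" identifiers, the "wn:hyponymOf" property and the three sample statements are made up, not taken from any published WordNet vocabulary or checked against the WordNet data.

```python
# Hypothetical names throughout: the "wn:" identifiers and the
# "wn:hyponymOf" property are illustrative, not a published vocabulary.
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
HYPONYM_OF = "wn:hyponymOf"

# Lexical view: hyponymy is an ordinary property between synsets.
lexical_view = [
    ("wn:cowboy_hat", HYPONYM_OF, "wn:hat"),
    ("wn:hat", HYPONYM_OF, "wn:headdress"),
    ("wn:headdress", HYPONYM_OF, "wn:clothing"),
]


def to_class_view(triples):
    """Re-read each hyponymOf statement as an rdfs:subClassOf statement."""
    return [(s, RDFS_SUBCLASS_OF, o) for (s, p, o) in triples if p == HYPONYM_OF]


def superclasses(cls, triples):
    """Collect every class reachable from cls via rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


class_view = to_class_view(lexical_view)
print(sorted(superclasses("wn:cowboy_hat", class_view)))
```

Run against the lexical view, superclasses() finds nothing, since a plain property licenses no class inference; run against the class view, it follows the chain transitively. That difference is the whole point of choosing one reading or the other.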
http://wordnet.princeton.edu/gloss
[[
hypernym
The generic term used to designate a whole class of specific instances.
Y is a hypernym of X if X is a (kind of) Y.

hyponym
The specific term used to designate a member of a class.
X is a hyponym of Y if X is a (kind of) Y.
]]

(hmm, I thought there was a more subclassy definition somewhere else in
the wordnet docs... it does sound more like rdf:type than
rdfs:subClassOf here...)

> subClassOf formally means set inclusion, while "hypernymOf" is only a
> property, which is formally equivalent to the existence of an ordered
> pair across two sets. Moreover, while "set" in the first semantics is
> the extension of the class of individuals named by a synset, "set" in
> the second semantics is the extension of the class of all synsets.

It is possible to do it both ways. If we do it with 'hypernym of' being
a plain property, we are building a representation of the English
language as seen from Wordnet. If we do it with rdfs:subClassOf, we are
building a representation of the *world* as seen from the parts of
English language expressed in Wordnet noun hierarchies (ie. not touching
on verbs, events, etc).

> Technically, a mapping could be done between the two semantics, but
> the interpretation of all synsets as classes and of all hypernymOf
> relations as subClassOf is untenable wrt intuition, because many
> synsets refer to individuals,

...that's a bug in the data, not the metamodel, one might argue.

> many hypernymOf relations refer to instanceOf (rdf:type), and there are
> other problems. This means that semantic porting needs data
> reengineering, not just schema translation.

Yes, it wouldn't make a very high quality ontology. But often, RDF users
know which words "make sense", eg. I might use "Cowboy Hat" but not
"Paris" as an RDF class in my data, since it is (semi-)obvious that the
latter isn't a good term to use as a class.
So, my approach has been to expose all of Wordnet (the old 1.6) as URIs,
and people use the ones that work as categories, and ignore the ones
that should never have been classes.

> Similar problems have been shown for many thesauri in the past and in
> particular in the SKOS work.

SKOS helps reflect these ambiguous 'broader' structures into RDF, and
therefore - I hope - helps us articulate a roadmap from the world of
thesauri into the world of ontologies...

> A second draft (if time permits) should treat the semantic porting of
> WordNet. Of course, an example can be added also in the current one.
>
>> -The document suggests there has not yet been contact with Princeton
>> about the namespace. Should this not be
>> done before going public? If not, has a meeting with Princeton
>> already been scheduled?
>
>
> The contact has been created months ago, and we have just sent a
> message to Christiane Fellbaum to point her at the material for the
> port, and eventually create the namespace.

If you could cc: the Working Group list on that stuff, it'd help with
transparency, so everyone in the taskforce (and rest of the group) knows
where things are up to. Eg. there's a question of "what should go at the
namespace" which is very relevant to both the SKOS/PORT and Vocab
Management taskforces (Alistair's work in particular...).

cheers,

Dan

>> -How to generate URIs for other languages? Related to
>> resolving:[THIS IGNORES LANGUAGE ISSUE! should we append language
>> indicator?]. Also related: URI vs IRI (How to deal with non-latin1
>> languages).
>> Do translations use the same Prolog format? Does the converter
>> program also work for these translations?
>> -In appendix A, would it make sense to adopt the prolog convention of
>> writing Variables with a starting capital?
>> As a prolog programmer, it took me a while to realize what was an
>> atom, literal or variable/placeholder in the prolog code fragments.
>>
>> Jacco
>>
>> Jeremy Carroll wrote:
>>
>>>
>>>
>>> Reviewed document:
>>> http://www.cs.vu.nl/~mark/wn/wn-conversion.html
>>>
>>>
>>> 1. the abstract is not an abstract
>>>
>>> 2. abstract/sotd or intro needs to set expectations about target
>>> audience and contribution of this document, and its non-objectives
>>>
>>> i.e.
>>> [[
>>> The TF should produce guidelines for transforming existing wordnets
>>> into
>>> an RDF/OWL representation. Guidelines should describe strategies for
>>> converting wordnets-like structures into an RDF representation, as well
>>> as strategies for re-describing in RDF/OWL the content originally
>>> conveyed in the wordnets.
>>> ]]
>>>
>>> 3. URI issue could/should be expanded, highlighted somewhat.
>>> Covering:
>>> - do the terms like synset etc need a different URI from the terms in
>>> the wordnet itself (e.g. #bank-1)
>>> - different URIs for different versions?
>>> - hash (one huge file) versus slash (303 response? WebArch issue)
>>>
>>> Jeremy
>>
>
>
Received on Saturday, 26 November 2005 14:21:01 UTC