- From: Dan Brickley <danbri@w3.org>
- Date: Tue, 2 Nov 2004 08:34:33 -0500
- To: public-swbp-wg@w3.org
danbri's raw notes from Wordnet breakout, day 2 of SWBP f2f.

=================

Redundant grepped summary:

grep ACTION SWBP_Wordnet_TF_Breakout.txt

ACTION: danbri propose some URI formatting rules for word senses and synsets
ACTION: guus make tests on URI labels once suggested
ACTION: andreas investigate Wordnet maintenance policy re synset IDs
ACTION: brian ask Aldo to review meronym superproperty decision
ACTION: andreas investigate Sentence Frame / Fr relationship in the Prolog, find examples etc.
ACTION: guus ask Jan W to write Prolog transformation into RDF/XML

grep RESOLVED SWBP_Wordnet_TF_Breakout.txt

RESOLVED: we'll define a custom property wn:lexicalForm that subproperties rdfs:label; it has a cardinality of precisely-1.
RESOLVED: to consult w/ Princeton team (thru brian) re requirements on glossary entry, does it need xml literals?
RESOLVED: remove meronym superproperty [pending review from aldo]
RESOLVED: not to add a verb group for now
RESOLVED: to include some basic OWL assertions in schema before 1st WD

=================

present: guus, brian, andreas, danbri

guus: if we can find some external people, eg. students, to progress this...
brian: yesterday I thought we decided it was my problem
guus: ok
brian: useful to walkthrough...
starting w/ diagram in the document

guus: my main problem is with notion of wn:WordSense
...in mine I'd made it a bnode, without a URI.
danbri: whether to use bnodes is orthogonal to vocab design
guus: wordnet has IDs for some things, but not for word senses
...a compound key approach
...so if you want URIs for/from word senses, you need to compose one somehow.
brian: we have a general issue, against all of this, which is "What URIs to use?". Why special here?
guus: generally you'd assume some princeton based identifier
...dan's concerned that numeric IDs not v usable
guus: could assume URIs for synsets are composed of princeton base URI then first word from db-ordered synset, then '-' then identifier.
brian: ...
danbri: If we expect this design to morph into one that does nouns-as-classes, we need to think about prettiness in the rdf/xml syntax
guus: re synsets... re Bank...
andreas: direct link from word to synset, or indirectly?
...
brian: we set out with a goal of just representing the lexical form
....
discussion of metamodel based approach
guus: we've said synset is subclass of class; hypernym is subproperty of rdfs:subClassOf.
brian: Core of structure is synsets, collections of words w/ similar meaning. They can be typed (noun, verb, ...). 'bank' has many senses; bank-as-financial-institution is 1 word sense. bank-at-side-of-river is another word sense for 'bank'. eg. 'cat': a word written as 'cat'; there's a sense of the word as cat; there's another which is an abbrev for caterpillar truck. There are relationships between word senses (synonym, antonym etc.) and between synsets, like hyponym etc.
guus: it would help, for the lexical rep, if we could generate URIs with some human readability.
danbri's proposal: "take the word('s lex form), and the sense number, joined by '-'".
guus: 'for every word sense, you have a sense number'; each word sense in a synset has a sense number; they're also ordered. Film-5 isn't the 5th word in a synset but the 5th sense of 'film'.
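
The two URI shapes discussed above (danbri's word-plus-sense-number form for word senses, guus's first-word-plus-identifier form for synsets) could be sketched roughly as follows. This is only an illustration, not an agreed rule: the base URI, the function names, the percent-encoding of awkward characters and the sample synset identifier are all assumptions made for the example.

```python
from typing import Sequence
from urllib.parse import quote

# Hypothetical base URI; the notes only speak of a "princeton base URI".
BASE = "http://example.org/wordnet/"


def word_sense_uri(lexical_form: str, sense_number: int, base: str = BASE) -> str:
    """Join the word's lexical form and its sense number with '-'."""
    if sense_number < 1:
        raise ValueError("sense numbers are assumed to start at 1")
    # Percent-encoding is an assumption; the notes do not say how to
    # handle spaces or other characters that are unsafe in URIs.
    return base + quote(lexical_form, safe="") + "-" + str(sense_number)


def synset_uri(ordered_words: Sequence[str], synset_id: str, base: str = BASE) -> str:
    """Join the first word of the db-ordered synset and the synset ID with '-'."""
    if not ordered_words:
        raise ValueError("a synset needs at least one word")
    return base + quote(ordered_words[0], safe="") + "-" + synset_id


# 'Film-5' is the 5th sense of 'film', not the 5th word of some synset.
film_5 = word_sense_uri("film", 5)

# Invented placeholder identifier, not a real Wordnet synset ID.
some_synset = synset_uri(["film", "movie"], "100000001")
```

Both shapes share one namespace in this sketch, so a word sense and a synset could in principle end up with the same URI if a synset identifier happened to look like a sense number; that is the kind of thing a uniqueness test over the full database would need to confirm or rule out.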
guus: main problem is identifying synsets, which have ugly numeric ids
...the things in a synset are ordered; some people take the first one.
guus: we are prepared to compromise the purity of having a lexical representation by taking into account usability of URI structures we invent.
guus: tricky bit w/ rdf ... is ordering information
...neither my current rep nor this one handles ordering
brian: one issue i have on my list.... there's a backbone structure,
...given this info, you can generate a whole load of other stuff, eg. inverse relations. It'd be useful to be able to define inverse properties, but that doesn't mean that we populate the triples in princeton_rdf.tar.gz or whatever
...ie which triples do we want as base vs inferred?

DECISION: we will strive for human-friendly URIs (where we have them)

ACTION: danbri propose some URI formatting rules for word senses and synsets

guus: we can define some test cases here, can ask our implementor; to check we have genuinely unique URIs.

ACTION: guus make tests on URI labels once suggested

[discussion ... of multilingual labels]
princeton wordnet is english; doesn't make language distinctions
guus: I'd assumed that we'd default to rdfs:label for cat, dog, film etc.
danbri: I like naming relationships and making them subproperty of rdfs:label
guus: ah, didn't realise you had a separate resource for Word
brian: I was asking earlier if we really need it
danbri: is wn:lexicalForm OWL Functional?
brian: yup
brian: when we talk about a word, is the word 'chat'(en) same as 'chat'(fr)
...I put it in as I wasn't sure if we need the indirection or not
...also somewhat historical, i had it in to avoid confusion between Word and Word sense
danbri: I like having it in there
guus: then we need to think about URIs for Words
danbri: Is wn:lexicalForm inverse-functional as well?
brian: depends on what we do wrt language.
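
The functional / inverse-functional question for wn:lexicalForm can be made concrete with a small check over (Word, lexical form) pairs. This is a sketch under assumptions: the Word identifiers are made up, literals are modelled as (text, language) tuples, and English 'chat' and French 'chat' are treated as two distinct Words, which is exactly the point the discussion above leaves open.

```python
from collections import defaultdict


def functional_violations(pairs):
    """Return subjects that have more than one distinct value."""
    values = defaultdict(set)
    for subject, value in pairs:
        values[subject].add(value)
    return {s: v for s, v in values.items() if len(v) > 1}


def inverse_functional_violations(pairs):
    """Return values that are shared by more than one subject."""
    return functional_violations((value, subject) for subject, value in pairs)


# Made-up Word identifiers, each with one (text, language) lexical form.
statements = [
    ("word-chat-en", ("chat", "en")),
    ("word-chat-fr", ("chat", "fr")),
]

# With language tags kept, every literal picks out a single Word.
tagged_clashes = inverse_functional_violations(statements)

# With language tags dropped, the literal 'chat' is shared by two Words.
untagged = [(word, text) for word, (text, _lang) in statements]
untagged_clashes = inverse_functional_violations(untagged)
```

Under these assumptions the property stays functional either way (each Word has one lexical form), while inverse-functionality holds only if the language is treated as part of the value.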
guus: dropping Word would simplify the model
brian: doubles the triples, bloat factor, but you might use it for talking about Words
danbri: is this one of those cases where we could have it in our model, but ship only the more concise shortcut representation?
guus/brian: yup. could keep it in the diagram but shaded out
guus: brian, your pref is to have a custom property for wn literal, and not use rdfs:label
guus: is precisely once functionality; every word sense has exactly 1 label in any given language.
danbri: could use OWL's class specific constraints?
guus: also visualization tools make use of rdfs:label

RESOLVED: we'll define a custom property wn:lexicalForm that subproperties rdfs:label; it has a cardinality of precisely-1.

Issues discussion: use of xml literal, eg. for glossary entry, lexical form.
danbri: let's do whatever skos does (or vice versa)
guus: no indication now that we need anything more than plain literals

RESOLVED: to consult w/ Princeton team (thru brian) re requirements on glossary entry, does it need xml literals?

Guus: maintenance... versioning?
danbri: this may be a difference between lexical and class-centric representations; former doesn't involve changing the namespace (much)
guus: they have a mapping table of identifiers between versions
brian: I think there are now synset IDs (@@check)

ACTION: andreas investigate Wordnet maintenance policy re synset IDs

Q: Do we want a meronym superproperty?
brian: it's complicated, I don't understand wordnet 2.0 meronyms
...there are various subproperties of meronym
...member, substance, part
...but also meronymOf
guus: no
brian: yes!
guus: mm specifies that the second synset is a member meronym
...assuming this is the case, I reckon not a superproperty
[...]
brian: guus is right...
guus: if other people want the superproperty, ...
hmm let's be minimal

RESOLVED: remove meronym superproperty [pending review from aldo]

guus: would be good if Aldo and Nicola could look at this topic

ACTION: brian ask Aldo to review meronym superproperty decision

Q: There is a concept of a group of verbs, but no way to name or refer to a group of verbs. Invent verb group class?
brian: currently there is no such class, it's minimal
danbri: what's an example group of verbs? any use cases?
brian: see Prolog description. Specifies verb synsets similar in meaning
danbri: next layer up of clustering above synsets?
brian: cluster of synsets, presumably w/ closure
guus: 'vgp' in the Prolog
brian: there is a concept specified there, a notion of Group, as you try to infer the abstract model

RESOLVED: not to add a verb group for now

danbri: what I think we decided earlier: "We expect future revisions of this work to explore the elaboration of the lexical representation of Wordnet by making explicit some relationship between nouns and RDF classes. We have not decided that this is a design constraint on the lexical representation, beyond favouring human-friendly URIs (which might be usable in RDF/XML typedNode XML elements); we have not committed to a metamodelling approach that uses the same URIs for nouns and their associated classes".
[nobody objects to this, but not much enthusiasm for making it a formal resolution.]

Q: How to represent the Fr relation?
brian: I didn't understand what it meant!
guus: me neither

ACTION: andreas investigate Sentence Frame / Fr relationship in the Prolog, find examples etc.

Q: re new relations in Wordnet 2.0, ...
brian: someone needs to check the database files, not those in the prolog
...go thru database schema files, see if anything missing in the prolog
guus: i've always assumed the Prolog representation is complete
...and that therefore we can base tools on the Prolog
...if that assumption is not true, we're in trouble
guus: should we explicitly record that assumption?
brian: is there any mention in the Prolog of stable synset IDs?
danbri: re questions for Princeton, ask on Wordnet users list rather than direct to Princeton, as many others can answer those Qs
brian: ok

re tag-count... there is a corpus that's counted. It changes release to release. Student project work found the number in the db not so useful. Wanted more of a relative frequency count.

ACTION: guus ask Jan W to write Prolog transformation into RDF/XML

guus: would be good to have test cases
danbri: could use OWL for data integrity checks
brian: no OWL in the schema yet
guus: I only did stuff that was clear from the schema (symmetric/reflexive etc)
danbri: agree symmetric useful, but can we use it for data integrity checking? (as can w/ Functional etc)
... general agreement to get the OWL in there, but when?
brian: is it essential for first WD?
guus: it is essential for symmetric properties
danbri: worth doing, yup
brian: ok
guus: i've listed most of these

RESOLVED: to include some basic OWL assertions in schema before 1st WD

guus: we can also maybe make transitive closure triple dump

ADJOURNED for what remains of lunch.
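
The symmetric-property check and the transitive closure dump mentioned above could look roughly like this. It is a sketch only: relations are plain (subject, object) pairs with made-up identifiers, and the symmetry check reads the data as closed-world, which is an assumption of the example. Declaring a property symmetric in OWL licenses inferring the missing direction; it does not by itself flag the data as wrong.

```python
def missing_symmetric(pairs):
    """Return pairs (a, b) that are asserted without the matching (b, a)."""
    asserted = set(pairs)
    return {(a, b) for (a, b) in asserted if (b, a) not in asserted}


def transitive_closure(pairs):
    """Return every (a, c) where c is reachable from a in one or more steps."""
    successors = {}
    for a, b in pairs:
        successors.setdefault(a, set()).add(b)

    closure = set()
    for start in successors:
        seen = set()
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(successors.get(node, ()))
    return closure


# Made-up antonym pairs; ('wet', 'dry') lacks its reverse direction.
antonyms = [("hot", "cold"), ("cold", "hot"), ("wet", "dry")]
gaps = missing_symmetric(antonyms)

# Made-up hypernym chain: cat -> feline -> carnivore.
hypernyms = [("cat", "feline"), ("feline", "carnivore")]
dump = transitive_closure(hypernyms)
```

Whether the closure triples would ship in the base dump or be left for consumers to infer is the same base-versus-inferred question raised earlier in the notes.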
Received on Tuesday, 2 November 2004 13:34:34 UTC