Re: [WNET] new data and schema files

Hi,

> OK. Looking at the changes you made to convertwn20.pl
> the content of the output won't change, just the names.

Correct.

> I don't understand why one is singular and the other plural, but OK :)

An artefact of changing opinions; we can change that in the future :)

> The editor's draft  [1], 24 May revision does not clearly (to me) specify
> how the senseLabel value is computed. I interpret the text
> 
>    "The property value is filled with the lexical forms that are attached
>    to Words in the Full version."
> 
> to mean that there is a senseLabel statement with the value
> of the lexicalForm property of the word in each wordsense
> in the synset.  That is, if there are multiple wordsenses in a
> synset there should be multiple senseLabel statements for
> that synset.  And one of those senseLabels should match the
> rdfs:label of the synset.
> 
> Am I correct?

Absolutely correct.
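
To make that reading concrete, here is a minimal sketch in Python over plain
tuples. The identifiers (synset-bank, ws-bank, ...) and the "word" link from
a wordsense to its word are assumptions made up for illustration; only
containsWordSense, lexicalForm and senseLabel come from the discussion. It is
not the conversion script itself.

```python
from collections import defaultdict

# Hypothetical miniature of the Full version. The identifiers and the
# "word" link are made up for illustration, not copied from the data files.
FULL = {
    ("synset-bank", "containsWordSense", "ws-bank"),
    ("synset-bank", "containsWordSense", "ws-depository"),
    ("ws-bank", "word", "word-bank"),
    ("ws-depository", "word", "word-depository"),
    ("word-bank", "lexicalForm", "bank"),
    ("word-depository", "lexicalForm", "depository"),
}


def objects(triples, subject, predicate):
    """Return all objects o of triples matching (subject, predicate, o)."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def sense_labels(triples):
    """Map each synset to the lexical forms of the words in its wordsenses."""
    labels = defaultdict(set)
    for synset, predicate, word_sense in triples:
        if predicate != "containsWordSense":
            continue
        for word in objects(triples, word_sense, "word"):
            labels[synset] |= objects(triples, word, "lexicalForm")
    return dict(labels)


for synset, labels in sense_labels(FULL).items():
    print(synset, sorted(labels))
```

A synset with two wordsenses ends up with two senseLabel values, one per
lexical form.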

> statements for the synsets are redundant.  We could simply
> declare wn20:senseLabel to be rdfs:subPropertyOf rdfs:label
> and use only the senseLabel property in wordnet-synset.rdf.

The problem with this approach is that every senseLabel would then also
count as an rdfs:label, so it becomes hard to choose which rdfs:label
value to use in graphical displays.

As things stand, people can still get this behaviour by loading the
senselabels file and adding the subPropertyOf statement themselves.
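
A rough Python sketch of that trade-off, under stated assumptions: the
triples are hypothetical stand-ins for the synset and senselabels files, and
the function applies the RDFS subproperty rule in a single pass (no
transitive closure over chains of subproperties).

```python
# Hypothetical stand-ins for the two data files and the extra statement.
SYNSET_FILE = {("synset-bank", "rdfs:label", "bank")}
SENSELABELS_FILE = {
    ("synset-bank", "wn20:senseLabel", "bank"),
    ("synset-bank", "wn20:senseLabel", "depository"),
}
SUBPROP = {("wn20:senseLabel", "rdfs:subPropertyOf", "rdfs:label")}


def apply_subproperty_rule(triples):
    """(p subPropertyOf q) and (s p o) entail (s q o); single pass only."""
    parents = {(s, o) for s, p, o in triples if p == "rdfs:subPropertyOf"}
    inferred = {
        (s, parent, o)
        for s, p, o in triples
        for child, parent in parents
        if p == child
    }
    return set(triples) | inferred


def labels(triples, subject):
    """All rdfs:label values of a subject."""
    return {o for s, p, o in triples if s == subject and p == "rdfs:label"}


separate = SYNSET_FILE | SENSELABELS_FILE
merged = apply_subproperty_rule(separate | SUBPROP)
print(sorted(labels(separate, "synset-bank")))
print(sorted(labels(merged, "synset-bank")))
```

Kept separate, the synset has a single rdfs:label for display; with the
subproperty statement added, a display tool sees every senseLabel as an
rdfs:label and has to pick one.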

> The words in "Use of rdfs:label" in Appendix D do suggest that
> there are synsets with multiple distinct senseLabels.  I'll try to

Yes, that is correct, so this text can stay the same, right?

> do a query to find such a case.  BTW -- there's another typo;
> that paragraph in Appendix D refers to lexicalLabel when it
> must mean lexicalForm.

Whoops. Corrected!

> Yes, it can be computed by clients but it seems that a
> preponderance of the data will have an rdfs:label property
> on each synset that exactly matches the only senseLabel
> property for that synset, so we could simply collapse the
> senselabels.rdf data into synset.rdf.

I don't follow; what do you mean by a preponderance?
Do you mean that there are synsets with only one wordsense, in which
case the senseLabel matches the rdfs:label? For the reason given above,
I think it is best to keep rdfs:label and senseLabel separate.

> And I'm thinking that if I am correctly understanding the use of
> senseLabel and rdfs:label on synsets we could make Basic
> be a proper subset of Full and eliminate one file.

See previous argument :-)
But if we agree to mix the senseLabels in, you are right: Basic would be
a subset of Full, which would affect the CBDs. Even then, there is
nothing against keeping them in a separate file, right?

> in Section 3 "Selecting and Querying the appropriate WN version"
> shows an inSynset property which, per Appendix D, is the inverse
> of containsWordSense though it's missing from Figure 2.  I think

The idea is not to mention the inverses in the running text, as they are
only available to OWL users.

The reasoning behind this separation is as follows. Every property has
an OWL inverse in the schemas. Non-OWL users can simply traverse the
property in the reverse direction, so there is no need to introduce
additional data. The inverse triples are redundant; previous versions
did include them, but Jan Wielemaker and Jacco van Ossenbruggen pointed
out that they add no functionality, while WN was already heavy for the
demo server of the MultimediaN project.
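
As an illustration only, the following Python sketch shows the reverse
lookup on plain tuples. The identifiers are hypothetical, and
materialise_inverse merely mimics what storing explicit inverse triples
would cost; it is not the code that generated the earlier data.

```python
# Hypothetical data: only the forward property is stored.
DATA = {
    ("synset-bank", "containsWordSense", "ws-bank"),
    ("synset-bank", "containsWordSense", "ws-depository"),
}


def in_synset(triples, word_sense):
    """Answer inSynset by reading containsWordSense backwards."""
    return {
        s for s, p, o in triples
        if p == "containsWordSense" and o == word_sense
    }


def materialise_inverse(triples):
    """Add one explicit inSynset triple per containsWordSense triple."""
    inverse = {
        (o, "inSynset", s)
        for s, p, o in triples
        if p == "containsWordSense"
    }
    return set(triples) | inverse


print(in_synset(DATA, "ws-bank"))
print(len(DATA), len(materialise_inverse(DATA)))
```

The reverse lookup gives the same answer as an explicit inSynset triple
would, while storing the inverses doubles the number of triples for that
property.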

Thanks for your careful reading!

Mark.

Received on Wednesday, 24 May 2006 18:42:30 UTC