Re: Coining a specific vocabulary for synsets in the OntoLex model from Wim Peters on 2013-04-16 (public-ontolex@w3.org from April 2013)

From: Wim Peters <w.peters@dcs.shef.ac.uk>
Date: Tue, 16 Apr 2013 12:21:49 +0100
To: John McCrae <jmccrae@cit-ec.uni-bielefeld.de>
Cc: Armando Stellato <stellato@info.uniroma2.it>, public-ontolex <public-ontolex@w3.org>
Message-ID: <CAL54QL=jLb3MwQZpPowYMkEbqfc3YKKXugkq-6Yv16Y2AC6Opg@mail.gmail.com>
> I don't think we should explicitly say the ontolex:LexicalForm is a
> skosxl:Label. In fact, the lexical form represents the orthographic union
> of surface forms of words, that is the same form (at least according to the
> lemon definition, itself based directly on the LMF definition) can have
> multiple strings (e.g., spelling variants, version in other writing
> systems, pronunciations, segmentations, etc.) unlike a SKOS-XL label.

I can see ontolex:LexicalForm as a special subclass of skosxl:Label, one
that captures the lexical information skosxl:Label does not.
In this way ontolex:LexicalForm is a natural extension of the (general)
standard.
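
A minimal sketch of what I mean, in Turtle. The ontolex namespace below is a placeholder (the vocabulary is not fixed yet), and ontolex:writtenRep is an assumed name borrowed from lemon:writtenRep. The form keeps the single skosxl:literalForm that SKOS-XL requires of every label, and the spelling variants hang off the extra property:

```turtle
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skosxl:  <http://www.w3.org/2008/05/skos-xl#> .
# Placeholder namespaces, assumed for illustration only.
@prefix ontolex: <http://example.org/ontolex-draft#> .
@prefix ex:      <http://example.org/lexicon#> .

# The form class as a specialisation of the SKOS-XL label class.
ontolex:LexicalForm rdfs:subClassOf skosxl:Label .

# One form, several strings: the literal form stays unique,
# the variants are carried by the (assumed) ontolex:writtenRep.
ex:colour_form a ontolex:LexicalForm ;
    skosxl:literalForm "colour"@en-GB ;
    ontolex:writtenRep "colour"@en-GB , "color"@en-US .
```

A plain SKOS-XL consumer would still see a well-formed label here, while a lexicon-aware one could read the variants as well.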

Wrt the choice between the two for modelling WN, I think the decision
depends on how much WN information you want to capture.
The Prolog version contains less information than the database files.
For the Prolog version an orthographic union of surface forms is not
necessary. skosxl:Label will do, since WN Prolog only contains canonical
synset elements.
I would prefer ontolex:LexicalForm only if you want to capture the
surface forms in the exception list files, which pair inflected forms
with their base forms. Only ontolex:LexicalForm can capture this distinction.
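
To make the contrast concrete, here is a rough sketch for the pair "geese goose" from the noun exception list. The ontolex namespace is a placeholder, and ontolex:LexicalEntry, ontolex:canonicalForm, ontolex:otherForm and ontolex:writtenRep are assumed names carried over from lemon 1.0; the synset and entry URIs are made up:

```turtle
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl:  <http://www.w3.org/2008/05/skos-xl#> .
# Placeholder namespaces, assumed for illustration only.
@prefix ontolex: <http://example.org/ontolex-draft#> .
@prefix ex:      <http://example.org/wn#> .

# Prolog-level information: the synset only needs its canonical word.
ex:synset_goose a skos:Concept ;
    skosxl:altLabel ex:goose_label .
ex:goose_label a skosxl:Label ;
    skosxl:literalForm "goose"@en .

# Exception-list information: base form and inflected form are kept apart.
ex:goose_entry a ontolex:LexicalEntry ;
    ontolex:canonicalForm ex:goose_base ;
    ontolex:otherForm ex:goose_plural .
ex:goose_base a ontolex:LexicalForm ;
    ontolex:writtenRep "goose"@en .
ex:goose_plural a ontolex:LexicalForm ;
    ontolex:writtenRep "geese"@en .
```

With labels alone, "goose" and "geese" would just be two strings attached to the same synset, with nothing to say which one is the base form.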

Cheers,
Wim

On Tue, Apr 16, 2013 at 10:19 AM, John McCrae <
jmccrae@cit-ec.uni-bielefeld.de> wrote:

> Yes I agree that we should introduce a specific name in our model for
> Synset.
>
> Firstly, the modelling proposed for WordNet is based on existing modelling
> (i.e. lemon (1.0) and SKOS), hence the usage of skos:Concept.
>
> As for a new class, I am not so keen on the name SemanticIndex. I would
> assume that the best would simply be to call it Synset, so as to ease
> adoption among the wider community. I dislike "semantic index" because I
> associate it with Latent Semantic Indexing, where it is more of a
> signature of a concept than a concept itself.
>
> I don't think we should explicitly say the ontolex:LexicalForm is a
> skosxl:Label. In fact, the lexical form represents the orthographic union
> of surface forms of words, that is the same form (at least according to the
> lemon definition, itself based directly on the LMF definition) can have
> multiple strings (e.g., spelling variants, version in other writing
> systems, pronunciations, segmentations, etc.) unlike a SKOS-XL label.
>
> Regards,
> John
>
>> 4) IMHO, we should coin a specific vocabulary for each element
>> of the lexicon model, and then inherit (where appropriate) from
>> SKOS/SKOSXL, to distinguish such elements which belong only to a lexical
>> resource from those of any generic KOS. In the wiki, John wonders if what I
>> called “SemanticIndex” is not a skos:Concept, and I reply: “yes it is, in
>> fact my proposal is that our vocabulary for describing lexical resources
>> can inherit from the SKOS/SKOS-XL one”. If you look at the example, even
>> John did this, as the LexicalForm is nothing different from a skosxl:Label
>> (where lemon:writtenRep could be replaced by skosxl:literalForm) though it
>> may be worth creating a dedicated class. I would thus suggest:
>> LexicalForm rdfs:subClassOf skosxl:Label
>> but to use skosxl:literalForm instead of lemon:writtenRep
>>
>> Maybe, in this specific case, we need not even invent a new name and can
>> simply reuse skosxl:Label, which after all is not so bad and fits our
>> needs quite well… (as it is already related to something
>> specifically thought for language).
>>
>> On the contrary, for LLD, I would necessarily restrict the class
>> skos:Concept to the class of elements which we expect to host things like
>> the WordNet Synset class. You can see my sample extension-point above in
>> the wiki (“Examples of Modelling in RDF (Alternative approach)”), though by
>> no means do I insist on <SemanticIndex> (that was a placeholder, taken from a
>> previous work). In any case I think “Sense” is not appropriate
>> (lemon:sense evokes the sense relation well, but I don’t like to see a
>> class of “Senses”; to me, being a sense is more a role in a given
>> relationship than an intrinsic property of an object).
>>
>> a. While I think that a more-specific-than-skos:Concept class
>> would be welcome for Lexical Linked Data (such as WordNet), and thus put in
>> the middle of the LexicalEntry --> ??? --> OntologyResource template, I’m
>> not sure that lemon:sense (the first arrow) should necessarily be
>> restricted to it. John’s use of skos:Concept in the middle suggested to me
>> that even a generic well-lexicalized KOS could be used for providing
>> LexicalEntries and Senses to enrich an ontology. However, I’m still
>> thinking about it…
>>
Received on Tuesday, 16 April 2013 11:22:18 UTC