RE: lexicalization count

Dear Philipp,

 

Well, on some aspects, I couldn't agree more. These are the kind of things we
were actually considering for LIME (see points 1 and 3) and which we
temporarily cut because they didn't fit the Lemon model, so we are happy to
stretch the dress :-) . I will go through them one by one (OK, the first point
is quite long, but you get two other very short ones for free :D ).

 

1) I propose to introduce a property ontolex:gloss as a subproperty of
rdfs:comment to allow for adding definitions of senses. While one could use
rdfs:comment for sure, people will be looking for such a property. The
recent work by Roberto Navigli on transforming BabelNet to lemon shows that
people look for such a property and, if not available, reinvent it
themselves.

 

Ok, just one warning here: we introduced in ontolex the class
"LexicalConcept". While pushing for it on the one side, I myself was
wondering what it was actually offering more than a skos:Concept class. The
point (and my reply to my own doubts :D ) was that sometimes classes (be it in
ontology modeling or even in OO programming) are used merely as tags, to
recognize the nature of something, only for "intensional" reasons, without
any need to introduce additional fields. In our case, LexicalConcept tells
that "that semantic thing" is not a domain concept, but actually the
factorization of word senses which collapse into a common understanding
(thus resulting in these words being synonyms). There would be no lexical
concept if there were no words for it, as it exists because of the words,
and not the contrary.

Now, the word "gloss" is used in the literature and has been used in
traditional resources to refer to descriptions of both (what we would call:
wwwc) senses and (wwwc) LexicalConcepts, and which one mostly depends -
trivially, and very pragmatically - on how the resource is organized.
Typically, a resource with (wwwc) LexicalConcepts has glosses attached to them
(see WordNet, where glosses are attached to synsets), while if the resource
has only senses (most dictionaries), then glosses are attached to those.

I think it would be good to allow for both, though. What would you have in
mind? Diversify the property into two different ones? Use the same property
with domain LexicalConcept+LexicalSense?
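
To make the two options concrete, here is a minimal sketch in plain Python (no
RDF library). The property names conceptGloss, senseGloss and gloss, the
resource identifiers, and the helper domain_ok are all hypothetical and only
serve as illustration. One caveat worth keeping in mind: in RDFS, two separate
rdfs:domain statements on the same property are read conjunctively, so the
single-property option would need a union class as its domain rather than two
domain axioms.

```python
# Toy typing of resources; class names follow the discussion above,
# the resource identifiers are made up.
TYPES = {
    "synset_dog": "LexicalConcept",
    "sense_dog_1": "LexicalSense",
    "entry_dog": "LexicalEntry",
}

# Option A (hypothetical names): two properties, one per domain.
OPTION_A = {
    "conceptGloss": {"LexicalConcept"},
    "senseGloss": {"LexicalSense"},
}

# Option B (hypothetical name): one property whose domain is the union
# of the two classes.
OPTION_B = {
    "gloss": {"LexicalConcept", "LexicalSense"},
}


def domain_ok(schema, subject, prop):
    """Return True if `subject` has a type allowed as domain of `prop`."""
    return TYPES[subject] in schema.get(prop, set())


print(domain_ok(OPTION_A, "synset_dog", "conceptGloss"))  # True
print(domain_ok(OPTION_A, "sense_dog_1", "conceptGloss"))  # False
print(domain_ok(OPTION_B, "sense_dog_1", "gloss"))  # True
print(domain_ok(OPTION_B, "entry_dog", "gloss"))  # False
```

With option A the property name itself says what is being glossed; with option
B a consumer has to look at the type of the subject to find out.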

One could say that concepts could take rdfs:comment or skos:definition and
leave glosses for senses. Still, the point is: if we think that
lemon:LexicalConcepts make sense, then it should make sense for them to have
lemon:glosses as well (this is also why I started with that intro about
LexicalConcept, sorry for the length :) ). And, after all, don't WordNet
synsets have glosses?

Another question: would "both" make sense even if used together in the same
resource? I mean, if a resource is conceptualized (i.e. has
LexicalConcepts), what is the intended purpose of sense glosses? Should
they be allowed? If yes, maybe these should be something different, for
instance telling informally if there is any slight variation in
meaning/register/context in the use of one word with respect to another
when describing the same LexicalConcept. And if, on the contrary, we want to
constrain the thing as described above (either to concepts, if available, or
to senses, if there are only senses), how do we say that (more or less
formally)?
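
One "less formal" way to state the stricter reading is as a check over a
resource, sketched below in plain Python. This is only an illustration of the
rule "glosses go on concepts if the resource has any, otherwise on senses"; the
function name misplaced_glosses, the identifiers, and the toy gloss strings are
assumptions, not part of any model.

```python
def misplaced_glosses(types, glosses):
    """Return the subjects whose gloss breaks the rule
    'concepts if available, otherwise senses'.

    types:   dict mapping resource id -> class name
    glosses: dict mapping resource id -> gloss text
    """
    has_concepts = "LexicalConcept" in types.values()
    expected = "LexicalConcept" if has_concepts else "LexicalSense"
    return sorted(s for s in glosses if types.get(s) != expected)


# A WordNet-like toy resource: it has a concept, so a gloss on a sense
# is flagged.
wordnet_like = {"synset_dog": "LexicalConcept", "sense_dog_1": "LexicalSense"}
print(misplaced_glosses(
    wordnet_like,
    {"synset_dog": "toy gloss for the synset",
     "sense_dog_1": "toy gloss for the sense"},
))  # ['sense_dog_1']

# A dictionary-like toy resource with senses only: sense glosses pass.
dictionary_like = {"sense_dog_1": "LexicalSense"}
print(misplaced_glosses(
    dictionary_like,
    {"sense_dog_1": "toy gloss for the sense"},
))  # []
```

If sense glosses were instead allowed alongside concept glosses (for example to
note register or context), the check would simply accept both classes.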

Sorry, I raised more issues than solutions, but I was first of all interested
in knowing your opinion on the above; then we may try to assemble the final
proposal.

Also, as anticipated, this makes me think we are getting closer to those
categories we put in LIME (see specifically section 4.1, page 6, second
column, of:
http://art.uniroma2.it/publications/docs/2013_LDL_LIME%20Towards%20a%20Metadata%20Module%20for%20Ontolex.pdf ).

Obviously, I'm in favor. The original objection to properties that try to
cover every detail of (even heterogeneous) LRs is that ontolex is intended
more for representing the link between ontologies and lexicons, and not
necessarily all details of LRs, though... shouldn't we also deal with properly
(and extensively) describing LRs if nothing is already available at the
moment for them? BabelNet is just one of the many possible cases. Did we try
to map bilingual dictionaries? (mmm... point 3 below smells of that... )

As I already said, the question is simple: either we decide that we want to
give a unique model, and everything must be arranged wrt it, or this model
should be flexible enough to cover different structural choices, and in that
case, we cannot take any step back.

 

2) I propose to change the property contains (dom: Lexical Concept, range:
Lexical Sense) into a property called "lexicalizedBy" and the inverse
"lexicalizes". The reason is that working with the model to transform some
resources (e.g. TBX, see forthcoming email on this), I realized that
"contains" suggests a meronymic relation that need not be there in a strict
sense. It is sort of there in WordNet-style resources where the Synset is
regarded as a set that *contains* senses. However, this treatment seems to
be too specific for WordNet style resources. In general, what I think this
relation should say is that a certain LexicalConcept is lexically expressed
by a number of senses (in different languages). Therefore, I favour the
relation "lexicalizes". 

 

Very short here: I think we already went through this: "contains" was never
liked by any of us, but none of us was able to find anything better. While I
hope we will find something better, I don't find
lexicalizedBy appropriate. A sense (IMHO) does not lexicalize anything. A
sense (of a LexicalEntry) to me is still a (reified) pointer to a unit of
meaning (be it implicit, if not available, or explicit, if a lexical concept
is defined). 

 

3) I propose to redefine the translation relation so that it can hold also
between Lexical Entries instead of Lexical Senses. I realized that in many
cases, lexical resources abstract from the particular senses that are
translations of each other. This is the case for many bilingual
dictionaries. I propose thus to overload the translation relation so that
the following holds:


Same conclusions as for point 1), and as I said months ago, I'm totally in
favor of being able to cover different (lexical, if not generally
linguistic) resources.
Again, as for the glosses, we should decide how to implicitly/explicitly
constrain the thing: should it really be the same property with a broader
range (would it cause some confusion?), or two different ones?
 
Sorry again for the length of point 1; it's just that I had a sense of
déjà vu, and wanted to recollect (neutrally) the past pros and cons that were
already discussed on the matter.
 
Cheers,
 
Armando

Received on Friday, 30 May 2014 01:16:43 UTC