Re: lexicalization count from John P. McCrae on 2014-05-30 (public-ontolex@w3.org from May 2014)

From: John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>
Date: Fri, 30 May 2014 02:23:04 +0200
To: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <CAC5njqqMCS1Hn=a5U2VamsnkHF6uoo+A9-UwoAcaU5oG_Bur5Q@mail.gmail.com>
Hi,

The reference point for discussing these should be the final specification;
the model files should reflect the agreement of the CG.

With regard to the particular points:

1) For backwards compatibility we should stick with the Monnet Lemon name
of "definition". There are also a few other properties from lemon that we
should consider importing to ensure model compatibility.

The issue with BabelNet has to do with Monnet Lemon not having synsets
(hence the domain conflict with lemon:definition) and the fix is easy (add
LexicalConcept as a domain of definition).

2) I also dislike the name "contains", but there was significant discussion
here and "lexicalizes" was rejected previously. In WordNet-RDF we used the
property synset_member, so contains is possible, but I think maybe we need
to reopen this debate?

3) No. Seriously... We have discussed this so many times...

A translation between senses has a very different meaning from one between
entries: we cannot say that "bank" is always "Ufer" in German, but
we can say bank_en#2 is always ufer_de#1. We cannot "overload" two things
with different meanings! Furthermore, it is my opinion that we should help
people to model their resources well by not supporting poor modelling
decisions (like ambiguous translation links).
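
A minimal Python sketch of the difference, not part of the model itself. The sense identifiers other than bank_en#2 and ufer_de#1, the entry names, and the sample data are illustrative assumptions. A sense-level link gives exactly one answer, while the entry-level view derived from it is a set with several candidates:

```python
# Sense-level translation links: one source sense maps to one target sense.
# The data below is made up for illustration, not taken from a real lexicon.
SENSE_TRANSLATIONS = {
    "bank_en#1": "bank_de#1",  # financial institution
    "bank_en#2": "ufer_de#1",  # side of a river
}

# Which lexical entry each sense belongs to (hypothetical entry names).
SENSE_TO_ENTRY = {
    "bank_en#1": "bank_en",
    "bank_en#2": "bank_en",
    "bank_de#1": "Bank_de",
    "ufer_de#1": "Ufer_de",
}


def entry_translations(entry):
    """Collect every target entry reachable from any sense of `entry`."""
    targets = set()
    for source_sense, target_sense in SENSE_TRANSLATIONS.items():
        if SENSE_TO_ENTRY[source_sense] == entry:
            targets.add(SENSE_TO_ENTRY[target_sense])
    return targets


# Sense level: a single, unambiguous target.
print(SENSE_TRANSLATIONS["bank_en#2"])
# Entry level: several candidates, so the link alone does not say which applies.
print(sorted(entry_translations("bank_en")))
```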

Regards,
John





On Thu, May 29, 2014 at 10:37 PM, Philipp Cimiano <
cimiano@cit-ec.uni-bielefeld.de> wrote:

>  Dear John, all,
>
>  I wanted to propose a number of changes to the ontolex core and vartrans
> model and had already introduced them in the OWL files. But John was very
> quick to notice these changes and to point out that they are
> not in line with the current spec. I should have discussed
> these proposed changes on the list first, which I am doing now:
>
> 1) I propose to introduce a property ontolex:gloss as a subproperty of
> rdfs:comment to allow for adding definitions of senses. While one could
> certainly use rdfs:comment, people will be looking for such a property. The
> recent work by Roberto Navigli on transforming BabelNet to lemon shows that
> people look for such a property and, if it is not available, reinvent it
> themselves.
>
> 2) I propose to change the property contains (dom: Lexical Concept, range:
> Lexical Sense) into a property called "lexicalizedBy" with the inverse
> "lexicalizes". The reason is that, while working with the model to transform
> some resources (e.g. TBX, see forthcoming email on this), I realized that
> "contains" suggests a meronymic relation that need not be there in a strict
> sense. It is sort of there in WordNet-style resources, where the Synset is
> regarded as a set that *contains* senses. However, this treatment seems to
> be too specific to WordNet-style resources. In general, what I think this
> relation should say is that a certain LexicalConcept is lexically expressed
> by a number of senses (in different languages). Therefore, I favour the
> relation "lexicalizes".
>
> 3) I propose to redefine the translation relation so that it can hold also
> between Lexical Entries instead of Lexical Senses. I realized that in many
> cases, lexical resources abstract from the particular senses that are
> translations of each other. This is the case for many bilingual
> dictionaries. I propose thus to overload the translation relation so that
> the following holds:
>
> variantSource o trans o variantTarget -> translation
>
> sense o translation o sense^-1 -> translation
>
> where Translation \equiv exists trans.Self
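
Read as plain relational composition, the second chain (sense o translation o sense^-1) can be sketched in Python as follows. This is only an illustration under stated assumptions: relations are modelled as sets of pairs, the helper names `compose` and `inverse` and the sample data are hypothetical, and no OWL reasoner is involved.

```python
def compose(r, s):
    """Relational composition: all (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def inverse(r):
    """Swap the direction of every pair in the relation."""
    return {(b, a) for (a, b) in r}


# Hypothetical sample data: entry -> sense links and one sense-level translation.
sense = {("cat_en", "cat_en#1"), ("Katze_de", "katze_de#1")}
translation = {("cat_en#1", "katze_de#1")}

# sense o translation o sense^-1 yields a translation link between entries.
entry_translation = compose(compose(sense, translation), inverse(sense))
print(entry_translation)
```

The derived pairs relate entries rather than senses, which is the overloading the proposal describes.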
>
> Let me know your comments,
>
> Philipp.
>
> On 28.05.14 18:06, Armando Stellato wrote:
>
>  Dear Philipp,
>
> thanks very much for your summary email.
>
> I will reply to it in more detail asap; in the meantime, a short note
> about the “numberOfXXX” properties.
>
> I would go for names that are consistent with the similar VoID properties
> (void:entities, void:triples), and thus have something like:
>
> lime:lexicalEntries
> lime:lexicalizations
> lime:senses
> lime:references
>
> (modulo ratios obviously :DDD ).
>
> Cheers,
>
> Armando
>
> *From:* Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
> *Sent:* Wednesday, May 28, 2014 3:06 PM
> *To:* public-ontolex@w3.org
> *Subject:* Re: lexicalization count
>
>
>
> Armando, all,
>
>  yes that would be ok from my point of view.
>
> // counting properties (datatype properties, with domain ontolex:Lexicon
> OR ontolex:Lexicalization OR void:Dataset OR lime:LanguageCoverage)
>
> lime:numberOfLexicalEntries
> lime:numberOfSenses
> lime:numberOfLexicalizations (denotes triples)
> lime:numberOfReferences -> the number of distinct references used
>
> We then need to discuss whether we should also include ratios etc.
>
>
> Then:
>
> lime:language (unified with ontolex:language, extended here to domain
> lime:LanguageCoverage)
>
> lime:linguisticModel: describing by which model/vocabulary information
> about lexicalization is attached; the domain is void:Dataset and the range
> is the URI of the vocabulary; lime:linguisticModel is thus a subproperty of
> void:vocabulary
>
> Note that several linguisticModels can co-exist in principle in a
> dataset...
>
> lime:type: providing a type for the resource in question, e.g. bilingual
> lexicon, lexicon, ..., domain is void:Dataset and range is not specified
>
> lime:languageCoverage with domain void:Dataset and range
> lime:LanguageCoverage.
>
> lime:LanguageCoverage has a language, a linguistic Model and all the
> counting properties above are defined for it.
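
The shape of this proposed metadata can be pictured with a small Python sketch. It is only an illustration: the class and field names are hypothetical stand-ins for the lime properties named in the comments, and the URIs and numbers are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LanguageCoverage:
    """Per-language statistics, standing in for lime:LanguageCoverage."""
    language: str                        # lime:language
    linguistic_model: str                # lime:linguisticModel (vocabulary URI)
    number_of_lexical_entries: int = 0   # lime:numberOfLexicalEntries
    number_of_senses: int = 0            # lime:numberOfSenses
    number_of_lexicalizations: int = 0   # lime:numberOfLexicalizations
    number_of_references: int = 0        # lime:numberOfReferences


@dataclass
class Dataset:
    """Stand-in for a void:Dataset carrying lime metadata."""
    uri: str
    type: str                            # lime:type, range left unspecified
    language_coverage: List[LanguageCoverage] = field(default_factory=list)


# Made-up example: one dataset with coverage for two languages.
dataset = Dataset(uri="http://example.org/lexicon", type="bilingual lexicon")
dataset.language_coverage.append(
    LanguageCoverage("en", "http://example.org/model", 120, 150, 140, 90)
)
dataset.language_coverage.append(
    LanguageCoverage("de", "http://example.org/model", 100, 110, 105, 80)
)

total_entries = sum(c.number_of_lexical_entries for c in dataset.language_coverage)
print(total_entries)
```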
>
> If this is a base model we can agree upon then I will update the wiki
> description and the ontology.
>
> Let me know your comments on this.
>
> Regards,
>
> Philipp.
>
> On 23.05.14 13:49, Armando Stellato wrote:
>
> Hi all,
>
>
>
> Just copied and pasted from our Ontolex-Lime proposal, an open discussion
> about the lexicalizations count (which is not about whether they are ratios
> or integers :P ).
>
>
> 6. Lexicalization core triples: senses or what?
>
> Senses act as reifications of the relationships between LexicalEntries and
> Conceptual Entities (be they LexicalConcepts or entities of the lexicalized
> ontology). In effect, a single sense is always 1-1 (it links a single
> Lexical Entry with a single Conceptual Entity).
>
> The ontolex model has a shortcut for the relationship (mediated by senses)
> between LexicalEntries and LexicalConcepts: ontolex:denotes.
>
> We would propose to formally consider the number of “denotes triples”
> (triples with predicate == ontolex:denotes) to obtain the count. Obviously,
> this information may not always be available (neither explicit nor inferred),
> though the details of how to obtain it are just technicalities.
>
> [added wrt the proposal] So, in short, we propose to formally
> count “lexicalizations” as the number of ontoresource <--> lexicalEntry
> links, and not as the number of (linked) senses.
>
> To support our claim, please note the following case:
>
> 1.      a lexicon exists (independently of an ontology), with sense
> descriptions for its lexical entries, and with one lexical entry having
> two very close senses (two smooth variations of a broad meaning)
>
> 2.      the lexicon is used to lexicalize an ontology
>
> 3.      the authors of the Lexicalization decide to collapse the two
> senses into the same ontology concept
>
> 4.      the two triples connecting the two similar senses to the same
> ontology concept entail the same ontolex:denotes triple
>
> 5.      for the purpose of counting the lexicalizations of that lexical
> concept, the single triple count on ontolex:denotes is more appropriate
> than counting the two senses of the same LexicalEntry linked to the same
> concept.
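
The five steps above can be sketched in a few lines of Python. The identifiers and data are hypothetical; each tuple stands for one sense linking a lexical entry to an ontology concept, and the set of distinct (entry, concept) pairs plays the role of the entailed ontolex:denotes triples.

```python
# Each sense links exactly one lexical entry to one concept (made-up data).
senses = [
    ("sense1", "entry_a", "concept_x"),
    ("sense2", "entry_a", "concept_x"),  # close variant collapsed onto the same concept
    ("sense3", "entry_b", "concept_x"),
]

# Two senses of the same entry pointing at the same concept entail
# a single (entry, concept) pair, i.e. one denotes triple.
denotes_triples = {(entry, concept) for _, entry, concept in senses}

sense_count = len(senses)
lexicalization_count = len(denotes_triples)
print(sense_count, lexicalization_count)
```

Counting senses reports three links for concept_x, while counting denotes pairs reports two, one per lexical entry.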
>
> Would that be ok?
>
> Cheers,
>
> Armando
>
>  --
> Prof. Dr. Philipp Cimiano
>
> Phone: +49 521 106 12249
> Fax: +49 521 106 12412
> Mail: cimiano@cit-ec.uni-bielefeld.de
>
> Forschungsbau Intelligente Systeme (FBIIS)
> Raum 2.307
> Universität Bielefeld
> Inspiration 1
> 33619 Bielefeld
>
>
>
Received on Friday, 30 May 2014 00:23:34 UTC