Re: Issues about the semantics of the ontology-lexicon interface [was: Re: Why not to shortcut the "sense" object] from Guido Vetere on 2012-10-15 (public-ontolex@w3.org from October 2012)

From: Guido Vetere <gvetere@it.ibm.com>
Date: Mon, 15 Oct 2012 14:37:07 +0200
To: Aldo Gangemi <aldo.gangemi@cnr.it>, Armando Stellato <stellato@info.uniroma2.it>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <OFF54FE15A.5FAC8A4D-ONC1257A97.0052CDAA-C1257A98.004553F7@it.ibm.com>
Aldo, Armando,

A couple of things about what you said (on the rest, I generally agree).

As for the name of the arrow (property?) linking senses and concepts, Aldo 
is right, maybe 'characterize' is not appropriate in this context (indeed, 
the notion comes from mathematics) and is not likely to be accepted by the 
community. But 'representedBy', if read from left to right (a sense is 
represented by a concept), could be even worse, since, in the mainstream 
of western semiotics, signs represent things and stand for them (aliquid 
pro aliquo), and not the other way around. Maybe we could adopt the 
classic (e.g. Ogden-Richards) 'refers to', even if the binding with the
'referential function' may be inappropriate. It looks like a trivial 
naming detail, but it may have an impact on the way people grasp the 
intended meaning of the model.

This leads to the more basic question about the logical nature of this
relation, i.e. of what kind of logical things fill the pattern: Lexical 
unit --meaning--> Sense --refers to--> Ontological concept. If we give 
this graph a DL interpretation, as I tried to do, nodes could be first 
order unary predicates and arrows (restricted) first order binary 
predicates. In this reading, instances of Sense (e.g. cat#1) would be 
related to instances of Concepts (e.g. my cat). Aldo suggests that this 
model would be in conflict with the intuition that cat#1 may in many cases 
refer to cats in general, i.e. the whole class of cats. However, 'class vs 
instance' ('intensional' vs 'extensional', if you wish) is part of the
systematic polysemy for many senses, if not for senses in general. 
Dictionary developers might want to use the same sense of 'cat' both for 
'the cat is on the mat' and 'the cat is a feline'. Now, it is true that an 
axiom of the form cat#1 TYPE (Sense AND refersTo ONLY Cat) would not 
capture the intensional reading of the sense, but, conversely, setting 
'refers to' to range on class names, as Aldo suggests, would not capture 
the extensional one. 
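
To make the tension concrete, here is a minimal Python sketch of the two
readings. Plain sets and tuples stand in for an ontology; the individuals
(my_cat, your_cat) and the helper satisfies_only are illustrative
assumptions, not part of any proposed model.

```python
# Toy illustration of the two candidate readings of 'refers to'.
# All identifiers are hypothetical; triples are plain Python tuples.

# Extensional (DL) reading: Cat is a unary predicate, i.e. a set of
# individuals, and refersTo relates the sense to instances of Cat.
Cat = {"my_cat", "your_cat"}
refers_to_ext = {("cat#1", "my_cat"), ("cat#1", "your_cat")}

# Intensional reading: refersTo points at the class name itself.
refers_to_int = {("cat#1", "Cat")}


def satisfies_only(sense, relation, cls):
    """True if every value of `relation` for `sense` is a member of `cls`,
    i.e. the 'refersTo ONLY Cat' restriction holds for `sense`."""
    return all(o in cls for (s, o) in relation if s == sense)


# The restriction holds when the values are individual cats ...
ext_ok = satisfies_only("cat#1", refers_to_ext, Cat)
# ... but not when the value is the name "Cat", which is not itself a cat.
int_ok = satisfies_only("cat#1", refers_to_int, Cat)
print(ext_ok, int_ok)
```

Neither data shape covers both 'the cat is on the mat' and 'the cat is a
feline', which is the point made above.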

In general, using class names as values for the property in question, e.g. 
by using OWL 2 punning, raises the question of providing the property with 
some extra formal semantics, since punning, as you know, is just a 
syntactic trick. As Aldo says, problems like this have been tackled by 
other specifications already, such as SKOS.  However, we here face the 
problem of dealing with any legacy ontology, which relies on standard
set-theoretic semantics, instead of 'ad hoc' conceptual frameworks. Thus, 
we should come up with a model that preserves both the intended formal 
meaning of standard ontologies and the complexity of linguistic 
signification, which is not an easy task, and cannot be pursued just by 
naming conventions. 

In my opinion, much depends on what 'Sense' represents in our basic
pattern. If I understand well, this concept is currently associated with
either definitions in dictionaries or synsets in wordnets, thus being a
mostly lexicographic notion. A different ontology could model Sense as a
class of socially constructed abstractions evoked in linguistic acts,
independent of dictionaries and wordnets. In the former case, Sense could
be a leaf class, and what we link through arrows are instances. In the
latter case, I
think that 'Sense' should rather be the root of a class hierarchy, and 
what we link to lexemes should be Sense's subclasses, whose instances, in 
turn, represent meanings in their textual occurrences.  By the way, Senso 
Comune embraces an ontology like this.  So a good question to start with 
would be: what do we mean when we say 'Sense'?

Cheers,

Guido Vetere
Manager, Center for Advanced Studies IBM Italia
_________________________________________________
Rome                                     Trento
Via Sciangai 53                       Via Sommarive 18
00144 Roma, Italy                   38123 Povo in Trento, Italy
+39 (0)6 59662137                 +39 (0)461 312312

Mobile: +39 3357454658
_________________________________________________



Aldo Gangemi <aldo.gangemi@cnr.it> 
13/10/2012 14:40

To
public-ontolex <public-ontolex@w3.org>
cc
Aldo Gangemi <aldo.gangemi@cnr.it>, John McCrae 
<jmccrae@cit-ec.uni-bielefeld.de>, Armando Stellato 
<stellato@info.uniroma2.it>, Guido Vetere/Italy/IBM@IBMIT, Philipp Cimiano 
<cimiano@cit-ec.uni-bielefeld.de>
Subject
Issues about the semantics of the ontology-lexicon interface [was: Re: Why 
not to shortcut the "sense" object]






Hi all, I lagged behind in the last month, because of my recent
relocation to Paris. Yesterday I was traveling back from Galway (EKAW)
and couldn't attend, apologies for that.
I have followed the recent discussion, and here is my contribution. I have
renamed the thread, because it now spans different topics related to the
semantics of the O-L interface.

---Senses---
Concerning Philipp's summary, firstly I agree with the decision (not yet
approved, it seems) of creating the intermediate Sense class: it's
obviously needed, either for making room for lexical senses (definitely to 
be distinguished from ontology entities), or to be able to talk about 
senses (reifications of the meaning function).
Concerning the name, I vote for "sense", because sememes, acceptations, 
and others, are either very technical for the layman, or even wrong, as 
Philipp reminds us about the original notion of sememe. The only real 
alternative would be "meaning", but I'd rather keep that term for the 
top-level class of a meaning taxonomy, as I suggest in the following.

In a previous mail, I proposed to consider also an additional solution, 
i.e. to create a taxonomy of meanings, which has ontology entities (as 
formal semantic objects) and lexical senses as special subclasses. The two 
solutions are compatible, and if we realize that a meaning taxonomy might 
be useful, it can be introduced anyway. 
Think of the sense-synset issue raised by Philipp: I agree that synsets 
are not lexical senses, if we assume that a lexical sense should be 
expressed by only one lexical unit (cardinality exactly 1), but still they 
are senses, and it's completely reasonable to put synsets (as well as many 
other creatures of lexical semantics, including sememes, acceptations, 
frames, semantic verb classes, etc.) in a meaning taxonomy.

Concerning the property names, I'm ok with both LexicalEntry --meaning-->
Sense, and with Sense --representedBy--> OntologyEntity.
Maybe we could get rid of multiple related uses of the "mean" notion,
which can be somewhat disturbing: Meaning as a class, meaning as a property
between lexical entries and senses, means as a property between lexical
entries and ontology entities; it may look like we are playing with
words. What about following the conventional naming pattern that employs
the name of the property range? E.g. LexicalEntry --sense--> Sense ;
LexicalEntry --meansOntologyEntity--> OntologyEntity. The advantage of
using this apparently redundant naming is that at the instance level, the
triples become very clear, e.g. Saxophone --sense--> wordsense-saxophone-1
; Saxophone --hasOntologyEntity--> music:Saxophone.
I also prefer "representedBy" to "characterizes", because the second is 
very generic and not attested in any related literature. 

---Property chaining over senses---
Secondly, I agree with the decision to add a property chain in the model, 
which helps resolve the indirection produced by the Sense class: this is
a good practice (a logical design pattern), used in many contexts. I do 
not see room for John's criticism about it: it does not increase the 
cognitive complexity (on the contrary, it facilitates the use of the model 
for those reluctant to catch on the sense-ontology-entity distinction), 
and the added computational complexity only holds when a DL reasoner 
materializes the ABox.
One mild problem here might be that we are making slightly different 
assumptions when we name "representedBy" the property between senses and 
ontology entities, but "means" the property between lexical entries and 
ontology entities. Since we do not have a rich axiomatization behind these 
names, we might be pragmatic and ignore the problem; however, I deem it
important to justify it a little bit in the documentation. In practice,
this approach seems to suggest that senses are actually "represented" by 
ontology entities, and this is clear and intuitive. It also suggests that 
lexical entries actually "mean" ontology entities, but this is far less 
clear and intuitive, since in no obvious way do words mean stuff in
ontologies; it's much better to say that words have conceptualizations
that are represented in ontologies. Indeed this is the way we talk of
lexical senses :). That's why my above suggestion was "hasOntologyEntity",
which however I admit to be too generic. In principle, the compositional
name that best fits the property chain would be
"hasSenseRepresentedByOntologyEntity", but it's way too long, especially
for those willing to use that property as a shortcut. Other suggestions?
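
What materializing the chain amounts to can be sketched in a few lines of
Python. This is only an illustration over plain tuples, not a DL reasoner:
the function materialize_chain is a hypothetical helper, and the instance
data reuses the saxophone example from above with "hasOntologyEntity" as
the (still undecided) shortcut name. In OWL 2 the same entailment would be
declared with owl:propertyChainAxiom.

```python
# Hypothetical instance data; property names follow the thread.
triples = {
    ("Saxophone", "sense", "wordsense-saxophone-1"),
    ("wordsense-saxophone-1", "representedBy", "music:Saxophone"),
}


def materialize_chain(triples, first, second, shortcut):
    """Return the shortcut triples entailed by the chain first o second."""
    # Index the values of the second property by subject.
    by_subject = {}
    for s, p, o in triples:
        if p == second:
            by_subject.setdefault(s, set()).add(o)
    # Join: entry --first--> mid --second--> target.
    return {
        (s, shortcut, target)
        for s, p, mid in triples
        if p == first
        for target in by_subject.get(mid, ())
    }


inferred = materialize_chain(
    triples, "sense", "representedBy", "hasOntologyEntity"
)
print(inferred)
```

A consumer who only reads the shortcut property sees the inferred triple;
the sense node remains available for those who need it.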

---GCIs on ontology hierarchies---
Finally, a comment about Guido's observation that "cat#1 INSTANCEOF (Sense 
AND characterizes ONLY Animal)" is the right formalization for an example 
of the representedBy object property values. If I understand well, here we 
have two important issues. The first one can be solved by using OWL2, the 
second poses a more difficult challenge.
For the first issue, I think that Guido talks about OWL1, but anyway that 
axiom would give us a misinterpretation, because it would tell us that 
cat#1 is a sense that can only be represented by *individuals* from the 
class Animal, which is not what Guido wants I guess. This problem was 
described in detail by W3C SWBPD committee in 2004, and eventually some 
OWL1 solutions were recommended in the "Classes as values" design pattern. 
However, in OWL2 (lucky us) punning makes our lives easier, and a simple 
(partial!) solution is (in Manchester syntax) "cat#1 TYPES (Sense AND 
representedBy VALUE Animal)".
For the second issue, Guido points out that there are cases in which we 
need to refer to generic subclasses of an ontology entity (if it's a 
class): this cannot be expressed in OWL at all, since we cannot use the
OWL vocabulary in the position for the domain vocabulary. In other words,
the following is a wrong axiom even in OWL2: "cat#1 TYPES (Sense AND
representedBy (subClassOf VALUE Animal))".
A viable design pattern is to create a property for meaning hierarchies,
in the vein of skos:broader or wordnet:hypernym, so that we could declare
e.g.: "cat#1 TYPES (Sense AND representedBy ([skos:broader] VALUE
Animal))".
However, a property like skos:broader typically applies to concepts, and 
senses would probably be compatible. Much less are ontology entities 
compatible, even though SKOS seems to suggest a loose correspondence 
between concepts and rdfs/owl classes. In particular, we should 
materialize ontology class hierarchies as skos:broader hierarchies in 
order to reason over these constructs. 
Another design pattern might resort to a specialized property, such as 
"broadlyRepresentedBy", e.g.: "cat#1 TYPES (Sense AND broadlyRepresentedBy 
VALUE Animal)". "broadlyRepresentedBy" can be a super property of 
representedBy. Of course, with this second pattern, we would lose the 
sophisticated DL reasoning that one can get with the first. Nonetheless, 
the second seems more practical and simple to apply for different levels 
of expertise.
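
One possible reading of the second pattern, sketched in Python under
loudly stated assumptions: "broadlyRepresentedBy" is taken to hold between
a sense and every superclass of the class that represents it (including
that class itself, which is consistent with it being a super property of
representedBy). The toy hierarchy, the single-parent simplification and
the helper ancestors_or_self are illustrative only.

```python
# Hypothetical class hierarchy (child -> parent, single inheritance only)
# and one representedBy assertion.
sub_class_of = {"Cat": "Feline", "Feline": "Animal"}
represented_by = {("cat#1", "Cat")}


def ancestors_or_self(cls, hierarchy):
    """Yield cls and every superclass reachable through the hierarchy."""
    seen = set()
    while cls is not None and cls not in seen:
        seen.add(cls)
        yield cls
        cls = hierarchy.get(cls)


# Assumed semantics: propagate representedBy up the class hierarchy.
broadly_represented_by = {
    (sense, ancestor)
    for sense, cls in represented_by
    for ancestor in ancestors_or_self(cls, sub_class_of)
}
print(sorted(broadly_represented_by))
```

This is the kind of materialization a plain triple store could do without
a DL reasoner, at the price of the weaker semantics noted above.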

Ciao
Aldo

_____________________________________

Aldo Gangemi
Senior Researcher
Semantic Technology Lab (STLab)
Institute for Cognitive Science and Technology,
National Research Council (ISTC-CNR) 
Via Nomentana 56, 00161, Roma, Italy 
Tel: +390644161535
Fax: +390644161513
aldo.gangemi@cnr.it
http://www.stlab.istc.cnr.it
http://www.istc.cnr.it/people/aldo-gangemi
skype aldogangemi
okkam ID: http://www.okkam.org/entity/ok200707031186131660596

On Oct 12, 2012, at 6:55 PM, John McCrae <jmccrae@cit-ec.uni-bielefeld.de> 
wrote:

On Fri, Oct 12, 2012 at 6:35 PM, Armando Stellato <
stellato@info.uniroma2.it> wrote:
From what I got, and hope not to be wrong (it's useful also for me to
clarify as I missed a couple of calls in September), OntologyEntity is a
generic rdf:Resource of one of the main entities in the main vocabularies
(aka: OWL and SKOS, thus: property, class, individual, skos concept...).
Another question to John from my side: from your email you seemed to be
against stating the propertyChain axiom on (means,
<meaning,representedBy>) implying the direct Entry ---means-->
OntologyEntity from "Lexical Entry -> meaning -> Sense -> representedBy ->
OntologyEntity", but then the sentence: "Here the difference is 1 named
element vs. 3 named elements, but as stated above, at least half of users
(data consumers) will have to understand all 4 names..." instilled some
doubt in my interpretation...
 
Are you voting against the larger structure as a whole (thus keeping only 
the Entry ---means--> OntologyEntity structure), or against the 
propertyChain axiom? I really got the second, though I'm not even sure how
adding the p.chain axiom (or not doing it) would change anything for the
user or consumer. I'm sure I'm missing something, so sorry in advance for
my potential misinterpretation.
Sorry it isn't clear: the long chain is TBMK agreed upon (Lexical Entry -> 
meaning -> Sense -> representedBy -> OntologyEntity)*... we are 
questioning whether we need the short chain (Entry ---means--> 
OntologyEntity) as well. I say it is not worth it.

Regards,
John

* or (Word -> sense -> Sememe/Acceptation -> characterizes -> 
rdf:Resource/skos:Concept/owl:Entity) or some combination of these terms.

 
Have a nice weekend!
 
Armando
 
 
From: Guido Vetere [mailto:gvetere@it.ibm.com] 
Sent: Friday, October 12, 2012 6:08 PM
To: public-ontolex
Subject: Re: Why not to shortcut the "sense" object
 
All, 

I apologize for missing the call today. Here are just some short remarks.

"Entry ---means--> OntologyEntity" means that if you want to predicate on
the meaning relationship (e.g. to associate some grammatical constraint)
you have to resort to meta-predicates (e.g. OWL Annotations).

"Lexical Entry -> meaning -> Sense -> representedBy -> OntologyEntity" 
sounds good, but instead of 'representedBy' I would say 'characterizes' or
something similar, meaning that a linguistic sense gives a (cultural)
shape to an entity. Moreover, it is not clear to me (maybe you discussed
that) whether OntologyEntity is a first order TOP concept (e.g. equivalent 
to OWL Thing). In this case, note that in order to tell that the instance 
of Sense 'cat#1' (i.e. the first sense of the lemma 'cat') represents an 
Animal, you have to write something like: 

cat#1 INSTANCEOF (Sense AND characterizes ONLY Animal). 

Is it correct? 

If there is something that I can do, please let me know. 

Regards, 

Guido Vetere
Manager, Center for Advanced Studies IBM Italia
_________________________________________________
Rome                                     Trento
Via Sciangai 53                       Via Sommarive 18
00144 Roma, Italy                   38123 Povo in Trento, Italy
+39 (0)6 59662137                 +39 (0)461 312312

Mobile: +39 3357454658
_________________________________________________ 


John McCrae <jmccrae@cit-ec.uni-bielefeld.de> 
Sent by: johnmccrae@gmail.com 
12/10/2012 16:35 


To
public-ontolex <public-ontolex@w3.org> 
cc

Subject
Why not to shortcut the "sense" object
 








Hi all, 

As discussed today in the telco, there is a proposal to introduce a
shortcut replacing "Entry ---sense--> Sense ---representedBy-->
OntologyEntity" with "Entry ---means--> OntologyEntity". While this
sounds good in theory, I contend that in practice it is not worth the
effort. (This is based on practical experience with the lemon model.)

It does not make the model easier to use: It is clear that for data
producers this proposal simplifies the matter (as fewer links and URIs are
required), however for data consumers it complicates the model (as they
need to understand both methods of linking and be able to infer
equivalence between the two methods). Thus, if EaseOfUse = (% of
Consumers) × EaseOfUse(Consumer) + (% of Producers) × EaseOfUse(Producer),
and if we assume there will be approx. as many producers as consumers,
then we need only ask "is the extra effort for the producer less than
that for the consumer?", i.e., "would you rather implement a system that
infers similarity across multiple representations, or use extra links and
URIs?"

It does not make the model easier to understand: While I understand that
the sense object is nebulous and difficult per se to understand, I would
still argue that the clearest measure of how easy a model is to
understand is the number of named elements it has (as many users may not
need to deeply understand the meaning of a sense, but be happy to know
that "translation", "antonymy" and "register" go there). Here the
difference is 1 named element vs. 3 named elements, but as stated above,
at least half of users (data consumers) will have to understand all 4
names... If we assume that 70% of the producers do not need to represent
senses (and thus any associated properties, "translation", "antonymy",
"register"), then the average number of links a user will need to
understand is 4 × 0.5 + 3 × 0.5 × 0.3 + 1 × 0.5 × 0.7 = 2.8... so it makes
the model all of 7% easier to understand! Worse, this figure is
overgenerous, as I expect there to be more data consumers than producers
and I expect at least 50% of users to require sense modelling.
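
The arithmetic above can be reproduced with a few lines of Python. The
variable names are made up for illustration, and the baseline of 3 named
elements (the long chain without the shortcut) is an assumption about what
the 7% figure is measured against.

```python
# Weighted average of named elements a user must understand.
share_consumers = 0.5
share_producers = 0.5
producers_without_senses = 0.7  # producers who would use only the shortcut

names_consumer = 4              # consumers must know both representations
names_producer_with_senses = 3  # long chain
names_producer_shortcut = 1     # shortcut only

average = (
    names_consumer * share_consumers
    + names_producer_with_senses
    * share_producers
    * (1 - producers_without_senses)
    + names_producer_shortcut * share_producers * producers_without_senses
)

# Assumed baseline: everyone learns the 3 named elements of the long chain.
baseline = 3
saving = (baseline - average) / baseline
print(round(average, 2), round(saving * 100, 1))
```

Raising share_consumers, or lowering producers_without_senses towards 0.5,
pushes the average above the baseline of 3, which is the sense in which the
figure is overgenerous.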
Regards, 
John 

IBM Italia S.p.A.
Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) 
Cap. Soc. euro 347.256.998,80
C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153
Società con unico azionista
Società soggetta all'attività di direzione e coordinamento di
International Business Machines Corporation

(Salvo che sia diversamente indicato sopra / Unless stated otherwise 
above)



Received on Monday, 15 October 2012 12:37:53 UTC