Re: synsem module ready from Fahad Khan on 2015-05-22 (public-ontolex@w3.org from May 2015)

From: Fahad Khan <anasfkhan81@gmail.com>
Date: Fri, 22 May 2015 12:49:06 +0200
To: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Cc: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>, Manuel Fiorelli <manuel.fiorelli@gmail.com>, "public-ontolex@w3.org" <public-ontolex@w3.org>
Message-ID: <CAK+N+9g78ELTA52_5T8MhX2PKVkyyzO5osTD1V-t9qyJbbEGMA@mail.gmail.com>
Hi everyone,

Sorry for the delay in my response; I couldn't quite get round to it last
night. Please find my replies below.

I look forward to the telco this afternoon.

Cheers,

Fahad

In response to Philipp:
Re: pragmatism and Model C: I would argue that adding the distinction
between two types of arguments isn't so drastic (especially as I think it's
one that will be added by many of the users of your model); but since it's
my suggestion maybe I'm also quite biased :)

I also feel a little that we're at the phase of the model specification
where it's a case of "speak now or forever hold your peace"; and I wanted
to take the opportunity to articulate some of the (long standing)
objections that Francesca and I have to the model as it stands.  I guess
the objection now counts as having been registered and given the late stage
of the proposal there's no need this afternoon to go into involved
discussions on the viability of Model C (although I would be happy to do so
if asked and I do that a little in the response to John below :D).

*'But this is a problem of the ontologies we have currently around, not of
ontolex itself. You are right that very often the concept we use cannot
really be said to be "denoted" by the lexical entry in the strict sense. '*

I agree that this is a problem of currently existing ontologies, but it's
one that we're perhaps adding to by using the theoretically loaded term
'reference' in these cases. Maybe there's another term we can use that's
much less suggestive, perhaps even "ExternalReference" like in LMF. But
again as Philipp says it's too late to discuss that now at this stage in
the proceedings. On the other hand, I feel that the examples that I
criticised in my previous email might actually require some extra
supplemental user-defined modules on top of Ontolex to do justice to (e.g.,
with more complex ways of tying lexical entries to real world entities).

In response to John:

Model C works well under the assumptions that:
a) the Ontology will contain the semantic information that we need to
capture the meaning of say a verb, and neither more nor less, e.g., that
the ontology won't contain information about event structure that isn't
usually part of a verb's semantic frame or that in most cases we won't need
to systematically add information that most ontologies don't contain;
b) that there isn't language-specific, or specifically linguistically
motivated, semantic information with which we want to annotate a so-called
semantic argument but not a syntactic one;
c) that there aren't specific properties, like the order of syntactic
arguments (or lack thereof), the optionality of syntactic arguments, or
even the distinction between argument and adjunct, that are syntactic and
not semantic.

If these assumptions don't hold, or are questionable under different and
widespread theoretical assumptions, then the efficiency of model C comes at
the cost of rendering it difficult and even artificial to add important
semantic information.  I will briefly explain why I think they don't hold:
a) Take thematic roles. These are generalisations over the kinds of roles
that the arguments of verbs take; the representation of event structure in
an ontology may very well not make these same theoretically inspired
generalisations about the participants in various events. At the same time,
in the majority of ontologies, like DBpedia, there is usually a paucity of
information about the events to which verbs refer. In these cases this
information will need to be added to the lexicon, which means you will
inevitably end up sullying the purity of Semantics by Reference in order to
describe quite basic lexical semantic data. Finally, adding too much
information pertaining to natural language semantics to an ontology would
seem to go against the push towards language-independent criteria for
structuring ontologies, as described in, e.g., the OntoClean guidelines and
argued for by numerous ontologists;
b) Thematic roles again... these are generally regarded as pertaining to the
semantic layer of a lexicon rather than the syntactic layer (and the
relation between this layer and representations of world knowledge is
complicated, which is another reason for questioning a), as I said above).
So it feels odd to add this information to a generic argument object (I'm
not suggesting adding thematic roles to the model; I'm just saying that if
someone did want to add them later it would be easier if the two types of
argument were distinguished) and, as I said above, it's not clear we would
want to insert this in the ontology either;
c) In English there exist two syntactic frames for "to give": one in which
the third argument in terms of order is the indirect object, and another in
which the fourth argument in terms of word position is the indirect object
of the verb and is marked by the preposition "to". In other languages, like
Latin, word order doesn't matter and I can't make such word position
assumptions about where the dative argument required by the verb "dare"
goes. For English, with its impoverished morphology, word order matters a
lot: having indirectObject_to would mean that we lose this information.
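
To make point c) concrete, here is a minimal Python sketch of what keeping the two kinds of argument apart could look like. It is only an illustration under stated assumptions: none of the class or field names below are part of ontolex, the role names giver, given and receiver are borrowed from Philipp's reply, and word positions are counted with the verb in second position.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SemanticArgument:
    """A participant in the event, independent of surface realisation."""
    name: str  # hypothetical label, e.g. "giver"


@dataclass(frozen=True)
class SyntacticArgument:
    """A surface slot carrying purely syntactic properties."""
    function: str                 # e.g. "subject", "indirectObject"
    position: Optional[int]       # word position, or None if order is free
    marker: Optional[str] = None  # e.g. the preposition "to" or a case


@dataclass
class SyntacticFrame:
    label: str
    # Each syntactic argument is linked to the semantic argument it realises.
    mapping: Dict[SyntacticArgument, SemanticArgument]

    def semantic_arguments(self):
        """Names of the semantic arguments realised by this frame."""
        return frozenset(sem.name for sem in self.mapping.values())

    def position_of(self, function):
        """Word position of the argument with the given function."""
        for syn in self.mapping:
            if syn.function == function:
                return syn.position
        raise KeyError(function)


GIVER = SemanticArgument("giver")
GIVEN = SemanticArgument("given")
RECEIVER = SemanticArgument("receiver")

# "Alice gave Bob the gift"
double_object = SyntacticFrame("give (double object)", {
    SyntacticArgument("subject", 1): GIVER,
    SyntacticArgument("indirectObject", 3): RECEIVER,
    SyntacticArgument("directObject", 4): GIVEN,
})

# "Alice gave the gift to Bob"
prepositional = SyntacticFrame("give (prepositional)", {
    SyntacticArgument("subject", 1): GIVER,
    SyntacticArgument("directObject", 3): GIVEN,
    SyntacticArgument("indirectObject", 4, marker="to"): RECEIVER,
})

# Latin "dare": case marking instead of a fixed word position
latin_dare = SyntacticFrame("dare", {
    SyntacticArgument("subject", None, marker="nominative"): GIVER,
    SyntacticArgument("directObject", None, marker="accusative"): GIVEN,
    SyntacticArgument("indirectObject", None, marker="dative"): RECEIVER,
})
```

In this sketch all three frames share the same semantic arguments, while position and marker live only on the syntactic side, so the word order information is kept for English and simply left unset for Latin.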

On 21 May 2015 at 23:37, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
wrote:

>  Dear Fahad, all,
>
>  John has answered your email at length and argued why model "C" is a
> reasonable model. I am going to be pragmatic here: for three years we have
> been building the whole of ontolex around model "C" and at this stage we
> cannot change it anymore. My goal was to finalize discussions at this
> stage, not to rethink core aspects of the model that would lead to drastic
> changes. There have been many chances to discuss such things, and they
> have been discussed.
>
> As John says, according to ontolex, the only meaning that ontolex
> considers is modelled in the ontology as classes, properties etc. Classical
> semantic arguments as known from semantic frames do not exist for ontolex
> unless they are modelled in some ontology. If they are, then there is no
> problem. We can use Semantic Frames and use these properties as references.
>
> I go into your comments in more detail below:
>
> Kind regards and talk to you tomorrow,
>
> Philipp.
>
>
> On 21.05.15 at 15:52, Fahad Khan wrote:
>
>  Hi everyone,
>
>
>  Here are some points for discussion that I've come up with after a wee
> bit of discussion with Francesca. Firstly, I think that to avoid confusion,
> the Class Frame should be renamed Syntactic Frame, since the term "Frame"
> is used in a lot of different contexts -- and is used to mean different
> things even just within Computational Linguistics -- and so it would pay to
> be clear about what it is that is being referred to in this case.
>
> Renaming Frame to "Syntactic Frame" is indeed an option that we could
> consider tomorrow.
>
>
> Also I feel it would be better to make a distinction between Syntactic and
> Semantic arguments by making them two distinct classes. There are so many
> instances where making this distinction is useful in lexical semantics that
> I think it’s worth the extra complexity to include it in the definition of
> Ontolex itself. For example by making this distinction we can easily
> describe verbal alternations such as occur with verbs such as "to give" as
> in "Alice gave the gift to Bob" and "Alice gave Bob the gift" where two
> different kinds of syntactic structure seem to map to the same semantic
> frame. Moreover, syntactic arguments are usually associated with syntactic
> markers or morphological markings, which is not the case with semantic
> arguments which are instead usually marked with different kinds of
> properties, e.g., semantic roles such as Agent or Theme. Additionally in
> morphologically rich languages like Latin there might not even be a fixed
> syntactic order to the arguments of a verb, whereas in terms of a
> predicative semantic representation the order of the arguments does matter.
> In other words we tend to associate syntactic and semantic arguments with
> different properties, properties which are usually included in
> lexico-semantic representations of lexical entries. ( see also
> http://en.wikipedia.org/wiki/Argument_(linguistics)#Syntactic_vs._semantic_arguments)
>
>
> We will not introduce semantic roles into ontolex. This is out of the
> realm and scope of ontolex. Such roles can be defined externally in any
> ontology.
>
> If you are proposing to introduce two subclasses: SemanticArgument and
> SyntacticArgument of the class Argument, this could be considered indeed.
>
> Alternations can be modelled without any problem with the current version
> of ontolex. So "Alice gave the gift to Bob" and "Alice gave Bob the gift"
> have the same "SemanticFrame" as meaning, but they map the syntactic
> arguments differently to properties of the event "gave", i.e. the
> properties giver, given and receiver or however the properties end up being
> named in the ontology. If they are named agent, patient and beneficiary it's
> fine as well. This is out of the scope of the ontolex model.
>
> The point in ontolex (Model C) is that we use the same identifier to
> represent both the syntactic and semantic argument, but this is similar to
> unification-based grammars where the semantic and syntactic properties of
> an argument are expressed in the same feature structure. We thus have one
> identifier that unifies the properties as syntactic argument and as
> semantic argument. Not sure if it helps, but I can not explain it better
> right now. John has also tried hard to explain....
>
>
> Synsem/Example 4 is unclear/a bad example since the phrase "opening film"
> can either be associated with a prepositional phrase argument headed by
> "in" (as in "the opening film in our programme tonight is...") or "of" (as
> in "the opening film of the Cannes film festival was terrible this year")
>  -- or even no prepositional phrase argument (as in "the opening film is
> terrible")  -- and so creating a lexical entry specifically based on
> "opening film at" seems inefficient (why create a separate lexical entry
> for each of the cases, and not associate an opening_film lexical entry with
> a number of preferred prepositions?). Maybe it would be better to have a
> simpler example of a relational noun like "brother" or "uncle" where the
> noun is strongly associated with a certain preposition  (in these examples
> the preposition  “of”) rather than a number of different prepositions, when
> it is expressing a related ontological Object Property.
>
>
>
> You are right, this is the reason we should mark the syntactic arguments
> for the preposition as optional, resulting in:
>
> X is the opening film of/at/for Y => dbpedia:OpeningFilm(Y,X)
>
> X is an opening film => exists Y dbpedia:OpeningFilm(Y,X)
>
> X is an opening film with Z => exists Y dbpedia:OpeningFilm(Y,X) &
> starsIn(Z,X)
>
> etc.
>
> The marker could be anything btw, so we could also include a list of
> prepositions. This is going to make the example a bit more complex, but it
> would be along the lines of your "preferred" prepositions. That would be
> more compact, I agree.
>
> In general, however, meaning is quite dependent on the prepositions you
> use:
>
> born on => birthdate
>
> born in => birthplace
>
> play with => playmate
>
> play for => team
>
> etc. etc.
>
>
>
> Also associating the concept Alma mater in Ex synsem/example5 with a
> lexical entry other than the lexical entry "alma mater" comes across as a
> little strange -- especially since it's not at all clear that the
> expression alma mater always and unproblematically refers to an institution
> that someone has graduated from (Wikipedia and different dictionaries disagree
> on this) rather than just attended.
>
>
> The property "alma mater" is defined in DBpedia as having domain Person
> and range Educational Institution:
>
> http://dbpedia.org/ontology/almaMater
>
> It is used exactly to express at which educational institutions people
> studied.
>
> It would be however correct to change "graduated from" to "studied at" as
> graduation is too strong here.
>
>
> Overall, these examples where an ontological concept is associated with a
> related lexical entry not because there’s a straightforward
> reference/extensional relation between them but because there is some kind
> of lexical entailment relation involved (e.g., the ontological concept
> refers to the result of the action described by a word/lexical entry) are
> problematic because they tend to make the "meaning" of ontolex:reference a
> bit obscure: do we use the reference relation to say that the extension of
> the word is such and such, or that a certain expression is often used in a
> language to refer to an important aspect of an event or a concept?
>
>
> Yes, I am aware of this. But this is a problem of the ontologies we have
> currently around, not of ontolex itself. You are right that very often the
> concept we use cannot really be said to be "denoted" by the lexical entry
> in the strict sense. But the ontologies that are out there are the ones we
> have to live with I fear. We could have "invented" new ontologies for our
> purposes, but my personal goal has been to show how ontolex can be used
> with existing ontologies.
>
>
> Also, I agree with John the gestalt stuff in the definition of Semantic
> Frame is a bit puzzling.
>
> Cheers,
> Fahad
>
> On 20 May 2015 at 16:57, John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>
> wrote:
>
>> Hi,
>>
>>  I read through the spec and there are a few major issues I detected in
>> the first couple of sections (ontolex + synsem)
>>
>>  1. The definition of 'other form' still says '[Other form] should be
>> .... an abbreviation, short form or acronym'. This is incorrect and
>> contradicts the definition of lexical entry. Can we add an example
>> clarifying the representation of abbreviations?
>> 2. ontolex/example10 now doesn't make any sense... 'bank' is just two
>> words each with a different meaning. Can we change this to a word with
>> genuine polysemy... I suggest 'troll' (1
>> <https://en.wikipedia.org/wiki/Troll>, 2
>> <https://en.wikipedia.org/wiki/Internet_troll>).
>> 3. ontolex/example17 doesn't really show a lot and for some reason refers
>> to IATE for 'cat'!? (this is probably my fault...). Could we switch it to
>> 'spouse'/'marry' showing that these two lexical entries have two different
>> concepts but the same reference dbpedia:spouse
>> 4. The definition of semantic frame is at best confusing, I really don't
>> think we need to bring Gestalt Theory into this as well. My attempt would
>> be:
>>
>>  *Semantic Frames* are the meaning of a word (and hence are also lexical
>> senses) but expressed by one or more ontological predicates and their
>> arguments. This sense of the word can only be understood when all of its
>> required arguments are realized.
>>
>>  Similarly we need to change subframe to
>>
>>  *Subframe* relates a complex semantic frame to frames for each of the
>> individual ontological predicates that form the complex semantic frame.
>>
>>  5. synsem/example5 and example6 are essentially the same as example4
>> but they connect an eventive verb ('graduate' or 'die') with a
>> consequential fact ('almaMater' or 'deathYear'). This is of questionable
>> soundness although we have argued in papers it is valid when the event and
>> the consequence are in a strict bijection... still, I would prefer to drop
>> this for the spec as it adds a lot of unnecessary complexity.
>>
>>  There are a lot of other minor issues I will change directly in the
>> spec.
>>
>>  Regards,
>> John
>>
>> On Mon, May 18, 2015 at 10:41 AM, Manuel Fiorelli <
>> manuel.fiorelli@gmail.com> wrote:
>>
>>>         Dear Philipp, All
>>>
>>>  you can find my comments on the synsem module below.
>>>
>>>  In Example synsem/example2, the resource :own_frame_transitive is
>>> wrongly written :own_form_transitive.  Additionally, there are two
>>> usages of owl:subPropertyOf, which instead  should be rdfs:subPropertyOf
>>> .
>>>
>>>  The class synsem:SemanticFrame is declared to be subclass of
>>> ontolex:LexicalSense; however, in the picture representing the synsem
>>> module, the arrow representing this axiom is oriented in the opposite
>>> direction.
>>>
>>>  In the paragraph "Semantic Frames", there is a table headed "Type",
>>> "Predicate", "Example", whose first row contains *City(x)*, ?x rdf:type
>>> ontology:Person: should it be ?x rdf:type dbpedia-owl:City?
>>>
>>>  There is no example (just below the definition of synsem:isA) about the
>>> representation of unary predicates. Nor is there any example about the
>>> representation of individuals.
>>>
>>>  The definitions of synsem:{subj|obj}OfProp use the following wording:
>>> "...property represents the semantic argument with represents"
>>>
>>>  I would avoid a sequence of two "represents". Moreover, I think that
>>> "with" should be "that".
>>>
>>>  In Example synsem/example3, there is again owl:subPropertyOf.
>>>
>>> Also, in Example synsem/example4, there is again owl:subPropertyOf.
>>>
>>>  In the section "Complex Senses / Semantic Frames", there is the
>>> definition of synsem:subframe, while in the figure there is the
>>> property synsem:subsense.
>>>
>>>  Just below Example synsem/example7, there is an example involving the
>>> property father: the property should point to the child; however, the
>>> name of the property suggests to me that the object is the father (just in
>>> the same manner skos:broader points to the broader of a given concept).
>>>
>>>  I think that Example synsem/example9  should be explained in more
>>> detail.
>>>
>>>  I didn't find the definition of synsem:propertyDomain and
>>> synsem:propertyRange; then, I realized that they were moved to the core
>>> module. The diagram of the core module must be updated to include these
>>> properties, as well as the diagram of the synsem module to remove them.
>>>
>>>  I noticed that in the infobox providing the definition of propertyRange
>>> and propertyDomain, the URI still uses the synsem namespace instead of the
>>> core ontolex namespace.
>>>
>>>  Finally, I noticed a typo in the definition of ontolex:LexicalEntry:
>>> "The class lexical entry represents a unit of analysis of the lexicon
>>> that consist of a set of forms that are grammatically ... "
>>>
>>>  It should be "that consists" with an appended "s".
>>>
>>>  Best Regards
>>>
>>>  Manuel Fiorelli
>>>
>>>
>>> 2015-05-13 21:47 GMT+02:00 Philipp Cimiano <
>>> cimiano@cit-ec.uni-bielefeld.de>:
>>>
>>>> Dear all,
>>>>
>>>>  I have been working on finalizing the synsem module, please check:
>>>>
>>>>
>>>> https://www.w3.org/community/ontolex/wiki/Final_Model_Specification#Syntax_and_Semantics_.28synsem.29
>>>>
>>>> The next telco to discuss the synsem module will be on Friday the 22nd
>>>> of May, 16:00 CET.
>>>>
>>>> Please send me any issues to discuss or comments on the synsem module
>>>> by Thursday 21st of May at the very latest.
>>>>
>>>> Thanks and best regards,
>>>>
>>>> Philipp.
>>>>
>>>> --
>>>> --
>>>> Prof. Dr. Philipp Cimiano
>>>> AG Semantic Computing
>>>> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>>>> Universität Bielefeld
>>>>
>>>> Tel: +49 521 106 12249
>>>> Fax: +49 521 106 6560
>>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>
>>>> Office CITEC-2.307
>>>> Universitätsstr. 21-25
>>>> 33615 Bielefeld, NRW
>>>> Germany
>>>>
>>>>
>>>>
>>>
>>
>
>
>
Received on Friday, 22 May 2015 10:49:37 UTC