Re: synsem module from Fahad Khan on 2014-08-29 (public-ontolex@w3.org from August 2014)

From: Fahad Khan <anasfkhan81@gmail.com>
Date: Fri, 29 Aug 2014 11:35:49 +0200
To: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Cc: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>, public-ontolex <public-ontolex@w3.org>
Message-ID: <CAK+N+9i7r41bKLSapfiVQ0Dm28UqneT9emjmX+1vo7O25SmY-A@mail.gmail.com>
Hi everyone

Here are a few responses to John and Philipp’s comments. Hopefully we can
discuss these further in the call and afterwards on the list too. One thing
I would like to point out at the start is that even though the emails are
being sent out under my (Fahad’s) name and I’m doing the majority of the
typing, most of the work on the model is Francesca’s. Hopefully in the call
today most of the explanation will be hers too :)

Our main motivation here is our resistance to stripping all semantics from
the lexicon part, especially with respect to the conversion of legacy
resources. In principle we agree with a lot of John & Philipp’s remarks
that go in the direction of preserving semantics by reference. But it is
difficult to see how this impacts us, as people who have a legacy resource
(such as Parole Simple Clips) and want to use the ontolex model to publish
it.

Practically speaking we don’t know what to do with the PSC semantic layer.
On the one hand, Philipp reminds us that Ontolex deals primarily with "given"
ontologies. That leaves our semantic layer out. As you know we have tried
to be faithful to the idea of semantics by reference in converting PSC
using lemon; but we also wanted to publish all the semantics of PSC; this
forced us to create a new ontological level to accommodate our semantic
layer.

But this, to be honest, is not really a well-formed ontology, and can hardly
be pointed to by other lexicons (other languages...) without a lot of
manual checking. This is not what we want... we want ontologies that are
reusable even independently from the original lexicon.

Our concern is that people with a legacy resource are just going to choose
the easy way, use the "lexical" basics of the model, like the lexical
entry, the canonical form.... and then add/define their semantic stuff on
top of it, as an extension to the lexical model, that is without using
"reference" to an ontology. Basically they'll add their semantic layer the
way they want it.
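
To make the contrast concrete, here is a minimal sketch, assuming triples are
plain Python tuples. Every identifier under the `ex:`, `onto:` and `psc:`
prefixes is a hypothetical placeholder, and the `ontolex:` property names are
illustrative only, not checked against the current drafts.

```python
# Triples as (subject, predicate, object) tuples. All identifiers are
# hypothetical placeholders, not taken from PSC or from an ontolex draft.

by_reference = [
    ("ex:sell_entry", "ontolex:canonicalForm", "ex:sell_form"),
    ("ex:sell_entry", "ontolex:sense", "ex:sell_sense"),
    # The sense carries no semantics itself; it points into an external ontology.
    ("ex:sell_sense", "ontolex:reference", "onto:CommercialTransaction"),
]

embedded_layer = [
    ("ex:sell_entry", "ontolex:canonicalForm", "ex:sell_form"),
    ("ex:sell_entry", "ontolex:sense", "ex:sell_sense"),
    # Semantic information attached directly with resource-specific properties.
    ("ex:sell_sense", "psc:semanticType", "psc:Transaction"),
    ("ex:sell_sense", "psc:hasPredicate", "psc:PredSell"),
]


def uses_reference(triples):
    """Return True if any triple links a sense to an ontology entity by reference."""
    return any(p == "ontolex:reference" for _, p, _ in triples)


def custom_properties(triples, core_prefix="ontolex:"):
    """Collect the predicates that fall outside the core lexical vocabulary."""
    return sorted({p for _, p, _ in triples if not p.startswith(core_prefix)})
```

In the second list the lexical entry and canonical form are reused as they
are, but nothing points to an ontology: `custom_properties(embedded_layer)`
returns only resource-specific predicates, which is the "easy way" described
above.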

Nevertheless, for the sake of the argument, why don't we take a resource of
some complexity and try to see how it can be accommodated in your best model
in a way that is really faithful to the ontolex philosophy, while at the same
time leaving as little information out as possible?

We are thinking of completing Parole Simple Clips, as a test case for this,
but it's a big beast. We have started to do this, but when you tackle the
verbs and the predicates, it's even more complex. Maybe this will give us
an idea of how much adjustment legacy resources would require to be
faithful to the "semantics by reference" model, and how reusable the stuff
that ends up on the ontological side is.
Cheers,
Fahad & Francesca



On 29 August 2014 08:34, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
wrote:

>  Dear Fahad, all,
>
>  I finally had the chance to look at your proposal in more detail. I think
> it is more in line than we might expect at first sight with the example
> that I provided a few weeks ago. I attach the example again for the sake of
> easier reference. In particular, I think that:
>
> 1) The *PredicativeRepresentation*s you are proposing correspond to the
> *SemanticFrame*s that I was proposing. Each one sort of represents "the complex
> predicate expressed by a lexical entry", where the atomic parts come from a
> given ontology. Our proposals differ in that I was attaching the
> SemanticFrames to the SyntacticBehaviour via the relation "semFrame", then
> linking the frame to the sense. However, we can of course link the "sense"
> to the Frame as you propose and then link the Frame to the corresponding
> syntactic behaviour. Both are fine from my side. If you think your
> modelling here is better, then I have no problem in endorsing it.
>
> 2) As John mentioned, our building assumption is that predicates per se
> are *only* in the ontology. In this sense, the first decision to make is
> whether sell and buy denote the same concept in the ontology (lemon and
> myself are agnostic in this respect, this is a conceptual decision to
> make). The different perspectives you mention could be modelled by the
> SemanticFrame class that I was proposing, with different mappings between
> syntactic and semantic arguments. Information about semantic roles can be
> attached as annotations, that's not a problem. Further, the ontolex model
> allows you to have two different senses for sell and buy that nevertheless
> link to the same ontological class/predicate.
>
> 3) Note that ontolex was used to interface a lexicon with a given (domain)
> ontology, not a linguistic ontology. Agent / Themes / Beneficiary are
> linguistic roles rather than roles/relations that would appear in a
> (domain) ontology. As John mentions we can attach these roles to the
> syntactic arguments without a problem.
>
> Let's discuss this further today. I will then try to create a new example
> that unifies both proposals, mine and Fahad's.
>
> talk to you later,
>
> Philipp.
>
> Am 28.08.14 15:01, schrieb John P. McCrae:
>
> Hi,
>
>
> On Thu, Aug 28, 2014 at 2:10 PM, Fahad Khan <anasfkhan81@gmail.com> wrote:
>
>>  Dear John,
>>
>>  Thanks for your comments.
>>
>>  We partly agree on your points, especially about the redundancy of some
>> modules. We want to use this LMF style treatment as a starting point for
>> further discussion.
>>
>>  As for the use of reference for selectional preferences, we can see
>> your point (maybe instead we can use a different relation such as "domain"
>> instead of "reference").
>>
>>  What we're still not sure about is the fact that predicates should only
>> be in the ontology: where the ontology in this case represents the
>> extensions of lexical items. The problem we have is that for example, one
>> can understand the senses of "buy" and "sell" in this example to represent
>> two different predicates but just one class of "actions" (e.g.,
>> purchase_exchange_actions): where the predicate represents a different
>> "linguistically" motivated way of looking at the same class of events.
>>
>>  If you want to make "buy" and "sell" one predicate as in the Ontolex
>> example that was given earlier on, I see practical as well as theoretical
>> problems. Practically, you force all those who have two predicates in their
>> resource to go and check which should be merged.
>>
> The question of whether to model buy and sell as a single event or as two
> events that entail each other is an interesting question in general, but it
> is a conceptual modelling issue, rather than a lexical issue. As long as
> the lexicon can capture how each entry interfaces with predicates defined
> in the ontology, such details of the lexical modelling should not matter.
> It is also unavoidable that when dealing with legacy resources, some work
> will be needed to harmonize with any defined OntoLex model.
>
>>
>>  Also, what about semantic role labeling? The first argument of the sell
>> predicate is an agent according to PSC. So is the first argument of the buy
>> predicate. This is because the same action is conceptualized in different
>> ways in language. But on the ontological level, these different roles point
>> to the same participant in the action (e.g. the buyer is beneficiary in one
>> case and agent in another).
>>
>>  Overall it seems to us there exists information related to semantic
>> predicates (as they are used in lexical resources we know) which seems to
>> pertain more to word use, and to the linguistic rather than to the
>> ontological level. But we think this would be a good matter for discussion.
>>
> Such linguistic features can be captured by annotations on the arguments
> as required.
>
>>
>>  As for the SynSemCorrespondence, indeed it is verbose to implement, but
>> consider also that instead of having to laboriously map lots of individual
>> cases of syntactic and semantic arguments you can just define a reified
>> object that represents without redundancy a whole class of such mappings.
>> For instance in Parole Simple Clips, you'd have thousands of instances all
>> pointing to one class of mappings, such as IsoTrivalent, or IsoBivalent.
>> The synsemcorrespondence object enables you to do this.
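
A rough way to compare the options is sketched below. It takes the figures
John gave (14 triples for the correspondence object, 3 for direct linking)
and assumes, purely for illustration, that the 14 triples are paid once per
class of mappings and that each entry then needs one triple to point to it.
The function names are hypothetical.

```python
# Figures from this thread: 14 triples for the reified correspondence
# object, 3 triples for linking arguments directly. How they scale with
# the number of entries is an ASSUMPTION made only for this sketch.
CORRESPONDENCE_DEFINITION_TRIPLES = 14
DIRECT_LINK_TRIPLES_PER_ENTRY = 3


def shared_correspondence_triples(n_entries):
    """One shared object (e.g. IsoBivalent) plus one pointer triple per entry."""
    return CORRESPONDENCE_DEFINITION_TRIPLES + n_entries


def direct_linking_triples(n_entries):
    """Every entry links its own syntactic and semantic arguments."""
    return DIRECT_LINK_TRIPLES_PER_ENTRY * n_entries


def uri_reuse_triples(n_entries):
    """Same URI for the syntactic and semantic argument: no linking triples."""
    return 0
```

Under these assumptions, 1000 entries cost 1014 triples with a shared object
and 3000 with direct linking, while URI reuse stays at 0 whatever the number
of entries.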
>>
> As I said, the merging of the syntactic and semantic arguments as proposed
> by *lemon* is maximally efficient, as it requires no extra triples. It
> also has several other advantages; most notably, it is easier to query and
> work with.
>
>  Regards,
>  John
>
>>
>>
>>  Cheers,
>> Francesca + Fahad
>>
>>
>> On 28 August 2014 12:19, John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>
>> wrote:
>>
>>>  Hi Fahad, Francesca, all,
>>>
>>>  I will not be at the telco tomorrow due to being busy at Coling, but I
>>> will provide some comments on the proposal:
>>>
>>>    - 'Predicates' should not be included in the modelling of SynSem, as
>>>    predicates are something clearly defined by the ontology. A duplicate
>>>    mechanism for semantics is not needed in lemon/OntoLex as we have a good
>>>    semantic model (OWL) in contrast to a pure lexicon model like LMF, which
>>>    must define its own semantic model.
>>>     - I still have no clue what a 'predicative representation' is... it
>>>    seems entirely unnecessary in LMF, but perhaps I am wrong here?
>>>     - Arguments cannot have references to an ontology, they represent
>>>    slots that should be filled in the logical representation defined by the
>>>    ontology. The proposal here seems to confuse references with domains (that
>>>    is the class of object referenced by the argument rather than the actual
>>>    values referred to by the argument, when the frame is realized).
>>>     - The SynSemCorrespondence object from LMF is frankly verbose, and
>>>    unnecessarily so: it occupies 14 triples in your proposal, whereas direct
>>>    linking of semantic and syntactic arguments would take only 3 triples, and
>>>    URI reuse as in *lemon* requires 0 triples! Is there any
>>>    justification for this complex and verbose modelling?
>>>
>>> Regards,
>>>
>>> John
>>>
>>>
>>> On Tue, Aug 26, 2014 at 4:06 PM, Fahad Khan <anasfkhan81@gmail.com>
>>> wrote:
>>>
>>>> Dear Philipp
>>>>
>>>>  We've tried to put our money where our mouth is so here is a rough
>>>> and ready version in RDF of the buy/sell example  as well as a diagram of
>>>> part of the example, as inspired by a more LMF type approach:
>>>>
>>>>
>>>> https://docs.google.com/document/d/1dojqhFMHTswFWUarQAbeVo3ap6_UjDLyN9gTPydqr_g/edit
>>>>
>>>>  Cheers,
>>>>
>>>>  Fahad & Francesca
>>>>
>>>>
>>>> On 22 August 2014 10:37, Fahad Khan <anasfkhan81@gmail.com> wrote:
>>>>
>>>>> Dear Philipp,
>>>>> Sorry for the delay in responding, we have been on holiday too the
>>>>> last couple of weeks.  We were planning to send something to the list
>>>>> before we went away, but it turns out the translation was harder to do than
>>>>> we thought (and our collective knowledge of LMF less comprehensive) and we
>>>>> weren't entirely happy with what we came up with.  However we will send you
>>>>> a slightly polished version of our proposed example next week before the
>>>>> telco -- after having hopefully discussed it with colleagues far better
>>>>> versed in LMF than us.
>>>>> Cheers
>>>>> Fahad and Francesca
>>>>>   Dear all,
>>>>>
>>>>>    I returned from holidays end of last week. Given that some people
>>>>> are still on holidays, I propose we have our next telco on Friday 29th at
>>>>> the regular slot, i.e. 15:00 (CET). I will send out an announcement soon.
>>>>>
>>>>> @Fahad and Francesca: regarding our email thread before the holidays,
>>>>> would you please be so kind to send an example of the modelling of frames
>>>>> that is in your view appropriate, an LMF document would be fine for now so
>>>>> that we can study the LMF modelling in more detail in the next telco and
>>>>> then propose appropriate vocabulary elements in the synsem module to do the
>>>>> job. Starting from LMF seems a good idea to me as I mentioned a few weeks
>>>>> ago.
>>>>>
>>>>> I will continue working with the vartrans and metadata modules from
>>>>> next week on until we receive the input from Fahad and Francesca to
>>>>> continue the work on the synsem module.
>>>>>
>>>>> I regard the ontolex and decomp modules as largely finished. Please
>>>>> check the ontologies and examples carefully so that we can soon agree to
>>>>> release them.
>>>>>
>>>>> Looking forward to continuing with our work.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Philipp.
>>>>>
>>>>> Am 02.08.14 18:46, schrieb Manuel Fiorelli:
>>>>>
>>>>>  Hi Philipp, All
>>>>>
>>>>>  sorry for the delayed response, which is in fact quite simple.  See
>>>>> below.
>>>>>
>>>>> 2014-08-01 11:53 GMT+02:00 Philipp Cimiano <
>>>>> cimiano@cit-ec.uni-bielefeld.de>:
>>>>>
>>>>>>
>>>>>> Am 01.08.14 00:10, schrieb Manuel Fiorelli:
>>>>>>
>>>>>>   My objection is that you split the description of the semantic
>>>>>> frame into two blocks. In each block, you associated the frame with
>>>>>> subframes, each one associating a semantic role with a syntactic argument.
>>>>>> Having these two blocks, I can easily understand that the semantic frame
>>>>>> has three roles, which map to the syntactic arguments. Conversely, if I
>>>>>> consider these two blocks together, as they are in reality, then I am not
>>>>>> sure I can easily spot the "shape" of the semantic frame.
>>>>>>
>>>>>>    Yes, that is the only objection I can see so far as well. Let's
>>>>>> give a deeper look at this after the holidays, ok?
>>>>>>
>>>>>
>>>>>  I used the word "objection", which is quite a strong word. Maybe
>>>>> "observation" would have been a better choice. Nevertheless, I agree with
>>>>> you that we can continue the discussion after the holidays.
>>>>>
>>>>>  Meanwhile, happy holidays to everybody listening to this thread, and
>>>>> the rest of the OntoLex community :-D
>>>>>
>>>>>
>>>>> --
>>>>> --
>>>>> Prof. Dr. Philipp Cimiano
>>>>> AG Semantic Computing
>>>>> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>>>>> Universität Bielefeld
>>>>>
>>>>> Tel: +49 521 106 12249
>>>>> Fax: +49 521 106 6560
>>>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>>
>>>>> Office CITEC-2.307
>>>>> Universitätsstr. 21-25
>>>>> 33615 Bielefeld, NRW
>>>>> Germany
Received on Friday, 29 August 2014 09:36:17 UTC