Re: Call on the 10th of October, 14:00 CEST

Hi, Francesca, all:

Thank you very much for this example, Francesca. I have added two new
issues to the Lexicography Wiki page
<https://www.w3.org/community/ontolex/wiki/Lexicography> in relation to it,
along with a couple of insights on them and some possible solutions (e.g.
custom properties, decomp, etc.): complex forms, idioms, etc. related to a
particular lexical entry or a sense (*verre* -> *Maison de verre*, *Petit
verre*, *Verre double*) [Issue 9], and encyclopedic information [Issue 10].
It's also a very nice case with different usage examples [Issue 4] and
implicit logical order [Issue 6]. I have left out the aspects concerning
etymology for now, as I seem to remember that a new etymology module would
be proposed.

Thank you also for your comments, Fahad. We could have an issue 11 (sense
hierarchy) based on your work with the iLiddell Scott lexicon and the
*polyLemon* proposal for future telcos. What do you think? I was searching
for more examples for Issue 9 in the OED when I realized that what some
dictionaries consider related (or even unrelated) entries, others regard
as sub-senses of a sense (e.g. *fool around* in the OED vs. *fool around* in
Merriam-Webster's).

Talk to you tomorrow and best,

Julia



2017-10-09 14:42 GMT+02:00 Fahad Khan <anasfkhan81@gmail.com>:

> Hi Everyone
> I won't be able to make it to the Skype call tomorrow, but I just wanted to
> make a few points regarding the discussion last week.
>
> One of the main problems we had when we were thinking about converting the
> intermediate Liddell-Scott Greek-English lexicon (work we presented at the
> ontolex workshop in the summer) from the Perseus TEI encoding into linked
> data, was deciding what information to encode into RDF and what to leave
> out. The idea being that in the ontolex encoding you would include all the
> semantic information and then the rest is essentially just an artifact of
> the original print format that you can just leave as TEI.  It turns out
> that the distinction is not as easy to make as it first seems.
>
> For instance, iLiddell Scott, in common with many other scholarly
> dictionaries of the era, structures its entries in terms of layers, with the
> definitions and information becoming more detailed and specific the further
> down you get and some of the higher level senses seem there just to group
> together other sub-senses.  Of course you can take the senses listed under
> each entry and represent them using what is essentially a simple list
> enumeration, the lemon-ontolex default essentially-- but then you lose the
> hierarchical information. And the more we studied the dictionary the more
> we became convinced that this wasn't a wise thing to do -- although in
> order to be sure it would be necessary to consult a specialist in
> lexicography, particularly Victorian dictionaries. But to us it seemed clear that
> removing some of this contextual information, the sense scaffolding, would
> actually render the senses less clear and less usable. In short, we weren't
> sure whether, in carrying out a simple, blunt encoding of the senses without
> hierarchical information, the result wouldn't just be a (flawed)
> interpretation of the information in the resource rather than actually
> being what we wanted to advertise it as: a linked data version of the
> iLiddell Scott (in the end we tried to keep as much of this hierarchical
> information as possible by adding new properties and classes).
>
> I'm bringing up sense hierarchies here not because I want to suggest that
> they should go in the model (although I do think they should). The point is
> that in general it's not always clear what the thinking behind the ranking and
> arrangement of senses (even if it's flat) in legacy lexical works is
> (e.g., it could be frequency, the order of importance according to the
> lexicographer, temporal precedence, or some mixture of the above), and this
> also applies to many other aspects of the formatting of lexicographic
> resources, so it probably pays to be as agnostic as possible. I
> also think we need more case studies in encoding legacy lexical resources
> (and I thank Francesca for hers) because I think these constitute an
> essential use case for any prospective dictionary model.
>
> Essentially in encoding these resources into LOD we want to make the
> information in them as accessible as possible in a way that's not possible
> with just TEI. So for example to take something I'm really interested in at
> the moment, how do you liberate etymological information encoded in legacy
> dictionaries in a way that allows you to write powerful queries about the
> origins of words, or that allows you to isolate the lexicon used by a
> particular author or in a given text (for instance the attestation
> information in the full Liddell Scott is extremely extensive and you can
> write some really useful queries even with the intermediate version that
> we've encoded)? It turns out this kind of information is very often
> ambiguous and underspecified in dictionaries (old and new). So how do you
> handle this? In many cases, aside from conducting a seance, you will never be
> able to ascertain why a given decision was made or what the lexicographer
> really meant, and so the modeler's uncertainty has to become a part of the
> model that's being developed, without the model becoming so unwieldy that it is
> essentially unusable.
>
> Cheers,
> Fahad
>
>
> On 9 October 2017 at 11:47, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de> wrote:
>
>> Hi Francesca,
>>
>> I think this is a great case study that we should definitely consider.
>> Can you present this briefly during the telco tomorrow?
>>
>> I will send a reminder later today.
>>
>> Philipp.
>>
>> Am 03.10.17 um 13:03 schrieb Francesca Frontini:
>>
>> Dear all,
>> I'm fine with Philipp's proposal, both as to the time and to the topic. I
>> would like to add a case study to the discussion.
>>
>> As some of you may know, here in Montpellier we have a project that aims to
>> publish various old editions of the Petit Larousse Illustré,
>> first of all as TEI dict, but the idea is to make some of the information
>> available as LOD as well.
>>
>> In this document, you can find an example of a lexical entry and some
>> questions.
>> https://docs.google.com/document/d/1TogPjrLyJS0OK5pzww28751MX7179-NzCIsDdzae65o/edit
>>
>> I'd be really grateful to have your opinion, in particular that of Julia
>> and Jorge, who are already collecting examples, as Philipp said.
>>
>> Best,
>> Francesca
>>
>>
>>
>> 2017-10-02 16:28 GMT+02:00 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>:
>>
>>> Dear all,
>>>
>>>  I propose we have another ontolex telco on the 10th of October, 14:00
>>> CEST.
>>>
>>> I propose we continue discussing the concrete examples that Julia and
>>> Jorge have been preparing.
>>>
>>> I think the conclusion we had is that we wanted to continue working
>>> bottom-up from examples of current lexica and then try to get an
>>> abstract model that is able to accommodate future dictionaries that are
>>> native LLD dictionaries.
>>>
>>> Let's try!
>>>
>>> We will again hold the meeting via Skype. It worked quite well last time.
>>>
>>> Greetings,
>>>
>>> Philipp.
>>>
>>>
>>> --
>>> Prof. Dr. Philipp Cimiano
>>> AG Semantic Computing
>>> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>>> Universität Bielefeld
>>>
>>> Tel: +49 521 106 12249
>>> Fax: +49 521 106 6560
>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>
>>> Office CITEC-2.307
>>> Universitätsstr. 21-25
>>> 33615 Bielefeld, NRW
>>> Germany
>>>
>>>
>>>
>>>
>>
>


-- 

Julia Bosque Gil
PhD Student
Ontology Engineering Group <http://www.oeg-upm.net/>
Departamento de Inteligencia Artificial
Universidad Politécnica de Madrid

Received on Monday, 9 October 2017 18:05:22 UTC