Re: lexical resources with n-ary translations

Hi John,

> If the argument is just that the current modelling is more verbose then
> I would not be too keen to revise the model here. There will always be
> use cases that can be modelled more efficiently with specific
> constructs, but on the other hand, a standard model such as OntoLex also
> has to take into account users of the model, who wish to have a
> consistent way to query and work with the model.

Well, it's not a big change, and it's fully backward-compatible: the only
thing needed is to no longer require that vartrans:target is functional. It
would make a difference for reasoning (but that's not the purpose of  
OntoLex anyway), but for querying none at all, you simply get multiple  
variable bindings when querying for the vartrans:target of a Translation  
... and you would get exactly the same when these are encoded by multiple  
Translations, except that the translations have all their individual URIs  
(and duplicate metadata/links to TranslationSet). From a modelling  
perspective, this doesn't do harm -- it only leads to a better (more  
compact) representation. The question is whether this is important enough
to justify a revision of the current text in its own right. (Most likely
not.) But in case the core model gets updated anyway at some point,
this is a possible change that I would like to see discussed then, and I  
would like to have it recorded for this purpose. (There are some more ;)
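
To make the querying point concrete, here is a minimal sketch in plain
Python (no RDF library). Triples are simple tuples, the entry and
translation identifiers are made up for illustration, and targets_of()
is only a stand-in for a SPARQL query over vartrans:source and
vartrans:target, not an actual query engine. It assumes the five-language
"Abelian group" entry from the quoted message below.

```python
# Hypothetical identifiers; triples are (subject, predicate, object) tuples.
SOURCE = "vartrans:source"
TARGET = "vartrans:target"

en = ":abelian_group_en"
targets = [":abelsche_Gruppe_de", ":groupe_abelien_fr",
           ":abelse_groep_nl", ":abeleva_gruppa_ru"]


def binary_model(source, targets):
    """One reified Translation per source/target pair (current modelling)."""
    triples = []
    for i, target in enumerate(targets, start=1):
        node = f":trans{i}"
        triples += [
            (node, "rdf:type", "vartrans:Translation"),
            (node, SOURCE, source),
            (node, TARGET, target),
        ]
    return triples


def nary_model(source, targets):
    """A single Translation with several targets (the proposed relaxation)."""
    node = ":trans1"
    triples = [(node, "rdf:type", "vartrans:Translation"),
               (node, SOURCE, source)]
    triples += [(node, TARGET, target) for target in targets]
    return triples


def targets_of(triples, source):
    """Roughly: SELECT ?t WHERE { ?x vartrans:source <source> ;
                                     vartrans:target ?t }"""
    nodes = {s for s, p, o in triples if p == SOURCE and o == source}
    return sorted(o for s, p, o in triples if p == TARGET and s in nodes)


def relates_count(triples):
    """Number of vartrans:source / vartrans:target triples."""
    return sum(1 for _, p, _ in triples if p in (SOURCE, TARGET))


binary = binary_model(en, targets)
nary = nary_model(en, targets)

# Same bindings for the target variable under both encodings.
print(targets_of(binary, en) == targets_of(nary, en))
# Source/target triples needed by each encoding.
print(relates_count(binary), relates_count(nary))
```

Under these assumptions the query result is the same for both encodings;
only the number of Translation nodes (and any metadata repeated on each of
them) differs.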

> Of course, if the community feels that this is important enough to allow  
> multiple modelling options we can consider this.

Yes. The real question is how important the modelling of
directionality is in this case and how drastic the impact of verbosity is
*for a real use case*. For terminologies, directionality (as recorded in
the dictionary) may actually be artificial (even in the example I  
provided, every entry is in addition identified by a number, which can be  
taken to represent the concept). But there may be other kinds of  
resources. This could be relevant for dictionaries such as Meyer-Lübke  
(1911, https://catalog.hathitrust.org/Record/001182487), where the  
direction of inheritance from one proto-language into a large number of
descendants needs to be recorded, all with the same metadata (but with a
different kind of lexico-semantic relation than translation). On the other
hand, while a number of people have expressed an interest in etymology  
(including myself), I'm also not sure how much impact this kind of data  
should have on the model as a whole ... It's really up to the community.

Best,
Christian

>
> Regards,
> John
>
> On Friday 26 June 2020 at 13:56, Christian Chiarcos
> <christian.chiarcos@web.de> wrote:
>> Hi John,
>>
>> thanks for the recap.
>>
>> On Fri, 26 June 2020 at 12:56, John P. McCrae
>> <john@mccr.ae> wrote:
>>> Hi Christian,
>>>
>>> Much of this was discussed during the development of the vartrans  
>>> module, but I will try to recap:
>>>
>>> https://www.w3.org/2016/05/ontolex/#translation
>>>
>>> The main way to represent translations is by 'shared reference', that
>>> is, using a single concept for entries in multiple languages.
>>
>> Yes. The context of discussing translation with multiple targets was
>> a situation where a direction between one source language expression
>> and multiple target language expressions (in different target
>> languages) needs to be recorded. This is not possible with shared
>> reference (because this comes without directionality), and I see no
>> way to do that in OntoLex other than encoding the relation between each
>> source and target expression pair independently. Semantically, this
>> is fine, but it is verbose.
>>
>> Take the first entry of https://www.springer.com/de/book/9789020116670
>> and assume that we wanted to encode that English is the source (first
>> entry, defines the organization of the dictionary) that is translated
>> into four languages:
>> Abelian group | abelsche Gruppe | groupe abélien | abelse groep |
>> абелева группа
>>
>> criteria: number of triples and objects
>> +/- direction (encodes translation direction)
>> +/- metadata (allows providing source metadata)
>>
>> OntoLex modelling
>> Shared reference: 1 ontolex:LexicalConcept, 5 ontolex:evokes properties
>> (-direction, +metadata [at concept])
>> vartrans:translation (non-reified): 4 vartrans:translation properties
>> (+direction, -metadata)
>> vartrans:Translation (reified): 4 vartrans:Translation objects, 8
>> vartrans:relates (vartrans:source/vartrans:target) properties
>> (+direction, +metadata)
>>
>> alternative modelling
>> *vartrans:Translation (n-ary): 1 vartrans:Translation object, 5
>> vartrans:relates (1 vartrans:source, 4 vartrans:target) properties
>> (+direction, +metadata)
>>
>> The difference here is that reified binary Translations require at
>> least twice as many triples as reified n-ary Translations, and I can
>> see why people would like to avoid that. However, this is an issue only
>> if the direction of translation needs to be recorded, because in terms
>> of space complexity, shared reference is equivalent to n-ary
>> Translation.
>> If such a use case for multilingual data with one source language and a
>> large number of target languages does exist, where direction matters
>> and verbosity is an issue, it is probably safest to *not* use the
>> OntoLex vocabulary directly, but to create my:NAryTranslation as a
>> subclass of ontolex:LexicalConcept with properties my:source and
>> my:target as subproperties of ontolex:isEvokedBy and use these instead
>> of the vartrans vocabulary.
>>
>> Cheers,
>> Christian

Received on Monday, 29 June 2020 11:03:13 UTC