Re: Teleconference on Friday

Dear all,

  first of all thanks to Armando and Manuel for resending their slides 
and for the very clear exposition of these slides during the last telco 
in April. That was indeed very enlightening.

Given the exposition, I myself am inclined to accept both the 
Lexicalization and the ratios as "first-class citizens".

In any case, let me make a suggestion for what to decide today. We can 
look at the details during the telco, of course, but let me try to 
structure the discussion a bit:

1) ontolex:Lexicon (recommend properties such as creator, version etc. 
from dc and dcat as the vocabulary for expressing general metadata), 
in addition to numerical properties such as: i) number of lexical 
entries, ii) number of senses, iii) number of distinct references, iv) 
number of references that have at least one sense (lexical entry), v) 
percentage of references that have at least one sense (one 
lexicalization so to speak), vi) average number of lexicalizations 
(senses) per reference

One question: is this relative to the lexicon, or does it take into 
account all the data elements in the lexicalized dataset?
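
To make the six numbers in 1) concrete, here is a minimal sketch in Python. 
Everything in it is an assumption for illustration: the function name 
lexicon_stats, the representation of the lexicon as (entry, sense, reference) 
triples, and the example identifiers are hypothetical and not part of any 
agreed model. The optional all_references argument mirrors the open question 
above: left out, the counts are relative to the lexicon; passed in, they are 
relative to the whole lexicalized dataset.

```python
from collections import defaultdict


def lexicon_stats(senses, all_references=None):
    """Compute the six counts from (lexical_entry, sense, reference) triples.

    all_references (hypothetical parameter): every conceptual resource in the
    lexicalized dataset. If omitted, only references that occur in the
    lexicon itself are counted.
    """
    senses = list(senses)
    entries = {entry for entry, _, _ in senses}
    sense_ids = {sense for _, sense, _ in senses}

    # Group the senses by the reference they point to.
    senses_per_ref = defaultdict(set)
    for _, sense, ref in senses:
        senses_per_ref[ref].add(sense)

    if all_references is not None:
        universe = set(all_references)
    else:
        universe = set(senses_per_ref)
    covered = {ref for ref in universe if senses_per_ref.get(ref)}
    total_senses = sum(len(senses_per_ref.get(ref, ())) for ref in universe)

    return {
        "lexical_entries": len(entries),                      # i
        "senses": len(sense_ids),                             # ii
        "references": len(universe),                          # iii
        "covered_references": len(covered),                   # iv
        "percentage_covered": (
            100.0 * len(covered) / len(universe) if universe else 0.0
        ),                                                    # v
        "avg_senses_per_reference": (
            total_senses / len(universe) if universe else 0.0
        ),                                                    # vi
    }


# Made-up example data: two entries lexicalize ex:Cat, one lexicalizes ex:Dog.
example = [
    ("cat_n", "cat_sense1", "ex:Cat"),
    ("feline_n", "feline_sense1", "ex:Cat"),
    ("dog_n", "dog_sense1", "ex:Dog"),
]
local = lexicon_stats(example)
full = lexicon_stats(example, {"ex:Cat", "ex:Dog", "ex:Bird", "ex:Fish"})
```

With the lexicon as the frame of reference, iii) and iv) coincide and v) is 
trivially 100%; only when the whole dataset is taken into account do the 
percentage and the average become informative.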

2) lime:Lexicon (lexicon as dataset), see 3 below

with main property lime:lexicalCoverage (Armando already hinted in his 
slides that we could rename LanguageCoverage to LexicalCoverage and 
correspondingly languageCoverage to lexicalCoverage I suppose?)

A LexicalCoverage class would essentially state, for each language and 
each type of lexicon-ontology interface model (SKOS, lemon, RDF labels 
etc.), the number of conceptual resources covered by at least one lexical 
entry, the average number of lexical entries per conceptual resource etc.

3) Introduce lime:Lexicon and lime:Lexicalization as subclasses of 
void:Dataset in the lime module

4) I think the (sort of) agreement during our last telco was to have the 
ratios/percentages in addition to the absolute numbers, as we agreed that 
the absolute numbers cannot always be re-computed exactly from the 
ratios. We should reach consensus here.
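
One way the absolute numbers can get lost is rounding. The following small 
Python sketch illustrates this under stated assumptions: that the ratio is 
published with three decimals, and that the total is known. The function 
names and the numbers are made up for illustration.

```python
def ratio(covered, total, digits=3):
    """Ratio as it might be published, rounded to a fixed number of decimals."""
    return round(covered / total, digits)


def recover_count(published_ratio, total):
    """Best-effort reconstruction of the covered count from a published ratio."""
    return round(published_ratio * total)


covered, total = 1000, 3000
published = ratio(covered, total)            # the rounded value 0.333
recovered = recover_count(published, total)  # 999, not the original 1000

# Without the total, the ratio says nothing about the size of the dataset:
# both of these pairs yield the same ratio.
same_ratio = ratio(1, 2) == ratio(50, 100)
```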

My opinion is that introducing a few ratio properties will simplify 
access to this information for people who want to use the lexicon. 
Re-computing this information might sometimes be difficult: not everyone 
speaks SPARQL, endpoints are not always up, etc. Some ontologies do 
not have endpoints, so people would need to download the data, load it 
into some OWL API, count the number of individuals, classes etc., which 
is quite tedious if you are just a user of SW technology ;-) So +1 from 
my side to include some ratios then.

So including this information in the lexicon might indeed be a useful 
addition.

However, I see some issues about *how* to count the number of conceptual 
resources, particularly in the case that there is more than one 
"lexicalized dataset" per lexicon. In this case we might want to 
provide the information per dataset or even per domain, which blows up 
the complexity again substantially.

5) One question is whether we *also* include in the model the 
information that allows the ratios to be recomputed. That would 
mean providing both: i) the number of conceptual resources in the 
lexicalized dataset(s) - which can be more than one, and ii) the number 
of conceptual resources covered by at least one lexicalization, in 
addition to the ratio.

In this case the ratio would be redundant, but so be it. In any case we 
could define these properties and monitor which ones are used ;-) We could 
recommend using both the integers and the ratios as good practice.
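
If both the integers and the ratio are published, a consumer can at least 
check that they agree. Here is a rough Python sketch of that idea, one 
record per lexicalized dataset. The class CoverageRecord, its field names, 
the tolerance and the example values are all hypothetical; none of them is a 
proposed lime property.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoverageRecord:
    """Coverage figures for one lexicalized dataset (hypothetical structure)."""
    dataset: str
    conceptual_resources: int       # i) resources in the lexicalized dataset
    covered_resources: int          # ii) resources with >= 1 lexicalization
    ratio: Optional[float] = None   # published ratio, redundant if i) and ii) exist

    def computed_ratio(self) -> float:
        """Recompute the ratio from the two integers."""
        if self.conceptual_resources == 0:
            return 0.0
        return self.covered_resources / self.conceptual_resources

    def is_consistent(self, tolerance: float = 0.005) -> bool:
        """True if the published ratio matches the integers up to rounding."""
        if self.ratio is None:
            return True
        return abs(self.ratio - self.computed_ratio()) <= tolerance


# Made-up figures for two datasets lexicalized by the same lexicon.
records = [
    CoverageRecord("ex:datasetA", 200, 150, 0.75),
    CoverageRecord("ex:datasetB", 40, 10, 0.3),  # integers give 0.25
]
inconsistent = [r.dataset for r in records if not r.is_consistent()]
```

Keeping one record per dataset is one possible answer to the counting issue 
in 4); it does not address the per-domain case.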

If we agree on the above points, I volunteer to create a small example 
with Armando on the wiki to aid the discussion.

Talk to you later anyway!

Philipp.


On 15.05.14 18:17, Armando Stellato wrote:
>
> Hi Philipp,
>
> Just a short recap from Manuel and me about the only part which to us 
> seemed still pending: the ratio/percentage vs count. We do not report 
> anything about the model as, to the best of our memory, there were no 
> objections about the overall structure (which does not mean it is 
> necessarily the final one, and it is still open for comments).
>
> We thus updated the previous document with some considerations (also 
> taken from the last ontolex call we had) and reported them in section 5.
>
> Please, feel free to add more on the "integer side", so we already 
> have a basis for discussion tomorrow.
>
> Cheers,
>
> Armando and Manuel
>
> > -----Original Message-----
> > From: Philipp Cimiano [mailto:cimiano@cit-ec.uni-bielefeld.de]
> > Sent: Wednesday, May 14, 2014 9:26 PM
> > To: public-ontolex@w3.org
> > Subject: Teleconference on Friday
> >
> > Dear all,
> >
> >    I would like to call for a telco on this Friday on our regular slot:
> > 15:00 (CET).
> >
> > The main goal is to discuss the metadata module and come to a conclusion.
> >
> > I will send some decision points out before the meeting on Friday.
> >
> > Access details can be found here as usual:
> > https://www.w3.org/community/ontolex/wiki/Teleconference,_2014.16.05,_15-16_pm_CET
> >
> > I look forward to talking to you on Friday.
> >
> > Best regards,
> >
> > Philipp.
> >
> > --
> >
> > Prof. Dr. Philipp Cimiano
> >
> > Phone: +49 521 106 12249
> > Fax: +49 521 106 12412
> > Mail: cimiano@cit-ec.uni-bielefeld.de
> >
> > Forschungsbau Intelligente Systeme (FBIIS) Raum 2.307 Universität Bielefeld
> > Inspiration 1
> > 33619 Bielefeld


-- 

Prof. Dr. Philipp Cimiano

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de

Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld

Received on Friday, 16 May 2014 10:58:22 UTC