Re: Comments on lime.owl from Manuel Fiorelli on 2014-06-06 (public-ontolex@w3.org from June 2014)

From: Manuel Fiorelli <fiorelli@info.uniroma2.it>
Date: Fri, 6 Jun 2014 14:44:51 +0200
To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <CAGDmdGh1V2n3PFqpLWOiyi0fCsZou5x=iQ8iMJyCLnaCq_p6oA@mail.gmail.com>
Hi John,

See my answers below.

2014-06-06 13:49 GMT+02:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:

> Hi Manuel, Armando, all,
>
> Some comments on the lime.owl file
>
>    - Should Lexicon, Lexicalization and co. really be subclasses of
>    void:Dataset? A void:Dataset is defined as a "set of RDF triples that are
>    published, maintained or aggregated by a single provider". Thus, it seems
>    that many lexica and lexicalizations can be in the same dataset and
>    conversely it is very hard to define which triples are in a lexicon (for
>    example, interlingual links are shared between two lexica). It would make
>    more sense to me to have lexica, lexicalizations, etc., as part of a
>    dataset, but not as datasets themselves.
>
We have in principle three distinct datasets: the dataset being
lexicalized, the lexicon providing the vocabulary and the lexicalization
which relates them.

These distinctions do not entail that the datasets are really disjoint. In
fact, the VoID vocabulary introduces the subset relation (void:subset), which
relates a dataset to its parts. For instance, we could have a dataset with
different subsets, corresponding to different linksets with other datasets.
In this scenario, the dataset and its linksets may share the same SPARQL
endpoint. However, knowing in advance that some subsets provide interlinks
may be useful: it is exactly the reason we use the LOD cloud diagram.
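
To make the subset scenario concrete, here is a minimal sketch in plain
Python, with triples written as tuples so that no RDF library is needed. The
ex:* names and the endpoint URL are made up for illustration; only void:subset,
void:sparqlEndpoint, void:Dataset and void:Linkset come from VoID.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex:* identifiers and the endpoint URL are hypothetical.
VOID_SUBSET = "void:subset"
RDF_TYPE = "rdf:type"

triples = [
    ("ex:myDataset", RDF_TYPE, "void:Dataset"),
    ("ex:myDataset", "void:sparqlEndpoint", "http://example.org/sparql"),
    ("ex:myDataset", VOID_SUBSET, "ex:linksToDBpedia"),
    ("ex:myDataset", VOID_SUBSET, "ex:linksToGeoNames"),
    ("ex:linksToDBpedia", RDF_TYPE, "void:Linkset"),
    ("ex:linksToGeoNames", RDF_TYPE, "void:Linkset"),
]


def linkset_subsets(dataset, triples):
    """Return the subsets of `dataset` that are typed as void:Linkset."""
    subsets = {o for s, p, o in triples if s == dataset and p == VOID_SUBSET}
    linksets = {
        s for s, p, o in triples if p == RDF_TYPE and o == "void:Linkset"
    }
    return sorted(subsets & linksets)


print(linkset_subsets("ex:myDataset", triples))
```

A client reading only this metadata can tell that the dataset offers
interlinks before sending any query to the shared endpoint.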

The key idea behind void:Dataset is to provide metadata that gives useful
information about the actual data it refers to. In a sense, a void:Dataset
should provide information that helps to understand the usefulness of the
data, to interpret the data, and so on.

>
>    - What is a "conceptualized linguistic resource"? This is not really
>    clear to me.
>
Not sure about the name, but the idea was to refer to any resource like
WordNet: that is, a resource providing lexical concepts that group
semantically close senses of different words.

>
>    - How does a "lexical linkset" differ from a "linkset"? (i.e, do we
>    need this class?)
>
It is a specialization that seemed useful to us for highlighting the
"special nature" of the dataset for which we are providing links.

>
>    - What is the range of lime:class? How does it differ from void:class?
>
The range is rdfs:Class. The difference lies in the domain: the domain of
void:class is void:Dataset, while lime:class has domain ResourceCoverage.

>
>    - Shouldn't there be an object property linking a lexicalization to an
>    ontology?
>
It is lexicalizedDataset. In our parlance, we use the term "dataset" to
embrace both factual knowledge and domain descriptions.

>
>    - 'language' is already in the core OntoLex model, do we need it in
>    lime?
>
We wrote that the unification of this property with the corresponding one
in OntoLex will be a point of discussion.

>
>    - How do you count lexicalizations? i.e., is it the number of
>    Lexicalization instances or the number of lexicalized reference/entry pairs?
>
There is a slight ambiguity with regard to this. A Lexicalization is
really a collection of reference/entry pairs, which are individually
referred to as lexicalizations (uncapitalized initial).

If this ambiguity is unacceptable, we could consider alternative names for
the Lexicalization class: perhaps LexicalMapping, LexicoSemanticMapping, or
any other sensible name.
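
The two readings of "count" give different numbers, as the following rough
sketch shows. It models each Lexicalization as a set of (reference, entry)
pairs; all ex:* names are invented for the example, and the two function names
are only illustrative.

```python
# Hypothetical data: each Lexicalization (capitalized) is a collection of
# (reference, lexical entry) pairs, i.e. of individual lexicalizations.
lexicalizations = {
    "ex:englishLexicalization": {
        ("ex:Cat", "ex:entry_cat"),
        ("ex:Cat", "ex:entry_feline"),
        ("ex:Dog", "ex:entry_dog"),
    },
    "ex:italianLexicalization": {
        ("ex:Cat", "ex:entry_gatto"),
        ("ex:Dog", "ex:entry_cane"),
    },
}


def count_instances(lexs):
    """Reading 1: number of Lexicalization instances (the collections)."""
    return len(lexs)


def count_pairs(lexs):
    """Reading 2: number of reference/entry pairs over all collections."""
    return sum(len(pairs) for pairs in lexs.values())


print(count_instances(lexicalizations), count_pairs(lexicalizations))
```

With this data the first reading yields 2 and the second yields 5, so any
statistic should state which of the two it reports.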

>
>    - What are the domains of the properties lexicalEntries, senses,
>    references, etc.?
>
In the OWL file you should have the following information:

   - lexicalEntries -> Lexicalization or ResourceCoverage or Lexicon
   - senses -> Lexicalization or ResourceCoverage or Lexicon
   - lexicalizations -> Lexicalization or ResourceCoverage
   - references -> Lexicalization or ResourceCoverage
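
As an illustration, the list above can be written down as a small lookup
table. This is only a sketch in plain Python: each domain is treated as a
union of the listed classes, class and property names are copied from the
list without prefixes, and the helper name `properties_for` is hypothetical.

```python
# Domains as listed above; each value is the union ("or") of the classes.
DOMAINS = {
    "lexicalEntries": {"Lexicalization", "ResourceCoverage", "Lexicon"},
    "senses": {"Lexicalization", "ResourceCoverage", "Lexicon"},
    "lexicalizations": {"Lexicalization", "ResourceCoverage"},
    "references": {"Lexicalization", "ResourceCoverage"},
}


def properties_for(cls):
    """Return the counting properties whose domain includes class `cls`."""
    return sorted(prop for prop, domain in DOMAINS.items() if cls in domain)


print(properties_for("Lexicon"))
print(properties_for("Lexicalization"))
```

Under this reading, a Lexicon can carry lexicalEntries and senses, but not
lexicalizations or references, which only make sense where references are
involved.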


>    - Shouldn't we also count LexicalConcepts and Forms?
>
As I wrote in the previous email, we are open to suggestions about
additional statistics.

-- 
Manuel Fiorelli
PhD student in Computer and Automation Engineering
Dept. of Civil Engineering and Computer Science
University of Rome "Tor Vergata"
Via del Politecnico 1
00133 Roma, Italy

tel: +39-06-7259-7334
skype: fiorelli.m
Received on Friday, 6 June 2014 12:45:19 UTC