- From: Manuel Fiorelli <fiorelli@info.uniroma2.it>
- Date: Fri, 6 Jun 2014 14:44:51 +0200
- To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
- Cc: public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAGDmdGh1V2n3PFqpLWOiyi0fCsZou5x=iQ8iMJyCLnaCq_p6oA@mail.gmail.com>
Hi John,

see my answers below.

2014-06-06 13:49 GMT+02:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:

> Hi Manuel, Armando, all,
>
> Some comments on the lime.owl file
>
> - Should Lexicon, Lexicalization and co. really be subclasses of
>   void:Dataset? A void:Dataset is defined as a "set of RDF triples that are
>   published, maintained or aggregated by a single provider". Thus, it seems
>   that many lexica and lexicalizations can be in the same dataset and
>   conversely it is very hard to define which triples are in a lexicon (for
>   example interlingual links are shared between two lexica). It would make
>   more sense to me to have lexica, lexicalizations, etc., as part of a
>   dataset, but not as datasets themselves.

We have in principle three distinct datasets: the dataset being lexicalized, the lexicon providing the vocabulary, and the lexicalization which relates them. These distinctions do not entail that the datasets are really disjoint. In fact, the VoID vocabulary introduces the subset relation, which relates a dataset to its parts. For instance, we could have a dataset with different subsets, corresponding to different linksets with other datasets. In this scenario, the dataset and its linksets may share the same SPARQL endpoint. However, knowing in advance that there exist some subsets that provide interlinks may be useful: that is exactly why we use the LOD cloud diagram.

The key idea behind void:Dataset is to provide metadata that give useful information about the actual data they refer to. In a sense, a void:Dataset should provide information that helps to understand the usefulness of the data, to interpret the data, and so on.

> - What is a "conceptualized linguistic resource"? This is not really
>   clear to me.

Not sure about the name, but the idea was to refer to any resource like WordNet: that is, a resource providing lexical concepts that group semantically close senses of different words.

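As a minimal sketch of the three-dataset picture from the first answer above, the Python snippet below writes the datasets as plain (subject, predicate, object) tuples and walks the void:subset relation. The class and property names (lime:Lexicon, lime:Lexicalization, lime:lexicalizedDataset, void:subset, void:Linkset, void:sparqlEndpoint) come from this thread and from VoID; the `lime:` prefix, every `ex:` resource, and the helper `subsets_of` are hypothetical and only serve the illustration. Using void:subset on a lexicon or lexicalization presupposes that they are datasets, which is exactly the point under discussion.

```python
# Triples as plain (subject, predicate, object) tuples.
# All ex: resources are hypothetical names invented for this sketch.
RDF_TYPE = "rdf:type"
VOID_SUBSET = "void:subset"

triples = [
    # One published dataset exposing a single SPARQL endpoint.
    ("ex:published", RDF_TYPE, "void:Dataset"),
    ("ex:published", "void:sparqlEndpoint", "ex:sparql"),
    # The three datasets distinguished in the answer, declared as parts of it.
    ("ex:published", VOID_SUBSET, "ex:ontology"),
    ("ex:published", VOID_SUBSET, "ex:lexicon"),
    ("ex:published", VOID_SUBSET, "ex:lexicalization"),
    ("ex:ontology", RDF_TYPE, "void:Dataset"),
    ("ex:lexicon", RDF_TYPE, "lime:Lexicon"),
    ("ex:lexicalization", RDF_TYPE, "lime:Lexicalization"),
    ("ex:lexicalization", "lime:lexicalizedDataset", "ex:ontology"),
    # A linkset nested inside the lexicalized dataset.
    ("ex:ontology", VOID_SUBSET, "ex:links"),
    ("ex:links", RDF_TYPE, "void:Linkset"),
]


def subsets_of(dataset, triples):
    """Return every direct or indirect void:subset part of `dataset`."""
    found = []
    stack = [dataset]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == VOID_SUBSET and o not in found:
                found.append(o)
                stack.append(o)
    return found


if __name__ == "__main__":
    for part in sorted(subsets_of("ex:published", triples)):
        print(part)
```

A consumer that reads only this metadata can tell which parts sit behind the shared endpoint, including the nested linkset, without querying the data itself.
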
> - How does a "lexical linkset" differ from a "linkset"? (i.e., do we
>   need this class?)

It is a specialization that seemed useful to us, to highlight the "special nature" of the dataset for which we are providing links.

> - What is the range of lime:class? How does it differ from void:class?

The range is rdfs:Class. The difference lies in the domain: the domain of void:class is void:Dataset, while lime:class has domain ResourceCoverage.

> - Shouldn't there be an object property linking a lexicalization to an
>   ontology?

It is lexicalizedDataset. In our parlance, we use "dataset" to cover both factual knowledge and domain descriptions.

> - 'language' is already in the core OntoLex model, do we need it in
>   lime?

We wrote that the unification of this property with the corresponding one in OntoLex will be a point of discussion.

> - How do you count lexicalizations? i.e., is it the number of
>   Lexicalization instances or the number of lexicalized reference/entry pairs?

There is a slight ambiguity here. A Lexicalization is really a collection of reference/entry pairs, which are individually referred to as lexicalizations (uncapitalized initial). If this ambiguity is unacceptable, we could consider alternative names for the Lexicalization class: perhaps LexicalMapping, LexicoSemanticMapping, or some other sensible name.

> - What are the domains of the properties lexicalEntries, senses,
>   references, etc.?

In the owl file you should have the following information:

- lexicalEntries -> Lexicalization or ResourceCoverage or Lexicon
- senses -> Lexicalization or ResourceCoverage or Lexicon
- lexicalizations -> Lexicalization or ResourceCoverage
- references -> Lexicalization or ResourceCoverage

> - Shouldn't we also count LexicalConcepts and Forms?

As I wrote in the previous email, we are open to suggestions about additional statistics.

--
Manuel Fiorelli
PhD student in Computer and Automation Engineering
Dept. of Civil Engineering and Computer Science
University of Rome "Tor Vergata"
Via del Politecnico 1
00133 Roma, Italy
tel: +39-06-7259-7334
skype: fiorelli.m
Received on Friday, 6 June 2014 12:45:19 UTC