- From: Manuel Fiorelli <fiorelli@info.uniroma2.it>
- Date: Thu, 5 Jun 2014 21:41:21 +0200
- To: Armando Stellato <stellato@info.uniroma2.it>
- Cc: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, "public-ontolex@w3.org" <public-ontolex@w3.org>
- Message-ID: <CAGDmdGgOb4ViDcQ4Q+Ovf892mJDeCYQ2X1P-7s2EybQaYsC1-Q@mail.gmail.com>
Hello Armando,

the attached OWL model should be correct with respect to the question you raised.

2014-06-05 20:39 GMT+02:00 Armando Stellato <stellato@info.uniroma2.it>:

> Just a small errata corrige:
>
> LexicalizedLinkSet extends void:Linkset (though yes, that in turn extends
> void:Dataset as well).
>
> Thanks a lot Manuel,
>
> Armando
>
> *From:* Manuel Fiorelli [mailto:manuel.fiorelli@gmail.com]
> *Sent:* Thursday, June 5, 2014 7:12 PM
> *To:* Philipp Cimiano
> *Cc:* Armando Stellato; public-ontolex@w3.org
> *Subject:* Re: lexicalization count
>
> Dear Philipp,
>
> I attached to this email an initial OWL model representing the LIME
> metadata vocabulary.
>
> Let me summarize the model.
>
> The central entity is now the lime:Lexicalization, which:
>
> - provides lexicalizations for an RDF dataset (i.e., a collection of
>   linguistic attachments);
> - in one natural language;
> - using one or more linguistic models (as long as they are used to
>   express the SAME information, up to the expressive power of each model);
> - possibly referencing a given OntoLex Lexicon.
>
> In the proposed model, the properties from a lexicalization to the target
> dataset and the lexicon are functional.
>
> Since in some usage scenarios we start with an ontology and we want to
> discover lexicalizations for it, we also provide a property that connects
> any void:Dataset to known lexicalizations.
>
> Each lexicalization refers to various ResourceCoverage(s), which provide
> statistics for different types of resources found in the target dataset.
>
> Currently, we have not committed to a specific set of statistics,
> therefore I just introduced the ones already mentioned by Philipp. With
> respect to his proposal, I slightly changed the domain of some of the
> properties. For instance, I do not believe that references (the count of
> distinct references) should be applicable to a Lexicon.
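To make the functional-property constraint in the summary above more concrete, here is a minimal Python sketch. It is not part of the attached model. It assumes triples are plain (subject, property, object) tuples, and the property names lime:lexicalizedDataset and lime:lexicon are hypothetical placeholders, since the summary does not name the two functional properties. Note also that under OWL's open-world semantics a functional property with two values entails that the values denote the same individual rather than raising an error, so the check below is only a closed-world sanity check on the metadata.

```python
from collections import defaultdict

# Hypothetical property names: the email does not name the two
# functional properties, so these are placeholders for illustration.
TARGET = "lime:lexicalizedDataset"
LEXICON = "lime:lexicon"
FUNCTIONAL = {TARGET, LEXICON}


def functional_violations(triples):
    """Return {(subject, property): sorted values} for every functional
    property that carries more than one distinct value on a subject."""
    values = defaultdict(set)
    for subject, prop, obj in triples:
        if prop in FUNCTIONAL:
            values[(subject, prop)].add(obj)
    return {key: sorted(objs) for key, objs in values.items() if len(objs) > 1}


# Toy metadata: ex:enLex has one target dataset but two lexicons.
triples = [
    ("ex:enLex", "rdf:type", "lime:Lexicalization"),
    ("ex:enLex", TARGET, "ex:ontology"),
    ("ex:enLex", LEXICON, "ex:lexiconA"),
    ("ex:enLex", LEXICON, "ex:lexiconB"),
]

violations = functional_violations(triples)
for (subject, prop), objs in violations.items():
    print(f"{subject} has {len(objs)} values for {prop}: {objs}")
```

In this toy input only the lexicon property is reported, because the lexicalization points to a single target dataset.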
>
> As already said by Armando, we have a distinct class (lime:LexicalLinkset)
> for expressing the association between a dataset and a conceptualized
> linguistic resource (e.g., WordNet). In fact, this association is close to
> a mapping relation, thus we decided to introduce a distinct class
> LexicalLinkset that extends the standard class void:Dataset. However, we do
> believe that preserving the distinction may be useful.
>
> Although we have removed our categorization of linguistic resources, I
> reintroduced the class ConceptualizedLinguisticResource, which should be
> used in conjunction with LexicalLinkset.
>
> In the proposed model, I recreated some classes, such as Lexicon, that
> already exist in the core OntoLex model. We should decide whether they
> are the same class or not.
>
> Another interesting point of discussion is our choice of providing two
> properties:
>
> - lang, which indicates the natural language a given lexicalization
>   refers to
> - language, which is a shortcut to allow a dataset saying: I know
>   there is a lexicalization for me in this natural language.
>
> We should discuss whether these two properties are required, and if so,
> which of them should be unified with ontolex:language.
>
> 2014-06-04 19:37 GMT+02:00 Armando Stellato <stellato@info.uniroma2.it>:
>
> Dear Philipp,
>
> sorry for not catching up earlier with this email. I just came back from
> LREC and departed immediately for another conference in Taiwan, which I’m
> still attending. Writing “nightwise”…
>
> Ok, so, as a first thing, I had a long call with Manuel just now. He will
> send in the next days something that you can publish on GIT. Obviously,
> everything is under discussion, but it is just a starting point to have it
> open and accessible on GIT.
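As a rough illustration of the lang / language pair discussed in Manuel's summary above, the following Python sketch derives the dataset-level language shortcut from the lang values of the lexicalizations a dataset points to. It is only a sketch under assumptions: triples are plain tuples, and lime:lexicalization is an assumed name for the property that connects a void:Dataset to its known lexicalizations (the summary mentions such a property without naming it).

```python
# Assumed name for the dataset -> Lexicalization property (not given in the email).
LEXICALIZATION = "lime:lexicalization"
# Property names taken from the discussion: lang on a lexicalization,
# language as the shortcut on the dataset.
LANG = "lime:lang"
LANGUAGE = "lime:language"


def derive_language_shortcuts(triples):
    """Return the set of (dataset, lime:language, tag) triples implied by
    the lexicalizations each dataset points to."""
    lang_of = {}
    for subject, prop, obj in triples:
        if prop == LANG:
            lang_of.setdefault(subject, set()).add(obj)

    derived = set()
    for subject, prop, obj in triples:
        if prop == LEXICALIZATION:
            for tag in lang_of.get(obj, ()):
                derived.add((subject, LANGUAGE, tag))
    return derived


# Toy metadata: one ontology with an English and an Italian lexicalization.
triples = [
    ("ex:ontology", LEXICALIZATION, "ex:enLex"),
    ("ex:ontology", LEXICALIZATION, "ex:itLex"),
    ("ex:enLex", LANG, "en"),
    ("ex:itLex", LANG, "it"),
]

derived = derive_language_shortcuts(triples)
for triple in sorted(derived):
    print(triple)
```

If the shortcut can always be derived this way, that may be one argument in the discussion of whether both properties are required.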
>
> I anticipate here some replies to your email:
>
> // counting properties (datatype properties, with domain (ontolex:Lexicon
> OR ontolex:Lexicalization OR void:Dataset OR lime:LanguageCoverage))
>
> lime:numberOfLexicalEntries
> lime:numberOfSenses
> lime:numberOfLexicalizations (denote-triples)
> lime:numberOfReferences -> the number of distinct references used
>
> We then need to discuss whether we should also include ratios etc.
>
> As said before, we would prefer to use simple names (in the spirit of
> analogous properties in VoID), such as lexicalEntries, senses,
> lexicalizations, references. Small note: I am not so sure whether to keep
> the ambiguity between “Lexicalization” (as a dataset of lexicalizations)
> and “lexicalization” as an attachment. It then creates ambiguities like the
> property “lexicalizations” (as number of attachments) and “lexicalization”
> as pointer to a Lexicalization.
>
> But, for the moment, let’s stick with them.
>
> Then:
>
> lime:language (unified with ontolex:language, extended here to domain
> lime:LanguageCoverage)
>
> Not sure I got the above exactly. Btw, we will present two different
> props, and then check what can be unified.
>
> lime:linguisticModel: describing by which model/vocabulary information
> about lexicalization is attached; the domain is void:Dataset and the range
> is the URI of the vocabulary; lime:linguisticModel is thus a subproperty of
> void:vocabulary
>
> Fine. One note here: we saw now that in the PDF about LIME we sent before,
> there is one thing that we resolved in one chapter, and left obsolete in
> another.
>
> Wrt our LIME paper, there is no more languageCoverage (and thus no need
> to change it to lexicalCoverage, as we wrongly left said at the end
> of page 2) as it has been replaced by Lexicalization. Inside a
> Lexicalization, we may specify different ResourceCoverage, that is, various
> “cuts” of coverage for different ontology types (e.g.
> the coverage for classes, or for properties, or for skos:Concepts).
>
> This also simplifies the terminology (though, as said before,
> Lexicalization clashes with the name of its own contained attachments).
>
> One more point: we left open the problem of addressing links to
> LexicalConcepts of conceptualized lexical resources (e.g. WordNet).
>
> We just resolved it in a decently elegant way. A lexicalization only deals
> with attachments between an OWL/SKOS dataset/vocabulary (the “onto” part)
> and senses or lexical entries of a lexicon.
>
> Attachments to LexicalConcepts (the ones we called lexicalResourceCoverage
> in our paper) will be dealt with in a different way (as it is only
> implicitly a lexicalization), though reusing existing stuff from VoID.
>
> We would coin the class LexicalLinkSet as a subclass of void:Linkset, and
> it would be used to express the links above.
>
> Note that several linguisticModels can co-exist in principle in a
> dataset...
>
> Sure. More precisely, an “onto” Dataset may specify several (known)
> Lexicalizations. Each lexicalization refers to only one language. One
> (onto) dataset may have more than one lexicalization per language
> (obviously); this may be due to different models being available, or simply
> to different lexicons being available and linked to the same (onto) dataset.
>
> We were thinking (for more compactness) of allowing the specification of
> several linguistic models for the same lexicalization, whenever *exactly*
> the same lexical content is available (in the same lexicalization). For
> instance, if SKOS-XL and materialized SKOS labels and RDFS labels are
> available inside the same physical dataset representing a lexicalization,
> then it is possible to specify them as alternative models inside the same
> Lexicalization instance.
>
> lime:type: providing a type for the resource in question, e.g.
> bilingual lexicon, lexicon, ..., domain is void:Dataset and range is not
> specified
>
> eheh, ok, you know our point of view, so better we leave you and John
> discussing what is on-topic or off-topic inside OntoLex; then, only in the
> first case, we can give our contribution ;)
>
> lime:languageCoverage with domain void:Dataset and range
> lime:LanguageCoverage.
>
> lime:LanguageCoverage has a language, a linguistic model and all the
> counting properties above are defined for it.
>
> ok, replaced by Lexicalization, see above (and also all pages of the PDF,
> except page 2).
>
> Think that’s all. Manuel will follow with a specification via email, so
> that you can put it on GIT.
>
> Sorry, I will be unable to participate on (still in conference).
>
> Cheers,
>
> Armando (and Manuel from call ;) )
>
> --
> Manuel Fiorelli

--
Manuel Fiorelli
PhD student in Computer and Automation Engineering
Dept. of Civil Engineering and Computer Science
University of Rome "Tor Vergata"
Via del Politecnico 1
00133 Roma, Italy
tel: +39-06-7259-7334
skype: fiorelli.m
Received on Thursday, 5 June 2014 19:41:50 UTC