- From: Manuel Fiorelli <fiorelli@info.uniroma2.it>
- Date: Thu, 5 Jun 2014 19:24:19 +0200
- To: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Cc: Armando Stellato <stellato@info.uniroma2.it>, "public-ontolex@w3.org" <public-ontolex@w3.org>
- Message-ID: <CAGDmdGh3Act-yL15O-i_kBZYyxO5tWhB57gG5-GfUTJ6h81R4A@mail.gmail.com>
Oops... sorry again. I attached the OWL model.

2014-06-05 19:23 GMT+02:00 Manuel Fiorelli <fiorelli@info.uniroma2.it>:

> Dear list,
>
> sorry for double posting. However, I sent the original email from my Gmail
> account, and the message could be delayed for two days, since it was the
> first time I used that account.
>
> 2014-06-05 19:11 GMT+02:00 Manuel Fiorelli <manuel.fiorelli@gmail.com>:
>
>> Dear Philipp,
>>
>> I attached to this email an initial OWL model representing the LIME
>> metadata vocabulary.
>>
>> Let me summarize the model.
>>
>> The central entity is now the lime:Lexicalization, which:
>>
>> - provides lexicalizations for an RDF dataset (i.e., a collection of
>>   linguistic attachments);
>> - in one natural language;
>> - using one or more linguistic models (as long as they are used to
>>   express the SAME information, up to the expressive power of each model);
>> - possibly referencing a given OntoLex Lexicon.
>>
>> In the proposed model, the properties from a lexicalization to the target
>> dataset and the lexicon are functional.
>>
>> Since in some usage scenarios we start with an ontology and we want to
>> discover lexicalizations for it, we also provide a property that connects
>> any void:Dataset to known lexicalizations.
>>
>> Each lexicalization refers to various ResourceCoverage(s), which provide
>> statistics for different types of resources found in the target dataset.
>>
>> Currently, we have not committed to a specific set of statistics,
>> so I just introduced the ones already mentioned by Philipp. With
>> respect to his proposal, I slightly changed the domain of some of the
>> properties. For instance, I do not believe that references (the count of
>> distinct references) should be applicable to a Lexicon.
>> As already said by Armando, we have a distinct class
>> (lime:LexicalLinkset) for expressing the association between a dataset and
>> a conceptualized linguistic resource (e.g., WordNet).
>> In fact, this association is close to a mapping relation, thus we decided
>> to introduce a distinct class LexicalLinkset that extends the standard
>> class void:Dataset. However, we do believe that preserving the distinction
>> may be useful.
>>
>> Although we have removed our categorization of linguistic resources, I
>> reintroduced the class ConceptualizedLinguisticResource, which should be
>> used in conjunction with LexicalLinkset.
>>
>> In the proposed model, I recreated some classes, such as Lexicon, that
>> already exist in the core OntoLex model. We should decide whether they
>> are the same class or not.
>>
>> Another interesting point of discussion is our choice of providing two
>> properties:
>>
>> - lang, which indicates the natural language a given lexicalization
>>   refers to;
>> - language, which is a shortcut that lets a dataset say: I know
>>   there is a lexicalization for me in this natural language.
>>
>> We should discuss whether both properties are required and, if so,
>> which of them should be unified with ontolex:language.
>>
>> 2014-06-04 19:37 GMT+02:00 Armando Stellato <stellato@info.uniroma2.it>:
>>
>>> Dear Philipp,
>>>
>>> sorry for not catching up earlier with this email. Just came back from
>>> LREC and departed immediately for another conf in Taiwan which I’m still
>>> attending. Writing “nightwise”…
>>>
>>> Ok, so, as a first thing, I had a long call with Manuel just now. He
>>> will send in the next days something that you can publish on Git.
>>> Obviously, everything is under discussion, but it is just a starting
>>> point to have it open and accessible on Git.
>>>
>>> I anticipate here some replies to your email:
>>>
>>> // counting properties (datatype properties, with domain
>>> (ontolex:Lexicon OR ontolex:Lexicalization OR void:Dataset OR
>>> lime:LanguageCoverage))
>>>
>>> lime:numberOfLexicalEntries
>>> lime:numberOfSenses
>>> lime:numberOfLexicalizations (denote-triples)
>>> lime:numberOfReferences -> the number of distinct references used
>>>
>>> We then need to discuss whether we should also include ratios etc.
>>>
>>> As said before, we would prefer to use simple names (in the spirit of
>>> analogous properties in VoID), such as lexicalEntries, senses,
>>> lexicalizations, references. Small note: not so sure whether to keep the
>>> ambiguity between “Lexicalization” (as a dataset of lexicalizations) and
>>> “lexicalization” as an attachment. It then creates ambiguities like the
>>> property “lexicalizations” (as number of attachments) and “lexicalization”
>>> as pointer to a Lexicalization.
>>>
>>> But, for the moment, let’s stick with them.
>>>
>>> Then:
>>>
>>> lime:language (unified with ontolex:language, extended here to domain
>>> lime:LanguageCoverage)
>>>
>>> Not sure I got the above exactly. Btw, we will present two different
>>> props, and then check what can be unified.
>>>
>>> lime:linguisticModel: describing by which model/vocabulary information
>>> about lexicalization is attached; the domain is void:Dataset and the range
>>> is the URI of the vocabulary; lime:linguisticModel is thus a subproperty of
>>> void:vocabulary
>>>
>>> Fine. One note here: we saw now that in the PDF about LIME we sent
>>> before, there is one thing that we resolved in one chapter, and left
>>> obsolete in one other.
>>>
>>> Wrt our LIME paper, there is no more languageCoverage (and thus not even
>>> a need to change it to lexicalCoverage, as we wrongly left said at the end
>>> of page 2) as it has been replaced by Lexicalization.
>>> Inside a Lexicalization, we may specify different ResourceCoverage, that
>>> is, various “cuts” of coverage for different ontology types (e.g. the
>>> coverage for classes, or for properties, or for skos:Concepts).
>>>
>>> This also simplifies the terminology (though, as said before,
>>> Lexicalization clashes with the name of its own contained attachments).
>>>
>>> One more point: we left open the problem of addressing links to
>>> LexicalConcepts of conceptualized lexical resources (e.g. WordNet).
>>>
>>> We just resolved it in a decently elegant way. A lexicalization only
>>> deals with attachments between an OWL/SKOS dataset/vocabulary (the “onto”
>>> part) and senses or lexical entries of a lexicon.
>>>
>>> Attachments to LexicalConcepts (the ones we called
>>> lexicalResourceCoverage in our paper) will be dealt with in a different
>>> way (as it is only implicitly a lexicalization), though reusing existing
>>> stuff from VoID.
>>>
>>> We would coin the class LexicalLinkSet as a subclass of void:Linkset,
>>> and it would be used to express the links above.
>>>
>>> Note that several linguisticModels can co-exist in principle in a
>>> dataset...
>>>
>>> Sure. More precisely, an “onto” Dataset may specify several (known)
>>> Lexicalizations. Each lexicalization refers to only one language. One
>>> (onto) dataset may have more than one lexicalization per language
>>> (obviously); this may be due to different models being available, or simply
>>> to different lexicons being available and linked to the same (onto) dataset.
>>>
>>> We were thinking (for more compactness) of allowing the specification
>>> of several linguistic models for the same lexicalization, whenever *exactly*
>>> the same lexical content is available (in the same lexicalization).
>>> For instance, if SKOS-XL and materialized SKOS labels and RDFS labels are
>>> available inside the same physical dataset representing a lexicalization,
>>> then it is possible to specify them as alternative models inside the same
>>> Lexicalization instance.
>>>
>>> lime:type: providing a type for the resource in question, e.g. bilingual
>>> lexicon, lexicon, ..., domain is void:Dataset and range is not specified
>>>
>>> eheh, ok, you know our point of view, so better we leave you and John
>>> discussing what is on-topic or off-topic inside OntoLex, then only in the
>>> first case, we can give our contribution ;)
>>>
>>> lime:languageCoverage with domain void:Dataset and range
>>> lime:LanguageCoverage.
>>>
>>> lime:LanguageCoverage has a language, a linguistic Model and all the
>>> counting properties above are defined for it.
>>>
>>> ok, replaced by Lexicalization, see above (and also all pages of PDF,
>>> except page 2).
>>>
>>> Think that’s all. Manuel will follow with a specification via email, so
>>> that you can put it on Git.
>>>
>>> Sorry, I will be unable to participate on (still in conference).
>>>
>>> Cheers,
>>>
>>> Armando (and Manuel from call ;) )
>>
>> --
>> Manuel Fiorelli
>
> --
> Manuel Fiorelli
> PhD student in Computer and Automation Engineering
> Dept. of Civil Engineering and Computer Science
> University of Rome "Tor Vergata"
> Via del Politecnico 1
> 00133 Roma, Italy
>
> tel: +39-06-7259-7334
> skype: fiorelli.m

--
Manuel Fiorelli
PhD student in Computer and Automation Engineering
Dept. of Civil Engineering and Computer Science
University of Rome "Tor Vergata"
Via del Politecnico 1
00133 Roma, Italy

tel: +39-06-7259-7334
skype: fiorelli.m
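
The model summarized in the message above can be sketched in plain Python to make the cardinalities concrete. This is only an illustration under stated assumptions, not the content of the attached lime.owl: the class and field names are Python stand-ins for lime:Lexicalization and ResourceCoverage, the example.org URIs and all counts are hypothetical, and the "one or more linguistic models" rule is enforced by a runtime check rather than by OWL axioms.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResourceCoverage:
    """Statistics for one type of resource found in the target dataset."""
    resource_type: str    # e.g. "owl:Class" or "skos:Concept"
    lexicalizations: int  # number of attachments (hypothetical value)
    references: int       # number of distinct references (hypothetical value)


@dataclass
class Lexicalization:
    """Stand-in for lime:Lexicalization as described in the message."""
    dataset: str                   # single value: the property is functional
    lang: str                      # exactly one natural language
    linguistic_models: List[str]   # one or more models, same information
    lexicon: Optional[str] = None  # single optional value: also functional
    coverages: List[ResourceCoverage] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.linguistic_models:
            raise ValueError("at least one linguistic model is required")


def index_by_dataset(items: List[Lexicalization]) -> Dict[str, List[Lexicalization]]:
    """Build the dataset -> known lexicalizations lookup."""
    index: Dict[str, List[Lexicalization]] = {}
    for lex in items:
        index.setdefault(lex.dataset, []).append(lex)
    return index


def languages_for(dataset: str, index: Dict[str, List[Lexicalization]]) -> List[str]:
    """Languages for which the dataset knows a lexicalization (the shortcut)."""
    return sorted({lex.lang for lex in index.get(dataset, [])})


# Hypothetical data: one ontology with an English and an Italian lexicalization.
known = [
    Lexicalization(
        dataset="http://example.org/onto",
        lang="en",
        linguistic_models=["skosxl", "skos", "rdfs"],
        coverages=[ResourceCoverage("owl:Class", lexicalizations=120, references=80)],
    ),
    Lexicalization(
        dataset="http://example.org/onto",
        lang="it",
        linguistic_models=["ontolex"],
        lexicon="http://example.org/lexicon-it",
    ),
]
index = index_by_dataset(known)
print(languages_for("http://example.org/onto", index))
```

Keeping `dataset` and `lexicon` as single-valued fields mirrors the functional properties, while the index plays the role of the property that lets a void:Dataset point to its known lexicalizations.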
Attachments
- application/rdf+xml attachment: lime.owl
Received on Thursday, 5 June 2014 17:24:47 UTC