- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Wed, 15 Jul 2015 08:27:42 +0200
- To: public-ontolex@w3.org
- Message-ID: <55A5FD5E.8050305@cit-ec.uni-bielefeld.de>
Hi Manuel,

thanks, see below ...

On 13.07.15 at 18:22, Manuel Fiorelli wrote:
> Dear Philipp, All
>
> Following our discussion on the LIME module during the last telco,
> here are some updates on the specification:
>
> https://www.w3.org/community/ontolex/wiki/index.php?title=Final_Model_Specification&diff=2289&oldid=2250
>
> The spec has been modified to address some of the issues I have raised
> in previous emails (see details below within the quoted text).
>
> The diagram on Draw.io has been modified, considering the current
> state of the Lime metadata vocabulary. Further modifications could be
> required once you have decided what to do with the properties to be
> renamed or split.
>
> Some examples were added to the end of the metadata module, but we
> will revise them in the next days. We modified some definitions, but
> others have not been modified because of the possibility they could be
> split or renamed. Specifically, here are some definitions (or axioms)
> to be modified:
>
> *lime:lexicalEntries*
>
> - The domain of this property should be Lexicon or LexicalizationSet
> or Conceptualization and the definition should be changed accordingly,
> unless we want to split this property into two or more properties.

I changed the property definition to also include ConceptualizationSet as domain. You mean ConceptualizationSet, right?

> *lime:referenceDataset*
>
> - the definition should be reviewed

For me the definition is fine, what exactly should be reviewed?

> *lime:lexicalizationModel*
>
> - the domain should not include ontolex:Lexicon (this could be a
> leftover from the introduction of lime:linguisticModel)

OK, fixed...
> *lime:references*
>
> - Not sure if this will be split or renamed

See my other email on this. I propose that, for the sake of clarity and to avoid overloading, we keep this property as denoting the number of distinct ?o in triples (?s,reference,?o).

> *lime:percentages*
>
> - in the definition, we should add a mention of lexical linksets

I changed this as follows:

The '''percentage''' property expresses the percentage of entities in the reference dataset which have at least one lexicalization in a lexicalization set or are linked to a lexical concept in a lexical linkset.

Fine?

> *lime:partition*
>
> - the definition of partition is wrong, as it only refers to
> lexicalization sets
>
> *lime:resourceType*
>
> - as before, it only mentions lexicalization sets

OK, thanks. I changed the definitions. Are they fine now?

> *lime:concepts*
>
> the introduction to the definition of lime:concepts first mentions
> its use in a concept set, although we are in the section about
> lexicalLinkset

OK, I introduced a pointer to the definition of ConceptSet in ontolex. Fine?

> *lime:avgNumOfLinks*
>
> - the definition is wrong. This property should give the average
> number of links per ontology entity

I changed the definition to:

The '''average number of links''' property indicates the average number of links to a concept for each ontology element in the reference dataset.

> 2015-07-10 15:27 GMT+02:00 Manuel Fiorelli <manuel.fiorelli@gmail.com
> <mailto:manuel.fiorelli@gmail.com>>:
>
> *Section "lexicon metadata"*
>
> Just before the definition box of /linguistic model/:
>
> "We may also specify the linguistic (annotation model) used in a
> lexicon with the linguistic model property"
>
> I think that the word "model" should go outside the parenthesis.
> Additionally, I would make it clearer that we are talking about
> things such as part of speech, number, gender, and so...
> maybe also by pointing to the section of the specification where we
> wrote explicitly that.
>
> DONE
>
> *Section "Lexicalization Set"*
>
> "In RDF, a lexicalization is expressed via the property rdfs:label."
>
> It should be "In RDFS" (note I added an S).
>
> DONE
>
> *Section "Partitions"*
>
> "many cases, we want to provide descriptive metadata about a
> subset of a lexicallization"
>
> it should be "of a lexicalization set"
>
> Still TODO. Actually, the paragraph and the definition of the property
> should be extended as well to incorporate both lexical linkset and
> lexicalization sets.
>
> *Section "Publication Strategies"*
>
> "For example, this allows lexicalizing lexical concepts from an
> existing wordnet in a different natural language than the one for
> which the resource was initially conceived"
>
> I am unsure that it is appropriate to use the word "lexicalizing"
> in association with lexical concepts, because we insisted that the
> nature of a "conceptualization set" is different from that of a
> "lexicalization set"
>
> DONE during the telco.
>
> 2015-07-07 15:55 GMT+02:00 Manuel Fiorelli
> <manuel.fiorelli@gmail.com <mailto:manuel.fiorelli@gmail.com>>:
>
> Dear Philipp, All
>
> here are my preliminary comments. Most of them are minor
> typos, while others may seed further discussion.
>
> -----
>
> In the introduction to example 1, the spec says:
>
> "As an example we may describe a simple lexicon using this
> property as well as properties from Dublin Core and VoID: "
>
> The example then contains also the actual lexical entries that
> constitute the lexicon. This is good for what concerns the
> self-explanatory nature of the example. However, we should
> make clear that in general the metadata only deals with the
> description of the lexicon as a whole, while the
> representation of its actual content is in the scope of other
> modules.
> This is particularly relevant to "lexicon catalogs",
> which may only be interested in indexing lexicons without the
> need to also host the actual content.
>
> WON'T FIX. We decided that the example is fine, and it may be the case
> that further examples (concerned only with metadata) should be added
> later in the spec.
>
> -----
>
> In the definition of LexicalizationSet, the classes Lexicon
> and Dataset need, respectively, the prefix ontolex and void.
>
> -----
>
> DONE.
>
> I am not sure about this statement:
>
> "The lexicalization set object should be unique for a given
> lexicon-ontology pair"
>
> Indeed, the statement above implies that there cannot be two
> different lexicalization sets for FOAF using the WordNet RDF
> lexicon. I think that this conclusion is false, so the
> previous statement should be retracted.
>
> TODO. I think that we agreed on removing that sentence, but I leave
> the honor to the editors.
>
> -----
>
> In the definition of lexicalizationModel, the disjunction is
> spelled OR, whereas in other cases it is spelled in lowercase.
>
> -----
>
> TODO. Actually, the misspelling has been corrected, but I think that
> we should remove ontolex:Lexicon entirely, because that property only
> applied to lexicalization sets. Concerning lexicons, we should use the
> related property lime:linguisticModel.
>
> The definition of lime:references does not mention the fact
> that in a lexical linkset an ontology reference can be
> associated with a lexical concept.
>
> TODO. Actually, you mentioned the possibility that properties such as
> references and concepts could be split.
>
> -----
>
> Concerning Example2:
> - we should add the language "ja" to the lexicalizationSet
> resource
> - we may say that the ontology is an instance of
> voaf:Vocabulary, which is a subclass of void:Dataset to
> represent vocabularies (both RDFS Schemas and OWL Ontologies)
>
> DONE the addition of the language to the lexicalization set as well
> as the addition of the lexicalization model.
>
> - I would extend the introduction to the example. This is my
> attempt:
>
> <cite>
> In the following example, we describe a lexicalization set
> expressing how elements of an ontology can be verbalized in
> Japanese by means of entries from a supplied lexicon. The
> metadata clearly tells which ontology and lexicon are involved
> in the lexicalization sets, as well as the relevant natural
> language. The knowledge of these facts about the
> lexicalization set allows us to assess the usefulness of a
> lexicalization set for a given task as well as to discover
> relevant lexicalization sets, when we are constrained by the
> choice of an ontology, lexicon or natural language.
>
> We model the ontology as an instance of the class
> voaf:Vocabulary that is a kind of void:Dataset representing
> vocabularies (both RDFS Schemas and OWL Ontologies). We benefit
> from the more specific distinctions made by VOAF, by breaking
> down the total number of entities in the ontology (held by the
> property void:entities) into separate counts for the classes
> and properties (held by voaf:classNumber and
> voaf:propertyNumber, respectively).
>
> Similarly, we use terms from the Lime vocabulary to represent
> statistics about the linguistic content of the lexicon and the
> lexicalization set. Overall, the ontology defines 80 entities
> and the lexicon 100 lexical entries; however, only 20 entities
> from the target ontologies have been associated with a total
> of 50 lexical entries.
> </cite>
>
> TODO. I prefer that this addition is applied by the editors.
>
> -----
>
> In the definition of avgNumOfLexicalizations, the word
> "define" occurs where it should be "defines".
>
> DONE
>
> -----
>
> I would postpone example 3 to end of the section, and I would
> modify it as follows:
> - reuse the same data as in example 2, and make this clear in
> the introduction to the example
> - then, use the properties lexicalizations,
> avgNumOfLexicalizations and percentage to "analyze" the
> scenario depicted in example 2. For instance, it is now
> possible to tell explicitly that only 25% of the reference
> ontology has been lexicalized.
>
> We can make the example more interesting playing with polysemy
> so that the ratios are not "obvious".
>
> TODO
>
> -----
>
> In the definition of LexicalLinkset, the class dataset needs
> the prefix void.
>
> DONE
>
> -----
>
> I would propose the following example for
> lime:ConceptualizationSet
>
> :WnConceptualizationSet a lime:ConceptualizationSet ;
>     lime:conceptualDataset :WnConceptSet ;
>     lime:lexiconDataset :WnLexicon ;
>     lime:lexicalEntries 155287 ;
>     lime:concepts 117659 ;
>     lime:conceptualizations 206941 ;
>     lime:avgPolisemy 1.33
> .
>
> For the statistics, I referred to this page:
> https://wordnet.princeton.edu/wordnet/man/wnstats.7WN.html
>
> DONE
>
> We should discuss whether and how:
>
> * to represent monosemous words
> * to break down the statistics with respect to different
> part of speech tags
>
> WON'T DO. Actually, we thought that this use case could be supported
> by partitions of the conceptualization sets, but various technical
> difficulties made us desist :-D
>
> Regards,
>
> Manuel Fiorelli

--
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany

Received on Wednesday, 15 July 2015 06:29:37 UTC
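
The two ratios quoted in this thread (25% of the reference ontology lexicalized, and lime:avgPolisemy 1.33 for WordNet) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the percentage is taken to be the share of ontology entities with at least one lexicalization, and lime:avgPolisemy is assumed to be the number of conceptualizations divided by the number of lexical entries. That reading is consistent with the quoted figures, but the message itself does not state the formula.

```python
def coverage_percentage(covered_entities: int, total_entities: int) -> float:
    """Share of reference-dataset entities with at least one lexicalization.

    Hypothetical helper; mirrors the reworded lime:percentage definition.
    """
    if total_entities <= 0:
        raise ValueError("total_entities must be positive")
    return covered_entities / total_entities


def average_per_entry(conceptualizations: int, lexical_entries: int) -> float:
    """Average number of concepts per lexical entry.

    Assumption: this is how lime:avgPolisemy was derived in the example.
    """
    if lexical_entries <= 0:
        raise ValueError("lexical_entries must be positive")
    return conceptualizations / lexical_entries


# Figures from the proposed introduction to Example 2:
# 80 ontology entities, of which 20 are lexicalized.
example2_coverage = coverage_percentage(20, 80)
print(f"Example 2 coverage: {example2_coverage:.0%}")

# Figures from the proposed lime:ConceptualizationSet example.
wordnet_avg = average_per_entry(206941, 155287)
print(f"WordNet average per entry: {wordnet_avg:.2f}")
```

Dividing by the number of concepts (117659) instead would give a different value, so the 1.33 in the example only matches the per-lexical-entry reading.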