- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Wed, 15 Jul 2015 08:06:58 +0200
- To: Manuel Fiorelli <manuel.fiorelli@gmail.com>
- CC: "public-ontolex@w3.org" <public-ontolex@w3.org>
- Message-ID: <55A5F882.6080104@cit-ec.uni-bielefeld.de>
Hi Manuel,

replying to this now; these were to-dos from last Friday.

On 07.07.15 at 15:55, Manuel Fiorelli wrote:
> Dear Philipp, All
>
> here are my preliminary comments. Most of them are minor typos, while
> others may seed further discussion.
>
> -----
>
> In the introduction to example 1, the spec says:
>
> "As an example we may describe a simple lexicon using this property as
> well as properties from Dublin Core and VoID: "
>
> The example then also contains the actual lexical entries that
> constitute the lexicon. This is good as far as the self-explanatory
> nature of the example is concerned. However, we should make clear
> that in general the metadata only deals with the description of the
> lexicon as a whole, while the representation of its actual content is
> in the scope of other modules. This is particularly relevant to
> "lexicon catalogs", which may only be interested in indexing lexicons
> without the need to also host the actual content.

I kept the example as is, but added a sentence making clear that the
metadata describes the lexicon as a whole, as you suggested.

> -----
>
> In the definition of LexicalizationSet, the classes Lexicon and
> Dataset need, respectively, the prefix ontolex and void.

Fixed.

> -----
>
> I am not sure about this statement:
>
> "The lexicalization set object should be unique for a given
> lexicon-ontology pair"
>
> Indeed, the statement above implies that there cannot be two different
> lexicalization sets for FOAF using the WordNet RDF lexicon. I think
> that this conclusion is false, so the previous statement should be
> retracted.

This has been removed.

> -----
>
> In the definition of lexicalizationModel, the disjunction is spelled
> OR, whereas in other cases it is spelled in lowercase.

I guess you have fixed this already, thanks.

> -----
>
> The definition of lime:references does not mention the fact that in a
> lexical linkset an ontology reference can be associated with a lexical
> concept.

In order to avoid overloading, I would prefer to keep "references" as
referring to the number of distinct resources ?o, that is: the number of
different ?o such that (?s,reference,?o).

> -----
>
> Concerning Example2:
> - we should add the language "ja" to the lexicalizationSet resource
> - we may say that the ontology is an instance of voaf:Vocabulary,
>   which is a subclass of void:Dataset to represent vocabularies (both
>   RDFS Schemas and OWL Ontologies)
> - I would extend the introduction to the example. This is my attempt:
>
> <cite>
> In the following example, we describe a lexicalization set expressing
> how elements of an ontology can be verbalized in Japanese by means of
> entries from a supplied lexicon. The metadata clearly tells which
> ontology and lexicon are involved in the lexicalization set, as well
> as the relevant natural language. The knowledge of these facts about
> the lexicalization set allows us to assess the usefulness of a
> lexicalization set for a given task as well as to discover relevant
> lexicalization sets, when we are constrained by the choice of an
> ontology, lexicon or natural language.
>
> We model the ontology as an instance of the class voaf:Vocabulary that
> is a kind of void:Dataset representing vocabularies (both RDFS Schemas
> and OWL Ontologies). We benefit from the more specific distinctions
> made by VOAF, by breaking down the total number of entities in the
> ontology (held by the property void:entities) into separate counts for
> the classes and properties (held by voaf:classNumber and
> voaf:propertyNumber, respectively).
>
> Similarly, we use terms from the Lime vocabulary to represent
> statistics about the linguistic content of the lexicon and the
> lexicalization set. Overall, the ontology defines 80 entities and the
> lexicon 100 lexical entries; however, only 20 entities from the target
> ontology have been associated with a total of 50 lexical entries.
> </cite>
>
> -----

Great, I have added your text to the example.
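
The figures in the proposed example text can be checked with a minimal
Python sketch. It only uses the two counts quoted above (80 ontology
entities, 20 of them lexicalized); the function name `coverage` and the
variable names are made up for illustration and are not part of the Lime
vocabulary.

```python
def coverage(lexicalized_entities: int, ontology_entities: int) -> float:
    """Fraction of ontology entities that have at least one lexicalization."""
    if ontology_entities <= 0:
        raise ValueError("ontology_entities must be positive")
    return lexicalized_entities / ontology_entities


# Counts taken from the quoted example text.
ontology_entities = 80
lexicalized_entities = 20

share = coverage(lexicalized_entities, ontology_entities)
print(f"{share:.0%}")  # 25%
```

In other words, a quarter of the ontology is covered by the lexicalization
set described in the example.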

> In the definition of avgNumOfLexicalizations, the word "define"
> occurs where it should be "defines".

I cannot find this, sorry. But this brings me to another issue. The
formula for avgNumOfLexicalizations could be made clearer as follows:

    avgNumOfLexicalizations = # lexicalizations / # ontology entities in the reference dataset

What do you think? Could you update the formula? That would be great.
Thanks.

> -----
>
> I would postpone example 3 to the end of the section, and I would
> modify it as follows:
> - reuse the same data as in example 2, and make this clear in the
>   introduction to the example
> - then, use the properties lexicalizations, avgNumOfLexicalizations
>   and percentage to "analyze" the scenario depicted in example 2. For
>   instance, it is now possible to tell explicitly that only 25% of the
>   reference ontology has been lexicalized.
>
> We can make the example more interesting by playing with polysemy so
> that the ratios are not "obvious".

Actually, I think that example 3 definitely makes sense here. The ratios
are rather obvious, true, but that is good for a simple and clear example.

> -----
>
> In the definition of LexicalLinkset, the class dataset needs the
> prefix void.
>
> -----

OK, this has been fixed as far as I can see.

> I would propose the following example for lime:ConceptualizationSet
>
> :WnConceptualizationSet a lime:ConceptualizationSet ;
>   lime:conceptualDataset :WnConceptSet ;
>   lime:lexiconDataset :WnLexicon ;
>   lime:lexicalEntries 155287 ;
>   lime:concepts 117659 ;
>   lime:conceptualizations 206941 ;
>   lime:avgPolisemy 1.33
> .
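
The counts in the proposed lime:ConceptualizationSet example can be
sanity-checked with a small Python sketch. It assumes (this is not stated
in the message) that lime:avgPolisemy is the number of conceptualizations
divided by the number of lexical entries; the helper name
`average_per_item` is hypothetical. The avgNumOfLexicalizations formula
suggested above has the same shape, with lexicalizations as the numerator
and ontology entities as the denominator.

```python
def average_per_item(total_links: int, item_count: int) -> float:
    """Return total_links / item_count, rejecting a non-positive denominator."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    return total_links / item_count


# Counts copied from the quoted Turtle example.
lexical_entries = 155287
conceptualizations = 206941

# Assumption: average polysemy = conceptualizations per lexical entry.
avg_polysemy = average_per_item(conceptualizations, lexical_entries)
print(round(avg_polysemy, 2))  # 1.33
```

Under that assumption the quoted value of 1.33 is consistent with the
other two counts.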
>
> For the statistics, I referred to this page:
> https://wordnet.princeton.edu/wordnet/man/wnstats.7WN.html
>
> We should discuss whether and how:
>
> * to represent monosemous words
> * to break down the statistics with respect to different part of
>   speech tags
>
> Regards
>
> Manuel
>
>
> 2015-07-07 15:02 GMT+02:00 Philipp Cimiano
> <cimiano@cit-ec.uni-bielefeld.de>:
>
> > Dear all,
> >
> > I went through the lime module today, streamlining the
> > definitions etc. to make them more conformant to the rest of the
> > modules. I also updated the ontology. I will go through all
> > sections asking for comments on Friday.
> >
> > Please send me any comments you deem important by Friday.
> >
> > I still need to work through the examples both in the wiki and the
> > git repo. It seems to me that we need a few additional examples in
> > this section.
> >
> > Kind regards,
> >
> > Philipp.
>
> --
> Manuel Fiorelli

--
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
Received on Wednesday, 15 July 2015 06:07:30 UTC