Re: lime module from Philipp Cimiano on 2015-07-15 (public-ontolex@w3.org from July 2015)

From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Date: Wed, 15 Jul 2015 08:09:31 +0200
To: Manuel Fiorelli <manuel.fiorelli@gmail.com>
CC: "public-ontolex@w3.org" <public-ontolex@w3.org>
Message-ID: <55A5F91B.6060305@cit-ec.uni-bielefeld.de>

Hi Manuel,

  let me comment on this email for the sake of completeness...

On 10.07.15 at 15:27, Manuel Fiorelli wrote:
> Dear Philipp,
>
> here are additional comments on the Lime module.
>
> *Section "lexicon metadata"*
>
> Just before the definition box of /linguistic model/:
>
> "We may also specify the linguistic (annotation model) used in a 
> lexicon with the linguistic model property"

This has been fixed, as far as I can see.
>
> I think that the word "model" should go outside the parenthesis. 
> Additionally, I would make it clearer that we are talking about things 
> such as part of speech, number, gender, and so on, maybe also by 
> pointing to the section of the specification where we wrote explicitly 
> that.
>
> *Section "Lexicalization Set"*
>
> "In RDF, a lexicalization is expressed via the property rdfs:label."

> It should be "In RDFS" (note I added an S).

has been fixed, thanks
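
For the record, a minimal sketch of what that sentence refers to. The resources live in a made-up example.org namespace and are not taken from the spec:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/ontology#> .

# Plain RDFS lexicalization: language-tagged labels attached
# directly to the ontology entity (ex:Cat is hypothetical)
ex:Cat rdfs:label "cat"@en , "gatto"@it .
```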
>
> *Section "Partitions"*
>
> "many cases, we want to provide descriptive metadata about a subset of 
> a lexicallization"
>
> it should be "of a lexicalization set"
>
Fixed, I also changed the domain and range of partition to 
"Lexicalization Set".
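
As a rough sketch of what this means in practice, a partition is then itself a lexicalization set hanging off the full one. The lime namespace IRI, the resource names and the use of lime:resourceType to restrict the subset are assumptions made for illustration only:

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <http://example.org/> .

# The full lexicalization set points to one of its partitions
:lexicalizationSet a lime:LexicalizationSet ;
  lime:partition :classPartition .

# The partition is again a lexicalization set (the range of lime:partition),
# here hypothetically restricted to the OWL classes of the ontology
:classPartition a lime:LexicalizationSet ;
  lime:resourceType owl:Class .
```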

> *Section "Publication Strategies"*
>
>
> "For example, this allows lexicalizing lexical concepts from an 
> existing wordnet in a different natural language than the one for 
> which the resource was initially conceived"
>
> I am unsure that it is appropriate to use the word "lexicalizing" in 
> association with lexical concepts, because we insisted that the nature 
> of a "conceptualization set" is different from that of a 
> "lexicalization set"
>
> Best regards
>
> Manuel
>
> 2015-07-07 15:55 GMT+02:00 Manuel Fiorelli <manuel.fiorelli@gmail.com>:
>
>     Dear Philipp, All
>
>     here are my preliminary comments. Most of them are minor typos,
>     while others may seed further discussion.
>
>     -----
>
>     In the introduction to example 1, the spec says:
>
>     "As an example we may describe a simple lexicon using this
>     property as well as properties from Dublin Core and VoID: "
>
>     The example then also contains the actual lexical entries that
>     constitute the lexicon. This helps to make the example
>     self-explanatory. However, we should make
>     clear that in general the metadata only deals with the description
>     of the lexicon as a whole, while the representation of its actual
>     content is in the scope of other modules. This is particularly
>     relevant to "lexicon catalogs", which may only be interested in
>     indexing lexicons without the need to also host the actual content.
>
>     -----
>
>     In the definition of LexicalizationSet, the classes Lexicon and
>     Dataset need, respectively, the prefix ontolex and void.
>
>     -----
>
>     I am not sure about this statement:
>
>     "The lexicalization set object should be unique for a given
>     lexicon-ontology pair"
>
>     Indeed, the statement above implies that there cannot be two
>     different lexicalization sets for FOAF using the WordNet RDF
>     lexicon. I think that this conclusion is false, so the previous
>     statement should be retracted.
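>
>     A minimal sketch of that situation, with hypothetical resource
>     names and assuming the lime namespace IRI shown below together
>     with the properties lime:referenceDataset and lime:lexiconDataset:
>
>     ```turtle
>     @prefix lime: <http://www.w3.org/ns/lemon/lime#> .
>     @prefix :     <http://example.org/> .
>
>     # Two independently published lexicalization sets that share
>     # the same ontology (FOAF) and the same lexicon (WordNet RDF)
>     :foafWnSetA a lime:LexicalizationSet ;
>       lime:referenceDataset <http://xmlns.com/foaf/0.1/> ;
>       lime:lexiconDataset :wordnetRdfLexicon .
>
>     :foafWnSetB a lime:LexicalizationSet ;
>       lime:referenceDataset <http://xmlns.com/foaf/0.1/> ;
>       lime:lexiconDataset :wordnetRdfLexicon .
>     ```
>
>     Nothing in such data is inconsistent, which is why a uniqueness
>     requirement per lexicon-ontology pair looks too strong.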
>
>     -----
>
>     In the definition of lexicalizationModel, the disjunction is
>     spelled OR, whereas in other cases it is spelled in lowercase.
>
>     -----
>
>     The definition of lime:references does not mention the fact that
>     in a lexical linkset an ontology reference can be associated with
>     a lexical concept.
>
>     -----
>
>     Concerning Example2:
>     - we should add the language "ja" to the lexicalizationSet resource
>     - we may say that the ontology is an instance of voaf:Vocabulary,
>     which is a subclass of void:Dataset to represent vocabularies
>     (both RDFS Schemas and OWL Ontologies)
>     - I would extend the introduction to the example. This is my attempt:
>
>     <cite>
>     In the following example, we describe a lexicalization set
>     expressing how elements of an ontology can be verbalized in
>     Japanese by means of entries from a supplied lexicon. The metadata
>     clearly tells which ontology and lexicon are involved in the
>     lexicalization set, as well as the relevant natural language. The
>     knowledge of these facts about the lexicalization set allows us to
>     assess the usefulness of a lexicalization set for a given task as
>     well as to discover relevant lexicalization sets, when we are
>     constrained by the choice of an ontology, lexicon or natural
>     language.
>
>     We model the ontology as an instance of the class voaf:Vocabulary,
>     which is a kind of void:Dataset representing vocabularies (both RDFS
>     Schemas and OWL Ontologies). We benefit from the more specific
>     distinctions made by VOAF, by breaking down the total number of
>     entities in the ontology (held by the property void:entities) into
>     separate counts for the classes and properties (held by
>     voaf:classNumber and voaf:propertyNumber, respectively).
>
>     Similarly, we use terms from the Lime vocabulary to represent
>     statistics about the linguistic content of the lexicon and the
>     lexicalization set. Overall, the ontology defines 80 entities and
>     the lexicon 100 lexical entries; however, only 20 entities from
>     the target ontology have been associated with a total of 50
>     lexical entries.
>     </cite>
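>
>     A sketch of metadata matching this introduction. Resource names
>     are hypothetical, the lime namespace IRI is an assumption, and
>     only the figures stated in the text above are used:
>
>     ```turtle
>     @prefix lime: <http://www.w3.org/ns/lemon/lime#> .
>     @prefix void: <http://rdfs.org/ns/void#> .
>     @prefix voaf: <http://purl.org/vocommons/voaf#> .
>     @prefix :     <http://example.org/> .
>
>     # Reference ontology: 80 entities in total. voaf:classNumber and
>     # voaf:propertyNumber would break this total down, but the text
>     # above gives no split, so they are left out here.
>     :ontology a voaf:Vocabulary ;
>       void:entities 80 .
>
>     # Lexicon with 100 lexical entries (typed only as void:Dataset here)
>     :lexicon a void:Dataset ;
>       lime:lexicalEntries 100 .
>
>     # Japanese lexicalization set: 20 entities of the ontology are
>     # covered, using 50 lexical entries from the lexicon
>     :lexicalizationSet a lime:LexicalizationSet ;
>       lime:language "ja" ;
>       lime:referenceDataset :ontology ;
>       lime:lexiconDataset :lexicon ;
>       lime:references 20 ;
>       lime:lexicalEntries 50 .
>     ```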
>
>     -----
>
>     In the definition of avgNumOfLexicalizations, the word "define"
>     occurs where it should be "defines".
>
>     -----
>
>     I would move example 3 to the end of the section, and I would
>     modify it as follows:
>     - reuse the same data as in example 2, and make this clear in the
>     introduction to the example
>     - then, use the properties lexicalizations,
>     avgNumOfLexicalizations and percentage to "analyze" the scenario
>     depicted in example 2. For instance, it is now possible to tell
>     explicitly that only 25% of the reference ontology has been
>     lexicalized.
>
>     We can make the example more interesting by playing with polysemy
>     so that the ratios are not "obvious".
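>
>     With the figures of example 2 (20 of the 80 entities lexicalized),
>     the reworked example could start roughly as follows. The resource
>     name and the lime namespace IRI are assumptions. The properties
>     lime:lexicalizations and lime:avgNumOfLexicalizations are left out
>     because the number of lexicalizations is not stated anywhere above:
>
>     ```turtle
>     @prefix lime: <http://www.w3.org/ns/lemon/lime#> .
>     @prefix :     <http://example.org/> .
>
>     :lexicalizationSet a lime:LexicalizationSet ;
>       lime:references 20 ;
>       # 20 lexicalized entities / 80 entities in the ontology = 0.25
>       lime:percentage 0.25 .
>     ```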
>
>     -----
>
>     In the definition of LexicalLinkset, the class dataset needs the
>     prefix void.
>
>     -----
>
>     I would propose the following example for lime:ConceptualizationSet
>
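>     # Namespace IRIs are assumed; the draft snippet declared none
>     @prefix lime: <http://www.w3.org/ns/lemon/lime#> .
>     @prefix :     <http://example.org/wordnet#> .
>
>     :WnConceptualizationSet a lime:ConceptualizationSet ;
>       lime:conceptualDataset :WnConceptSet ;
>       lime:lexiconDataset :WnLexicon ;
>       lime:lexicalEntries 155287 ;
>       lime:concepts 117659 ;
>       lime:conceptualizations 206941 ;
>       # 206941 conceptualizations / 155287 lexical entries, rounded
>       lime:avgPolisemy 1.33
>       .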
>
>     For the statistics, I referred to this page:
>     https://wordnet.princeton.edu/wordnet/man/wnstats.7WN.html
>
>     We should discuss whether and how:
>
>       * to represent monosemous words
>       * to break down the statistics with respect to different
>         part-of-speech tags
>
>     Regards
>
>     Manuel
>
>
>     2015-07-07 15:02 GMT+02:00 Philipp Cimiano
>     <cimiano@cit-ec.uni-bielefeld.de>:
>
>         Dear all,
>
>          I went through the lime module today, streamlining the
>         definitions etc. to make them more conformant to the rest of
>         the modules. I also updated the ontology. I will go through
>         all sections asking for comments on Friday.
>
>         Please send me any comments you deem important by Friday.
>
>         I still need to work through the examples both in the wiki and
>         the git repo. It seems to me that we need a few additional
>         examples in this section.
>
>         Kind regards,
>
>         Philipp.
>
>         -- 
>         --
>         Prof. Dr. Philipp Cimiano
>         AG Semantic Computing
>         Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>         Universität Bielefeld
>
>         Tel: +49 521 106 12249
>         Fax: +49 521 106 6560
>         Mail: cimiano@cit-ec.uni-bielefeld.de
>
>         Office CITEC-2.307
>         Universitätsstr. 21-25
>         33615 Bielefeld, NRW
>         Germany
>
>     -- 
>     Manuel Fiorelli

-- 
--
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
Received on Wednesday, 15 July 2015 06:10:09 UTC