- From: Manuel Fiorelli <manuel.fiorelli@gmail.com>
- Date: Fri, 24 Jul 2015 17:24:07 +0200
- To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
- Cc: public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAGDmdGjOURdnDfn4qXVd6xiPW5mhKxw4-aROOhtH9nQ=CML6dQ@mail.gmail.com>
Hi John, all,

thank you for your valuable review of the specification.

Let me start by addressing some of the "more controversial" points you have raised.

2015-07-24 13:37 GMT+02:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:

> 1. We do not given the abbreviation of "lexicon model for ontologies" as
> "lemon" although the term lemon is used at several points in the document.
> Do we agree that the model is called "lexicon model for ontologies" and
> abbreviated as "OntoLex-Lemon"?

I like the name *Lemon*, so I am inclined to agree with this name. I am not sure whether the hyphen is required, though.

> 4. Lime defines a number of properties that are of the form "the number of
> links from X to Y divided by the total number of X" for example
> lime:avgNumOfLexicalizations is "the number of links from references to
> lexical entries divided by the total number of references". This can be put
> into a table as follows:
>
> X/Y         References   Entries                   Concepts
> References  -            avgNumOfLexicalizations   avgNumOfLinks
> Entries     percentage   -                         avgAmbiguity
> Concepts    ?            avgSynonymy               -
>
> The table reveals a few inconsistencies in that we have a missing property
> and the percentage property should perhaps be named something like
> avgPolysemy

The various statistics have been defined by considering two sets *A* and *B* and a set *Pairs* of pairs (a,b) ∈ A × B.

We have various integer counts for:

- the total number of pairs = |Pairs|
- the a's that occur in at least one pair = |{a ∈ A | ∃ b ∈ B . (a,b) ∈ Pairs}|
- the b's that occur in at least one pair = |{b ∈ B | ∃ a ∈ A . (a,b) ∈ Pairs}|

(I used the symbol "|" to express the cardinality of each set, while in the spec we sometimes use the symbol "#".)

For the "ratios", we have chosen a "preferential direction" (maybe this is not the right expression, and the directions might even be expressed the other way round), say from A to B:

- from the Ontology to the Lexicon in the case of a LexicalizationSet
- from the Ontology to the ConceptSet in the case of a LexicalLinkset
- from the Lexicon to the ConceptSet in the case of a ConceptualizationSet

Given these viewpoints, we defined the following ratios:

- percentage = ratio of elements in A that participate in at least one pair (in other words, that have been associated with at least one b in B)
- avgNumOfXXX = average number of b in B associated with each a in A (XXX is Lexicalizations, Links, Conceptualizations)

For the ConceptualizationSet we followed a slightly different approach:

- we dropped percentage;
- we renamed avgNumOfConceptualizations to avgAmbiguity;
- we added avgSynonymy, which plays the role of avgNumOfXXX if we assume the opposite point of view (i.e. counting how many lexical entries are associated with each lexical concept).

Answering your questions:

- percentage is not the same as avgPolysemy, avgAmbiguity or avgSynonymy;
- except for the ConceptualizationSet, we still lack the ratios in the direction opposite to the one we assumed;
- in fact, we could also consider adding a property analogous to percentage that gives the ratio of participating elements in B.

The problem with introducing avgNumOfXXX in the opposite direction is that the current properties avgNumOfLexicalizations and avgNumOfLinks are in fact ambiguous: their interpretation has been fixed arbitrarily by putting the ontology entities in the denominator.
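As an illustration only, the counts and ratios described above can be sketched in a few lines of Python. This is not part of the specification: the function name pair_statistics, the dictionary keys and the example identifiers are hypothetical, and the sketch assumes that A and B are non-empty and that every pair is drawn from A × B.

```python
def pair_statistics(a_set, b_set, pairs):
    """Counts and ratios for a set of pairs (a, b) drawn from a_set x b_set.

    Assumes a_set and b_set are non-empty (otherwise the ratios are undefined).
    """
    pairs = set(pairs)                    # duplicate pairs count once
    linked_a = {a for a, _ in pairs}      # a's occurring in at least one pair
    linked_b = {b for _, b in pairs}      # b's occurring in at least one pair
    return {
        "pairs": len(pairs),
        "linked_a": len(linked_a),
        "linked_b": len(linked_b),
        # ratio of elements of A that take part in at least one pair
        "percentage": len(linked_a) / len(a_set),
        # average number of b's per a (A in the denominator)
        "avg_a_to_b": len(pairs) / len(a_set),
        # opposite direction: average number of a's per b (B in the denominator)
        "avg_b_to_a": len(pairs) / len(b_set),
    }


# Hypothetical LexicalizationSet: A = ontology entities, B = lexical entries.
ontology = {"ont:Cat", "ont:Dog", "ont:Fish", "ont:Bird"}
lexicon = {"lex:cat", "lex:dog", "lex:hound"}
lexicalizations = [
    ("ont:Cat", "lex:cat"),
    ("ont:Dog", "lex:dog"),
    ("ont:Dog", "lex:hound"),
]

stats = pair_statistics(ontology, lexicon, lexicalizations)
print(stats)
```

With the ontology entities as A, avg_a_to_b corresponds to the reading currently fixed for avgNumOfLexicalizations, while avg_b_to_a is the opposite-direction ratio that has no property of its own. Both share the same numerator, |Pairs|, and differ only in the denominator.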
Therefore, I suspect that introducing the missing properties would force us to rename the existing ones: it is no coincidence, I think, that in the end for the conceptualization set we decided to use avgPolysemy and avgSynonymy, dropping avgNumOfConceptualizations altogether.

I really like avgPolysemy and avgSynonymy, which could be applied to LexicalizationSets as well, but I think they cannot be applied to a LexicalLinkset (or at least, their interpretation would not be immediately clear, because we are relating two "semantic resources").

In general, I remember that we agreed not to use the term "*polysemy*" because it has a precise meaning in linguistics, and we don't want to deal with the polysemy/homonymy issue at this level.

Now let me address some of the "not-so-important points":

> 23. Some examples use "dbonto" and some "dbpedia"... inconsistent. (JPM)

In DBpedia there are (at least) two namespaces that should be associated with two distinct prefixes:

http://dbpedia.org/resource/ --> e.g. http://dbpedia.org/resource/Rome
http://dbpedia.org/ontology/ --> e.g. http://dbpedia.org/ontology/birthPlace

DBpedia uses the prefixes *dbpedia* and *dbpedia-owl*, respectively (though I don't completely like the latter).

I have not verified whether the specification uses resources from both namespaces (in which case two prefixes are necessary) or only from one namespace (in which case a single prefix should be used).

--
Manuel Fiorelli
Received on Friday, 24 July 2015 15:24:36 UTC