Re: A final set of issues with the specification from John McCrae on 2015-08-26 (public-ontolex@w3.org from August 2015)

From: John McCrae <john@mccr.ae>
Date: Wed, 26 Aug 2015 16:51:11 +0100
To: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <CAC5njqpc23zrJfEndgyNbz+PV3_zdtPJaAfAdaqx0+4E7HMTQA@mail.gmail.com>
On Wed, Aug 26, 2015 at 2:16 PM, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de> wrote:

> Hi John,
>
>    thanks for the summary of open issues. I comment on them...
>
> On 24.07.15 at 13:37, John P. McCrae wrote:
>
> Hi all,
>
> I made a thorough read-through of the specification and have some
> comments. There are five points that may be controversial and another
> *few* that should not be.
>
> *Important points*
>
> 1. We do not give the abbreviation of "lexicon model for ontologies" as
> "lemon" although the term lemon is used at several points in the document.
> Do we agree that the model is called "lexicon model for ontologies" and
> abbreviated as "OntoLex-Lemon"?
>
> Indeed, I propose we use the acronym lemon in the document, but in the
> introduction we should have the long name. I have fixed this already.
>
>
> 2. ontolex/example12 is very difficult to understand now that we have
> named this property "context" and not "usage". The idea that "riviere" can
> be extended with a usage note "A riviere is a river that flows into the
> sea" makes sense but it is not clear why the usage note is called a
> "context"... we need to either clearly justify this or rename the property
> to "usage". I prefer the latter option. (see also point 28)
>
>
> True, I propose to move this example down where we discuss the usage
> property.
>
There is no "usage" property; we renamed it to "context".

>
>
> 3. The vartrans:category "property indicates the specific type of a
> relation", but we already have a property to do this, namely rdf:type! It is
> not clear to me from the text why we need to redefine this property (i.e.,
> either we need to better justify this or drop this property).
>
> No clear opinion about this yet.
>
>
> 4. Lime defines a number of properties that are of the form "the number of
> links from X to Y divided by the total number of X" for example
> lime:avgNumOfLexicalizations is "the number of links from references to
> lexical entries divided by the total number of references". This can be put
> into a table as follows:
>
> X/Y        | References | Entries                 | Concepts
> -----------+------------+-------------------------+--------------
> References | -          | avgNumOfLexicalizations | avgNumOfLinks
> Entries    | percentage | -                       | avgAmbiguity
> Concepts   | ?          | avgSynonymy             | -
>
>
> The table reveals a few inconsistencies, in that we have a missing property
> and the percentage property should perhaps be named something like
> avgPolysemy.
>
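For what it's worth, every cell in that table is the same computation with different X and Y. A minimal Python sketch, assuming the links are available as plain (source, target) pairs; the function name and the toy data are made up for illustration and are not part of lime:

```python
def links_per_source(links, sources):
    """Count distinct (source, target) links and divide by the number of sources."""
    if not sources:
        raise ValueError("need at least one source element")
    return len(set(links)) / len(sources)


# Hypothetical toy lexicon: (lexical entry, reference) pairs.
entry_to_reference = [
    ("troll", "ex:TrollMyth"),
    ("troll", "ex:TrollInternet"),
    ("pharmacist", "ex:Pharmacist"),
    ("apothecary", "ex:Pharmacist"),
]
# "river" is an entry that has no reference yet.
entries = {"troll", "pharmacist", "apothecary", "river"}
references = {ref for _, ref in entry_to_reference}

# X = references, Y = entries (the avgNumOfLexicalizations cell).
reference_to_entry = [(ref, entry) for entry, ref in entry_to_reference]
per_reference = links_per_source(reference_to_entry, references)

# X = entries, Y = references (the transposed cell).
per_entry = links_per_source(entry_to_reference, entries)
```

Swapping the pair order gives the transposed cell, which is why a missing or oddly named property stands out once the table is filled in.
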
> 5. As the NIF "community" has not responded to our questions, we are
> forced to drop recommendations of linking using NIF, and instead only
> recommend OpenAnnotation.
>
>
> Not sure yet.
>
>
> *Not-so-important points*
>
> (JPM) means I will try and fix them within the next two weeks
>
> 6. "Document is structured into eight sections" only there are nine (JPM)
>
>
> Yes.
>
>
> 7. The first paragraph of the introduction is very academic, perhaps it
> could be rewritten to be more appealing to a general audience. (JPM)
>
>
> I am not sure about the "academic", but I am ok if you work on it.
>
>
> 8. "sublcass" and a number of other basic spelling errors exist throughout
> the document. We must spell-check the document! (JPM)
>
>
> Yes. I spotted some of those already today while doing a first pass over
> the document.
>
>
> 9. ontolex/example4 uses "/" around the IPA representations of the terms.
> I don't think that this is necessary. We should also explain the language
> tag and reference the IANA subtag catalogue.
>
>
> OK, can you please look into this.
>
>
> 10. There is little consistency about whether we write "lexical entry" or
> "LexicalEntry" or use a fixed-width font. (JPM I prefer the real English
> 'lexical entry')
>
> Yes, we should use lower case here, that is 'lexical entry'.
>
> 11. Similarly we should check that terms like "rdfs:label" are always
> fixed-width (JPM)
>
> ok
>
> 12. "with canonical form the noun" !? (JPM)
>
> fixed
>
> 13. ontolex/example6 seems to duplicate ontolex/example1
>
>
> Not really. Because in example1 we did not have the writtenRep etc., this
> example is incremental. I think it is fine.
>
"Lexical entries are further specialized into words, affixes (e.g., suffix,
prefix, infix or circumfix) and multiword expressions." then
ontolex/example1

"Of course, lexical entries need not to correspond to one word only, they
can correspond to a multi-word term, as the following example for the
lexical entry "intangible assets" shows:" then ontolex/example6

ontolex/example6 seems to repeat the point and it is not clear why it does.
Could you revise the text before ontolex/example6?

>
>
> 14. We need an example showing how we represent abbreviations relative to
> their full forms (JPM)
>
>
> True, can you add one example...
>
>
> 15. In the definition of "other form" we should probably not say
> "non-dictionary" but "non-lemma". (JPM)
>
>
> Yes, agreed.
>
>
> 16. ontolex/example10 is still not good. The "bank" part of the example
> makes no sense as it is two separate entries with separate meanings, but it
> is not well explained why "bank" is two entries. The second part of this
> example uses the word "apothecary", which is a highly unusual word in
> English and I would not (personally) say is truly synonymous with
> "pharmacist". I had suggested using "troll" as the example here, but that
> seems not to have been adopted. Perhaps we also need a separate example
> explaining "bank" here too? (JPM)
>
> I think the example is fine. Why does "bank" make no sense? The example
> gives guidance to people about how to model multiple meanings of a word.
>
We don't explain why "bank" is two lexical entries and "apothecary"/"troll"
is one.

> The case of bank shows the case where there are two different entries for
> the word and both the lexical entries and the meanings are unrelated.
> The case of "apothecary" is the other case in which there is one lexical
> entry with two meanings.
>
> I am fine though if we replace the "apothecary" example by the "troll
> example".
>
> It seems that both meanings are indeed in DBpedia:
>
> http://de.dbpedia.org/page/Troll_(Mythologie)
> https://de.wikipedia.org/wiki/Troll_(Netzkultur)
> Ok then.
>
> 17. ontolex/example12 is listed in the text as synsem/example12! (JPM)
>
>
> ok.
>
>
> 18. Terms like 'Lexicon' and 'Lexical Entry' should not be capitalized
> they are not proper nouns (JPM)
>
>
> Yes.
>
>
> 19. The lexical concept can be better explained as follows: The reference
> in the ontology primarily gives an interpretation of a word in terms of the
> identifiers that would be generated by the semantic parsing of the
> sentence. For example if we were to understand the query "when did John
> Lennon die?" we may understand the word "die" as generating the URI
> dbpedia:deathDate within a SPARQL query. In contrast, many resources will
> also wish to record the intensional meaning of the word within the mental
> lexicon, such as "die" referring to the concept of death. For this reason
> we introduce the class lexical concept, which can be evoked by a lexical
> entry in place of or as well as a denotation in the ontology, e.g.,
>
>    :die a ontolex:Word ;
>      ontolex:denotes dbpedia:deathDate ;
>      ontolex:evokes  wordnet:Dying .
> (JPM)
>
>
> OK, but I would add this in addition to the explanation we have as an
> elaboration. I like the way you have phrased this.
>
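A rough sketch of the distinction in Python, assuming a toy in-memory lexicon: only the denoted URI ends up in the query, while the evoked concept is recorded alongside it. The dictionary layout, the function name and the dbpedia:John_Lennon subject are illustrative assumptions, not part of the model.

```python
# Hypothetical in-memory view of the ":die" entry from the example above.
LEXICON = {
    "die": {"denotes": "dbpedia:deathDate", "evokes": "wordnet:Dying"},
}


def build_query(subject, word, lexicon=LEXICON):
    """Build a SPARQL query string using the reference the word denotes."""
    predicate = lexicon[word]["denotes"]
    return f"SELECT ?value WHERE {{ {subject} {predicate} ?value }}"


# "When did John Lennon die?" with the subject already resolved (assumed).
query = build_query("dbpedia:John_Lennon", "die")

# The lexical concept is available separately and plays no part in the query.
concept = LEXICON["die"]["evokes"]
```
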
>
> 20. Capitalization in definition of OntoMap is wrong. (JPM)
>
>
> Why is it wrong?
>
>
> 21. I don't like the paragraph 'An OntoMap resembles the
> SynSemCorrespondence...' as:
>   - The OntoMap does not really resemble SynSemCorrespondence
>   - I don't think we should compare to a closed standard like LMF that is
>     unfamiliar to most of our audience
>   - Talking about semantic arguments will only create more confusion
>
>
> Well, this is a major issue that I will bring up soon. I indeed see the
> OntoMap as the ontolex counterpart to the SynSemCorrespondence. In fact, I
> will argue not to regard OntoMap as a subclass of Lexical Sense. But let us
> not open this box today... ;-)
>
I thought (actually hoped) this was closed too.

>
>
> 22. All "dbpedia:" URIs should be fixed width (JPM)
>
>
> This point is not clear to me, sorry.
>
Anything starting with "dbpedia:" should be in fixed width.

>
>
> 23. Some examples use "dbonto" and some "dbpedia"... inconsistent. (JPM)
>
>
> Well, there are different namespaces in DBpedia as well. Should we be more
> consistent than DBpedia? We could try to stick to the ontology namespace
> however...
>
We should be consistent: dbpedia: is sometimes short for
http://dbpedia.org/ontology/ and sometimes for http://dbpedia.org/resource/
and sometimes should be short for http://dbpedia.org/property/ but isn't.
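
As an illustration only, one way to keep this straight is to give each namespace its own prefix and refuse anything else. The three namespace URIs are the ones above; the prefix names dbonto, dbres and dbprop and the helper are assumptions for this sketch.

```python
# One prefix per namespace; the prefix names are hypothetical.
PREFIXES = {
    "dbonto": "http://dbpedia.org/ontology/",
    "dbres": "http://dbpedia.org/resource/",
    "dbprop": "http://dbpedia.org/property/",
}


def expand(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full URI, rejecting unknown prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


death_date = expand("dbonto:deathDate")
lennon = expand("dbres:John_Lennon")
```

With a table like this an ambiguous "dbpedia:" prefix fails loudly instead of silently pointing at the wrong namespace.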

>
>
> 24. "The verb (to) launch" needs quotation marks (JPM)
>
>
> OK
>
>
> 25. "Complex ontology mappings / submappings" talks about semantic
> arguments but this is confusing
>
>
> Not sure why this is confusing. I still see the subject and object
> position of a triple as arguments of the triple. Maybe the term "semantic"
> is confusing here?
>
I want to remove any discussion of "semantic arguments" from this spec,
these will be introduced in a future module (per the most recent
agreement).

>
>
> 26. Indentation of synsem/example8 needs to be fixed (JPM)
>
>
> OK
>
>
> 27. "If element x decides if x"... this is not a maths paper, use English.
> (JPM)
>
> This comes from me. I thought this made it clear that with isA we refer to
> the lambda-abstracted variable of a lambda expression or to the argument of
> a function that characterizes the set. I find this quite clear and think
> that it is understandable as such. But we can add an English sentence that
> clarifies this a bit.
>
>
> 28. condition is defined as a subproperty of usage (JPM, see point 2)
>
> 29. "not found in many other languages" => "not found in some other
> languages and more importantly in some ontologies" (JPM)
>
>
> ok
>
>
> 30. I am not sure from a linguistic point of view that it is correct to
> say that "otitis" is composed of the affix "itis" in decomp/example3. In
> particular there is no Spanish word "ot" and "-itis" is a Greek inflection
> not a true suffix. An easier example would be with a normal prefix such as
> "un-", "re-" or "dis-"...
>
> Well, it is. It is clearer if we use the term "apendicitis", in which
> "itis" again means inflammation and "apendic" stands for appendix. Is that
> better?
>
"Appendic" is still not a word... we could choose an example which is
clearer, e.g., "un-".

>
>
>
> 31. It appears that order information has been added to decomp/example6.
> This is not necessary if we know the order of the words from the main
> entry, and this representation actually saves a triple (ergo IMHO is
> superior!)
>
>     :AfricanSwineFever a ontolex:MultiwordExpression ;
>       rdf:_1 African_node ;
>       rdf:_2 Swine_node ;
>       rdf:_3 Fever_node .
>
>
> It does not hurt to add this information. Because the order is only
> implicit in the lexical entry. One would need to tokenize the lexical entry
> to get the order... Saving triples is not always good if one loses
> information that needs to be recovered...
>
Either way you have to recover some information. If you keep the example as
is then the tokenization of the lexical entry needs to be recovered, if you
switch to my model the parse order needs to be recovered, but tokenization
is more useful and efficient to represent.
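
To illustrate what representing the tokenization amounts to, here is a small Python sketch that derives the rdf:_n triples from an already tokenized entry. The whitespace tokenizer and the helper name are assumptions; the node names follow the snippet quoted above.

```python
def component_triples(entry, written_rep):
    """Return (subject, predicate, object) triples giving token order via rdf:_n."""
    tokens = written_rep.split()  # naive whitespace tokenization (assumption)
    return [
        (entry, f"rdf:_{position}", f"{token.capitalize()}_node")
        for position, token in enumerate(tokens, start=1)
    ]


triples = component_triples(":AfricanSwineFever", "African swine fever")
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```
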

>
>
> 32. "adjective -> adverb variation" not sure what "minus greater than"
> means here. (JPM change to arrow)
>
> 33. "Translation" section lists the "following ways [of representing
> translation] of increasing ontological strength"... but they are clearly
> not increasing! I am not really sure what ontological strength means.
>
>
> This comes from me. I will revise it.
>
>
> 34. The diagram for lime metadata needs to be updated. (JPM)
>
> 35. lime/example2 "jnp" => "jpn" (JPM)
>
> 36. I have a comment on "Verb form mood" that appears never to have been
> answered. I assume that there are no objections to my merge. (JPM)
>
> Regards,
> John
>
>
> --
> --
> Prof. Dr. Philipp Cimiano
> AG Semantic Computing
> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
> Universität Bielefeld
>
> Tel: +49 521 106 12249
> Fax: +49 521 106 6560
> Mail: cimiano@cit-ec.uni-bielefeld.de
>
> Office CITEC-2.307
> Universitätsstr. 21-25
> 33615 Bielefeld, NRW
> Germany
>
>
Received on Thursday, 27 August 2015 07:39:51 UTC