Re: A final set of issues with the specification from John McCrae on 2015-09-04 (public-ontolex@w3.org from September 2015)

From: John McCrae <john@mccr.ae>
Date: Fri, 4 Sep 2015 10:28:33 +0100
To: Elena Montiel Ponsoda <elemontiel@gmail.com>
Cc: public-ontolex <public-ontolex@w3.org>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Message-ID: <CAC5njqpGTob1pOJ9QMxMhfSBLmiX7mWQvhOFVVu85L=G8euzeQ@mail.gmail.com>
On Wed, Sep 2, 2015 at 2:45 PM, Elena Montiel Ponsoda <elemontiel@gmail.com>
wrote:

> Dear all,
>
> Hope you all had a great holiday time.
> Thanks for this summary. Between lines, some comments by Lupe and myself.
>
> Best,
> Elena
>
> On 26/08/2015 at 17:51, John McCrae wrote:
>
>
>
> On Wed, Aug 26, 2015 at 2:16 PM, Philipp Cimiano
> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>
>> Hi John,
>>
>>    thanks for the summary of open issues. I comment on them....
>>
>> On 24.07.15 at 13:37, John P. McCrae wrote:
>>
>> Hi all,
>>
>> I made a thorough read-through of the specification and have some
>> comments. There are five points that may be controversial and another
>> *few* that should not be.
>>
>> *Important points*
>>
>> 1. We do not give the abbreviation of "lexicon model for ontologies" as
>> "lemon" although the term lemon is used at several points in the document.
>> Do we agree that the model is called "lexicon model for ontologies" and
>> abbreviated as "OntoLex-Lemon"?
>>
>> Indeed, I propose we use the acronym lemon in the document, but in the
>> introduction we should have the long name. I have fixed this already.
>>
>>
>> 2. ontolex/example12 is very difficult to understand now that we have
>> named this property "context" and not "usage". The idea that "riviere" can
>> be extended with a usage note "A riviere is a river that flows into the
>> sea" makes sense but it is not clear why the usage note is called a
>> "context"... we need to either clearly justify this or rename the property
>> to "usage". I prefer the latter option. (see also point 28)
>>
>>
>> True, I propose to move this example down where we discuss the usage
>> property.
>>
> There is no "usage" property, we renamed it to "context".
>
> After having a look at some bibliography on lexicography, we also agree on
> renaming the property "context" to "usage". It is considered general enough
> to refer to several types of "conditions" in which the use of a certain
> term is justified (context, domain, style, register, meaning nuances,
> connotations, etc.)
>
I agree. Philipp & everyone, should we make this change?

>
>
>>
>> 3. The vartrans:category "property indicates the specific type of a
>> relation", we already have a property to do this namely rdf:type! It is
>> not clear to me from the text why we need to redefine this property. (i.e.,
>> either we need to better justify this or drop this property)
>>
>> No clear opinion about this yet.
>>
> The *category* property indicates the specific type of relation by which
> two lexical entries or two lexical senses are related.
> Indeed, the definition may seem a bit general. However, the rdf:type
> property seems to us as "too underspecified" (and, therefore, not worthy of
> being included in the vartrans module...) and maybe not familiar to the
> linguistic community.
> We propose to slightly modify the definition as "The *category* property
> indicates the specific type of *lexico-semantic relation* by which two
> lexical entries or two lexical senses are related"
> And add an explanation in this line: This property is meant to capture
> different lexical and semantic relations of the sort: initialism,
> orthographic variant, dialectal or geographic variant, register variant,
> chronological variant, stylistic variant, dimensional variant, synonymy,
> antonymy, or translation. A set of lexico-semantic relations are available
> in the lexinfo vocabulary.
> (A nice list of these types of variation and translation relations was
> included some time ago at:
> http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Properties-and-Relations-of-Entries
> )
>
> Finally, ObjectProperty: Category, should be in small letters, right?
>
The advantage of rdf:type is that we have normal ontology reasoning. For
example in WordNet we have not just meronyms, but 'part', 'substance' and
'member' meronyms so with rdf:type from the following

:myLSR              rdf:type        wordnet:PartMeronym .
wordnet:PartMeronym rdfs:subClassOf lexinfo:Meronym .
lexinfo:Meronym     rdfs:subClassOf vartrans:SenseRelation .

Then from this we can infer that myLSR is a meronym and a sense relation.
If we introduce a category property then it is very difficult to create a
hierarchy of LSRs, right?
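
As a minimal sketch of that inference (plain Python rather than a real RDFS
reasoner; the function name inferred_types is only an illustration and the
triples are the three from the example above):

```python
# The three triples from the example, as (subject, predicate, object) tuples.
TRIPLES = {
    (":myLSR", "rdf:type", "wordnet:PartMeronym"),
    ("wordnet:PartMeronym", "rdfs:subClassOf", "lexinfo:Meronym"),
    ("lexinfo:Meronym", "rdfs:subClassOf", "vartrans:SenseRelation"),
}


def inferred_types(resource, triples):
    """Return every class of `resource`, following rdfs:subClassOf upwards."""
    # Start from the asserted rdf:type statements.
    types = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    frontier = list(types)
    while frontier:
        cls = frontier.pop()
        # Each superclass of a known class is also a class of the resource.
        for s, p, o in triples:
            if s == cls and p == "rdfs:subClassOf" and o not in types:
                types.add(o)
                frontier.append(o)
    return types


print(sorted(inferred_types(":myLSR", TRIPLES)))
# ['lexinfo:Meronym', 'vartrans:SenseRelation', 'wordnet:PartMeronym']
```

With a category property there is no standard entailment like this to lean
on, so the hierarchy would have to be walked by hand in every application.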


>
>

>
>> 4. Lime defines a number of properties that are of the form "the number
>> of links from X to Y divided by the total number of X" for example
>> lime:avgNumOfLexicalizations is "the number of links from references to
>> lexical entries divided by the total number of references". This can be put
>> into a table as follows:
>>
>> X/Y        | References | Entries                 | Concepts
>> -----------+------------+-------------------------+--------------
>> References | -          | avgNumOfLexicalizations | avgNumOfLinks
>> Entries    | percentage | -                       | avgAmbiguity
>> Concepts   | ?          | avgSynonymy             | -
>>
>>
>> The table reveals a few inconsistencies in that we have a missing
>> property and the percentage property should perhaps be named something like
>> avgPolysemy
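
The pattern in point 4 can be sketched as one function shared by all of these
properties. This is only an illustration: the function name average_links and
the sample references and entries are made up, and nothing here is taken from
the lime module itself.

```python
def average_links(links, sources):
    """Number of links from X to Y divided by the total number of X.

    `links` is an iterable of (x, y) pairs; `sources` lists every X,
    including any X that has no link at all.
    """
    sources = set(sources)
    if not sources:
        raise ValueError("need at least one X")
    return sum(1 for x, _ in links if x in sources) / len(sources)


# Hypothetical data: two ontology references and three links to lexical entries.
references = {"dbo:River", "dbo:Sea"}
reference_to_entry = [
    ("dbo:River", ":river_en"),
    ("dbo:River", ":riviere_fr"),
    ("dbo:Sea", ":sea_en"),
]

# X = references, Y = entries: the avgNumOfLexicalizations cell of the table.
avg_num_of_lexicalizations = average_links(reference_to_entry, references)
print(avg_num_of_lexicalizations)  # 1.5
```

Swapping which collection plays X and which plays Y gives the other cells of
the table.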
>>
>> 5. As the NIF "community" has not responded to our questions, we are
>> forced to drop recommendations of linking using NIF, and instead only
>> recommend OpenAnnotation.
>>
>>
>> Not sure yet.
>>
> We wouldn't be so sure of leaving NIF out. It is quite well-known in the
> community, don't you think so?
>
I have only been told that the modelling I currently have is 'probably
wrong'. If there is no action from this 'community' we cannot include
anything.

>
>
>> *Not-so-important points*
>>
>> (JPM) means I will try and fix them within the next two weeks
>>
>> 6. "Document is structured into eight sections" only there are nine (JPM)
>>
>>
>> Yes.
>>
>>
>> 7. The first paragraph of the introduction is very academic, perhaps it
>> could be rewritten to be more appealing to a general audience. (JPM)
>>
>>
>> I am not sure about the "academic", but I am ok if you work on it.
>>
>>
>> 8. "sublcass" and a number of other basic spelling errors exist
>> throughout the document. We must spell-check the document! (JPM)
>>
>>
>> Yes. I spotted some of those already today while doing a first pass over
>> the document.
>>
>
>> 9. ontolex/example4 uses "/" around the IPA representations of the terms.
>> I don't think that this is necessary. We should also explain the language
>> tag and reference the IANA subtag catalogue.
>>
>>
>> OK, can you please look into this.
>>
>>
>> 10. There is little consistency about whether we write "lexical entry" or
>> "LexicalEntry" or use a fixed-width font. (JPM I prefer the real English
>> 'lexical entry')
>>
>> Yes, we should use lowercase here, that is 'lexical entry'.
>>
>> 11. Similarly we should check that terms like "rdfs:label" are always
>> fixed-width (JPM)
>>
>> ok
>>
>> 12. "with canonical form the noun" !? (JPM)
>>
>> fixed
>>
>> 13. ontolex/example6 seems to duplicate ontolex/example1
>>
>>
>> Not really. Because in example1 we did not have the writtenRep etc. So
>> this example is incremental. I think it is fine.
>>
> "Lexical entries are further specialized into words, affixes (e.g.,
> suffix, prefix, infix or circumfix) and multiword expressions." then
> ontolex/example1
>
> "Of course, lexical entries need not to correspond to one word only, they
> can correspond to a multi-word term, as the following example for the
> lexical entry "intangible assets" shows:" then ontolex/example6
>
> ontolex/example6 seems to repeat the point and it is not clear why it
> does, could you revise the text before ontolex/example6?
>
>>
>>
>> 14. We need an example showing how we represent abbreviations relative to
>> their full forms (JPM)
>>
>>
>> True, can you add one example...
>>
>>
>> 15. In the definition of "other form" we should probably not say
>> "non-dictionary" but "non-lemma". (JPM)
>>
>>
>> Yes, agreed.
>>
> We would rather say "non-lemmatic form".
>
I don't think that is actually an English word... it is not in the OED,
Wiktionary or my spellchecker.

>
>> 16. ontolex/example10 is still not good. The "bank" part of the example
>> makes no sense as it is two separate entries with separate meanings, but it
>> is not well explained why "bank" is two entries. The second part of this
>> example uses the word "apothecary", which is a highly unusual word in
>> English and I would not (personally) say is truly synonymous with
>> "pharmacist". I had suggested using "troll" as the example here, but that
>> seems not to have been adopted. Perhaps we also need a separate example
>> explaining "bank" here too? (JPM)
>>
>> I think the example is fine. Why does "bank" make no sense? The example
>> gives guidance to people about how to model multiple meanings of a word.
>>
> We don't explain why "bank" is two lexical entries and
> "apothecary"/"troll" is one.
>
>> The case of bank shows the case where there are two different entries for
>> the word and both the lexical entries and the meanings are unrelated.
>> The case of "apothecary" is the other case in which there is one lexical
>> entry with two meanings.
>>
>> I am fine though if we replace the "apothecary" example by the "troll
>> example".
>>
>> It seems that both meanings are indeed in DBpedia:
>>
>> http://de.dbpedia.org/page/Troll_(Mythologie)
>> https://de.wikipedia.org/wiki/Troll_(Netzkultur)
>> Ok then.
>>
> We think that it would be clearer if we divide the example into two
> separate examples.
>
Yes I do too

> As for the explanation included below the example, and I quote: "In the
> above example, two lexical entries have been used for *bank*. The reason
> is that in this case both words *bank* are actually not grammatically
> related and thus represent two independent lexical entries with meanings
> that are not related", we are of the opinion that the statement "are
> actually not grammatically related" is unnecessary, since morphologically
> they have the same sequence of letters and are both nouns. Moreover, in a
> dictionary the entry would be the same. So we propose to simply remove "are
> actually not grammatically related and thus".
>
> 17. ontolex/example12 is listed in the text as synsem/example12! (JPM)
>>
>>
>> ok.
>>
>>
>> 18. Terms like 'Lexicon' and 'Lexical Entry' should not be capitalized
>> they are not proper nouns (JPM)
>>
>>
>> Yes.
>>
>>
>> 19. The lexical concept can be better explained as follows: The reference
>> in the ontology primarily gives an interpretation of a word in terms of the
>> identifiers that would be generated by the semantic parsing of the
>> sentence. For example if we were to understand the query "when did John
>> Lennon die?" we may understand the word "die" as generating the URI
>> dbpedia:deathDate within a SPARQL query. In contrast many resources will
>> also wish to record the intentional meaning of the word with the mental
>> lexicon, such as "die" referring to the concept of death, for this reason
>> we introduce the class lexical concept which can be evoked by a lexical
>> entry in place of or as well as a denotation in the ontology, e.g.,
>>
>>    :die a ontolex:Word ;
>>      ontolex:denotes dbpedia:deathDate ;
>>      ontolex:evokes  wordnet:Dying .
>> (JPM)
>>
>>
>> OK, but I would add this in addition to the explanation we have as an
>> elaboration. I like the way you have phrased this.
>>
> We agree with adding this as an explanation, but not modifying the
> definition.
>
I don't think I was proposing to modify the definition.

>
>> 20. Capitalization in definition of OntoMap is wrong. (JPM)
>>
>>
>> Why is it wrong?
>>
>>
>> 21. I don't like the paragraph 'An OntoMap resembles the
>> SynSemCorrespondence...' as
>> The OntoMap does not really resemble synsemcorrespondence
>> I don't think we should compare to a closed standard like LMF that is
>> unfamiliar to most of our audience
>> Talking about semantic arguments will only create more confusion
>>
>>
>> Well, this is a major issue that I will bring up soon. I indeed see the
>> OntoMap as the ontolex counterpart to the SynSemCorrespondence. In fact, I
>> will argue not to regard OntoMap as a subclass of Lexical Sense. But let us
>> not open this box today... ;-)
>>
> I thought (actually hoped) this was closed too.
>
>>
>>
>> 22. All "dbpedia:" URIs should be fixed width (JPM)
>>
>>
>> This point is not clear to me, sorry.
>>
> Anything starting "dbpedia:" should be in fixed width
>
>>
>>
>> 23. Some examples use "dbonto" and some "dbpedia"... inconsistent. (JPM)
>>
>>
>> Well, there are different namespaces in DBpedia as well. Should we be
>> more consistent than DBpedia? We could try to stick to the ontology
>> namespace however...
>>
> We should be consistent, dbpedia: sometimes is short for
> http://dbpedia.org/ontology/ and sometimes for
> http://dbpedia.org/resource/ and sometimes should be short for
> http://dbpedia.org/property/ but isn't
>
>>
>>
>> 24. "The verb (to) launch" needs quotation marks (JPM)
>>
>>
>> OK
>>
>>
>> 25. "Complex ontology mappings / submappings" talks about semantic
>> arguments but this is confusing
>>
>>
>> Not sure why this is confusing. I still see the subject and object
>> position of a triple as arguments of the triple. Maybe the term "semantic"
>> is confusing here?
>>
> I want to remove any discussion of "semantic arguments" from this spec,
> these will be introduced in a future module (per the most recent
> agreement).
>
>>
>>
>> 26. Indentation of synsem/example8 needs to be fixed (JPM)
>>
>>
>> OK
>>
>>
>> 27. "If element x decides if x"... this is not a maths paper, use
>> English. (JPM)
>>
>> This comes from me. I thought this makes it clear that with isA we refer
>> to the lambda-abstracted variable of a lambda expression or to the argument
>> of a function that characterizes the set. I find this quite clear and think
>> that it is understandable as such. But we can add an English sentence that
>> clarifies this a bit.
>>
>>
>> 28. condition is defined as a subproperty of usage (JPM, see point 2)
>>
>> 29. "not found in many other languages" => "not found in some other
>> languages and more importantly in some ontologies" (JPM)
>>
>>
>> ok
>>
>>
>> 30. I am not sure from a linguistic point of view that it is correct to
>> say that "otitis" is composed of the affix "itis" in decomp/example3. In
>> particular there is no Spanish word "ot" and "-itis" is a Greek inflection
>> not a true suffix. An easier example would be with a normal prefix such as
>> "un-", "re-" or "dis-"...
>>
>> Well, it is. It is clearer if we use the term "apendicitis" In which
>> "itis" again means inflammation. "apendic" stands for appendix. Is that
>> better?
>>
> "Appendic" is still not a word... we could choose an example which is
> clearer, e.g., "un-"...
>
> Appendic is not a word, but appendix + itis, or apéndice + itis (and the e
> is dropped)
> The suffix is added to the root, as in the case of "ot" + "itis"
> otos means "related to the ear" (referido al oído)
>
'otos' are birds (bustards to be precise) according to my Spanish
dictionary...

My point was to say that there are better examples

>
>
>
>>
>>
>> 31. It appears that order information has been added to
>> decomp/example6... this is not necessary if we know that order of the words
>> from the main entry and this representation actually saves a triple (ergo
>> IMHO is superior!)
>>
>>     :AfricanSwineFever a ontolex:MultiwordExpression ;
>>       rdf:_1 :African_node ;
>>       rdf:_2 :Swine_node ;
>>       rdf:_3 :Fever_node .
>>
>>
>> It does not hurt to add this information. Because the order is only
>> implicit in the lexical entry. One would need to tokenize the lexical entry
>> to get the order... Saving triples is not always good if one loses
>> information that needs to be recovered...
>>
> Either way you have to recover some information. If you keep the example
> as is then the tokenization of the lexical entry needs to be recovered, if
> you switch to my model the parse order needs to be recovered, but
> tokenization is more useful and efficient to represent.
>
> Why would it be more useful and efficient? Could you explain this?
>

Efficient can be easily explained:

:AfricanSwineFever a ontolex:MultiwordExpression .

:AfricanSwineFever_root a decomp:Component ;
  decomp:correspondsTo :AfricanSwineFever ;
  decomp:constituent :African_node, :SwineFever_node ;
  rdf:_1 :African_node;
  rdf:_2 :SwineFever_node;
  olia:hasTag penn:NP .

:African_node a decomp:Component ;
  decomp:correspondsTo :African ;
  olia:hasTag penn:JJ .

:SwineFever_node a decomp:Component ;
  decomp:constituent :Swine_node, :Fever_node ;
  rdf:_1 :Swine_node;
  rdf:_2 :Fever_node;
  olia:hasTag penn:NP .

:Swine_node a decomp:Component ;
  decomp:correspondsTo :Swine ;
  olia:hasTag penn:NN .

:Fever_node a decomp:Component ;
  decomp:correspondsTo :Fever ;
  olia:hasTag penn:NN .
Triple count: 23

:AfricanSwineFever a ontolex:MultiwordExpression ;
      rdf:_1 :African_node ;
      rdf:_2 :Swine_node ;
      rdf:_3 :Fever_node .

:AfricanSwineFever_root a decomp:Component ;
  decomp:correspondsTo :AfricanSwineFever ;
  decomp:constituent :African_node, :SwineFever_node ;
  olia:hasTag penn:NP .

:African_node a decomp:Component ;
  decomp:correspondsTo :African ;
  olia:hasTag penn:JJ .

:SwineFever_node a decomp:Component ;
  decomp:constituent :Swine_node, :Fever_node ;
  olia:hasTag penn:NP .

:Swine_node a decomp:Component ;
  decomp:correspondsTo :Swine ;
  olia:hasTag penn:NN .

:Fever_node a decomp:Component ;
  decomp:correspondsTo :Fever ;
  olia:hasTag penn:NN .
Triple count: 22
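
For anyone who wants to check the two counts, here is a rough sketch. It is
not a real Turtle parser: the hypothetical count_triples helper only handles
prefixed names with ';' and ',' abbreviations and a free-standing '.', which
is all the two listings above use. The three leaf nodes are identical in both
models, so they are written once.

```python
def _count_statement(tokens):
    """Count the triples in one statement: subject, then ';'-separated groups."""
    count = 0
    group = []
    for tok in tokens[1:] + [";"]:  # drop the subject, close the last group
        if tok == ";":
            if group:
                # One triple per object: group minus predicate minus commas.
                count += len(group) - 1 - group.count(",")
            group = []
        else:
            group.append(tok)
    return count


def count_triples(turtle):
    """Count triples in a restricted Turtle subset (see the note above)."""
    tokens = turtle.replace(";", " ; ").replace(",", " , ").split()
    total, statement = 0, []
    for tok in tokens:
        if tok == ".":
            total += _count_statement(statement)
            statement = []
        else:
            statement.append(tok)
    return total


SHARED = """
:African_node a decomp:Component ; decomp:correspondsTo :African ; olia:hasTag penn:JJ .
:Swine_node a decomp:Component ; decomp:correspondsTo :Swine ; olia:hasTag penn:NN .
:Fever_node a decomp:Component ; decomp:correspondsTo :Fever ; olia:hasTag penn:NN .
"""

# First listing: order recorded on the component (parse tree) nodes.
ORDER_ON_COMPONENTS = SHARED + """
:AfricanSwineFever a ontolex:MultiwordExpression .
:AfricanSwineFever_root a decomp:Component ;
  decomp:correspondsTo :AfricanSwineFever ;
  decomp:constituent :African_node , :SwineFever_node ;
  rdf:_1 :African_node ; rdf:_2 :SwineFever_node ;
  olia:hasTag penn:NP .
:SwineFever_node a decomp:Component ;
  decomp:constituent :Swine_node , :Fever_node ;
  rdf:_1 :Swine_node ; rdf:_2 :Fever_node ;
  olia:hasTag penn:NP .
"""

# Second listing: token order recorded once, on the lexical entry.
ORDER_ON_ENTRY = SHARED + """
:AfricanSwineFever a ontolex:MultiwordExpression ;
  rdf:_1 :African_node ; rdf:_2 :Swine_node ; rdf:_3 :Fever_node .
:AfricanSwineFever_root a decomp:Component ;
  decomp:correspondsTo :AfricanSwineFever ;
  decomp:constituent :African_node , :SwineFever_node ;
  olia:hasTag penn:NP .
:SwineFever_node a decomp:Component ;
  decomp:constituent :Swine_node , :Fever_node ;
  olia:hasTag penn:NP .
"""

print(count_triples(ORDER_ON_COMPONENTS), count_triples(ORDER_ON_ENTRY))  # 23 22
```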

'Useful' is a bit harder, but in my career I have tokenized lots of things
and parsed nowhere near as many, ergo I think tokenization is more
useful.

>
>
>>
>> 32. "adjective -> adverb variation" not sure what "minus greater than"
>> means here. (JPM change to arrow)
>>
>> 33. "Translation" section lists the "following ways [of representing
>> translation] of increasing ontological strength"... but they are clearly
>> not increasing! I am not really sure what ontological strength means.
>>
>>
>> This comes from me. I will revise it.
>>
>>
>> 34. The diagram for lime metadata needs to be updated. (JPM)
>>
>> 35. lime/example2 "jnp" => "jpn" (JPM)
>>
>> 36. I have a comment on "Verb form mood" that appears to never have been
>> answered. I assume that my merge has no objections. (JPM)
>>
>> Regards,
>> John
>>
>>
> Some more spotted misspellings and stylistic nuances:
>
> *Domain:* LexicalSense
>
> *Range:* rdfs:Ressource
>
> The combined usage of the properties denotes, sense, evokes, concept and
> lexicalized sense is demonstrated in the example below for the case of a
> lexical resource such as WordNet.
>
> OntoLex/Lemon has a much simpler usage, removing many elements that were
> in LMF
>
> The following example gives an example of a sense relation:
>
> Proposal:
>
> The following example illustrates a sense relation:
>
> The following example shows how to model the relation between "Food and
> Agriculture Organisation" and its initialism "FAO" as one example of a
> lexical relation
>
> Proposal:
> The following example shows how *to model the lexical relation between*
> "Food and Agriculture Organisation" and its initialism "FAO"
>
> In the introductory paragraph to Syntactic Frames, we think it should be:
> stand *on *their own, and not *by *their own
> In the definition of Syntactic Frame, the definite article "the" is
> missing in "in terms of the (syntactic) arguments"
>
> A comma is missing in the sentence below "... the preposition in*,* ..."
> The following example shows how to specify that the intransitive verb
> *operate*, subcategorizing a prepositional phrase introduced by the
> preposition *in* can be used to denote the property regionServed
> <http://dbpedia.org/ontology/regionServed> in DBpedia
>
> In the next sentence, examples should be in singular:
> The following example*s* shows how to use the submap
> <http://www.w3.org/community/ontolex/wiki/Final_Model_Specification#Submap>
>  property to indicate that the meaning of the phrase "X launched Y in Z"
> is a composition of the properties dbpedia:product
> <http://dbpedia.org/ontology/product> and dbpedia:productionStartYear
> <http://dbpedia.org/ontology/productionStartYear>, which together express
> the meaning of the syntactic frame
>
> In the following sentence below example 6, quantifier should be quantifying
> "Indicating that an argument is optional means that it does not have to
> be realized syntactically in which case from a semantic point of view the
> corresponding semantic argument is existentially quantifier over."
>
> In the definition of Optional, we would avoid the use of "optional" in the
> explanation, and say instead: The optional property indicates that a
> syntactic argument can be omitted.
> The *optional* property indicates whether a syntactic argument is
> optional, that is, it can be syntactically omitted.
>
> In example 7 (Optional): a slash is missing, see:
>
>  ontolex:reference <http:/ontology.org/giving>;
>
> BTW, is http://ontology.org/giving correct???
>
> In example 9 there is a misspelling in Transportation, see:
>
> :methodOfTransporation a rdf:Property ;
>
> Is this example complete? shouldn't it be pointing to an ontology??
>
> Below example decomp/example 2
> Revise the following sentence (verb is at the end...)
> "It is important to note that the subterm property does not indicate the
> position or even which words a subterm is."
>
>
>
> Finally, we see that sometimes the names of classes or properties have
> hyperlinks, but not always. Which should be the criterion to follow?
> See for example the paragraph below in which regionServed is sometimes
> hyperlinked, others highlighted in bold, or not highlighted at all (
> dbpedia:regionServed).
>
> "The following example shows how to specify that the intransitive verb
> *operate*, subcategorizing a prepositional phrase introduced by the
> preposition *in* can be used to denote the property regionServed
> <http://dbpedia.org/ontology/regionServed> in DBpedia. The entry
> specifies that in a construction such as `X operates in Y', the X refers to
> the subject of the property dbpedia:regionServed, and the Y refers to the
> object of the property *regionServed*. Again, we use the LexInfo
> <http://www.lexinfo.net/> ontology in our example to provide linguistic
> information:"
>
>
>> --
>> --
>> Prof. Dr. Philipp Cimiano
>> AG Semantic Computing
>> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>> Universität Bielefeld
>>
>> Tel: +49 521 106 12249
>> Fax: +49 521 106 6560
>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>
>> Office CITEC-2.307
>> Universitätsstr. 21-25
>> 33615 Bielefeld, NRW
>> Germany
>>
>>
>
>
> --
> Elena Montiel-Ponsoda
> Ontology Engineering Group (OEG)
> Departamento de Inteligencia Artificial
> ETS de Ingenieros Informáticos
> Campus de Montegancedo s/n
> Boadilla del Monte-28660 Madrid, España
> www.oeg-upm.net
> Tel. (+34) 91 336 36 70
> Fax  (+34) 91 352 48 19
>
>
Received on Friday, 4 September 2015 09:30:55 UTC