- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Tue, 8 Sep 2015 08:35:16 +0200
- To: public-ontolex@w3.org
- Message-ID: <55EE81A4.2030309@cit-ec.uni-bielefeld.de>
Dear all,

two things:

1) Elena/Lupe: could you please implement the stylistic changes you proposed directly in the document? That would help a lot, thanks!

2) John: on economy of representation and the tokenization/parsing issue: I finally got your point ;-)

I do not think we should prefer one over the other. Making both tokenization and phrase structure explicit is important, and both have use cases. Further, we cannot really prevent people from using the RDF ordering predicates as they wish anyway.

My proposal would be to leave the level of representation to the needs of the user. I would split the example in two: the first one only indicates tokenization and decomposition; the second shows how to additionally indicate the phrase structure.

It is true that the order of nodes in the phrase structure can always be reconstructed from the order of tokens. However, this requires some computation: for non-terminal nodes, the order has to be recomputed from the order of the terminal nodes. Some people might find that cumbersome and prefer to state the order explicitly. So nothing speaks against adding the order information to non-terminal nodes as well, if people wish to do so. We cannot prevent it anyway.

So I propose to add a further example to show how this can be done... I will try to add something along these lines...

Philipp.

On 04.09.15 at 11:28, John McCrae wrote:
> On Wed, Sep 2, 2015 at 2:45 PM, Elena Montiel Ponsoda <elemontiel@gmail.com> wrote:
>
> Dear all,
>
> Hope you all had a great holiday time.
> Thanks for this summary. Some comments by Lupe and myself are inline below.
>
> Best,
> Elena
>
> On 26/08/2015 at 17:51, John McCrae wrote:
>> On Wed, Aug 26, 2015 at 2:16 PM, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de> wrote:
>>
>> Hi John,
>>
>> thanks for the summary of open issues. I comment on them....
>>
>> On 24.07.15 at 13:37, John P. McCrae wrote:
>>> Hi all,
>>>
>>> I made a thorough read-through of the specification and have some comments. There are five points that may be controversial and another /few/ that should not be.
>>>
>>> *Important points*
>>>
>>> 1. We do not give the abbreviation of "lexicon model for ontologies" as "lemon", although the term lemon is used at several points in the document. Do we agree that the model is called "lexicon model for ontologies" and abbreviated as "OntoLex-Lemon"?
>>
>> Indeed, I propose we use the acronym lemon in the document, but in the introduction we should have the long name. I have fixed this already.
>>
>>> 2. ontolex/example12 is very difficult to understand now that we have named this property "context" and not "usage". The idea that "riviere" can be extended with a usage note "A riviere is a river that flows into the sea" makes sense but it is not clear why the usage note is called a "context"... we need to either clearly justify this or rename the property to "usage". I prefer the latter option. (see also point 28)
>>
>> True, I propose to move this example down where we discuss the usage property.
>>
>> There is no "usage" property, we renamed it to "context".
>
> After having a look at some bibliography on lexicography, we also agree on renaming the property "context" to "usage". It is considered general enough to refer to several types of "conditions" in which the use of a certain term is justified (context, domain, style, register, meaning nuances, connotations, etc.)
>
> I agree. Philipp & everyone, should we make this change?
>
>>> 3. The vartrans:category "property indicates the specific type of a relation", we already have a property to do this, namely rdf:type! It is not clear to me from the text why we need to redefine this property. (i.e., either we need to better justify this or drop this property)
>>
>> No clear opinion about this yet.
>
> The *category* property indicates the specific type of relation by which two lexical entries or two lexical senses are related. Indeed, the definition may seem a bit general. However, the rdf:type property seems to us "too underspecified" (and, therefore, not worthy of being included in the vartrans module...) and maybe not familiar to the linguistic community.
> We propose to slightly modify the definition as
> "The *category* property indicates the specific type of *lexico-semantic relation* by which two lexical entries or two lexical senses are related"
> And add an explanation along these lines: This property is meant to capture different lexical and semantic relations of the sort: initialism, orthographic variant, dialectal or geographic variant, register variant, chronological variant, stylistic variant, dimensional variant, synonymy, antonymy, or translation. A set of lexico-semantic relations is available in the lexinfo vocabulary. (A nice list of these types of variation and translation relations was included some time ago at: http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Properties-and-Relations-of-Entries)
>
> Finally, ObjectProperty: Category, should be in small letters, right?
>
> The advantage of rdf:type is that we have normal ontology reasoning. For example in WordNet we have not just meronyms, but 'part', 'substance' and 'member' meronyms, so with rdf:type, given the following
>
> :myLSR rdf:type wordnet:PartMeronym .
> wordnet:PartMeronym rdfs:subClassOf lexinfo:Meronym .
> lexinfo:Meronym rdfs:subClassOf vartrans:SenseRelation .
>
> we can infer that myLSR is a meronym and a sense relation. If we introduce a category property then it is very difficult to create a hierarchy of LSRs, right?
>
>>> 4. Lime defines a number of properties that are of the form "the number of links from X to Y divided by the total number of X", for example lime:avgNumOfLexicalizations is "the number of links from references to lexical entries divided by the total number of references". This can be put into a table as follows:
>>>
>>> X/Y        | References | Entries                 | Concepts
>>> References | -          | avgNumOfLexicalizations | avgNumOfLinks
>>> Entries    | percentage | -                       | avgAmbiguity
>>> Concepts   | ?          | avgSynonymy             | -
>>>
>>> The table reveals a few inconsistencies in that we have a missing property and the percentage property should perhaps be named something like avgPolysemy
>>>
>>> 5. As the NIF "community" has not responded to our questions, we are forced to drop recommendations of linking using NIF, and instead only recommend OpenAnnotation.
>>
>> Not sure yet.
>
> We wouldn't be so sure of leaving NIF out. It is quite well-known in the community, don't you think so?
>
> I have only been told that the modelling I currently have is 'probably wrong'. If there is no action from this 'community' we cannot include anything.
>
>>> *Not-so-important points*
>>>
>>> (JPM) means I will try and fix them within the next two weeks
>>>
>>> 6. "Document is structured into eight sections" only there are nine (JPM)
>>
>> Yes.
>>
>>> 7. The first paragraph of the introduction is very academic, perhaps it could be rewritten to be more appealing to a general audience. (JPM)
>>
>> I am not sure about the "academic", but I am ok if you work on it.
>>
>>> 8. "sublcass" and a number of other basic spelling errors exist throughout the document.
>>> We must spell-check the document! (JPM)
>>
>> Yes. I spotted some of those already today while doing a first pass over the document.
>>
>>> 9. ontolex/example4 uses "/" around the IPA representations of the terms. I don't think that this is necessary. We should also explain the language tag and reference the IANA subtag catalogue.
>>
>> OK, can you please look into this.
>>
>>> 10. There is little consistency about whether we write "lexical entry" or "LexicalEntry" or use a fixed-width font. (JPM I prefer the real English 'lexical entry')
>>
>> Yes, we should use lower case here, that is 'lexical entry'.
>>
>>> 11. Similarly we should check that terms like "rdfs:label" are always fixed-width (JPM)
>>
>> ok
>>
>>> 12. "with canonical form the noun" !? (JPM)
>>
>> fixed
>>
>>> 13. ontolex/example6 seems to duplicate ontolex/example1
>>
>> Not really. Because in example1 we did not have the writtenRep etc. So this example is incremental. I think it is fine.
>>
>> "Lexical entries are further specialized into words, affixes (e.g., suffix, prefix, infix or circumfix) and multiword expressions." then ontolex/example1
>>
>> "Of course, lexical entries need not to correspond to one word only, they can correspond to a multi-word term, as the following example for the lexical entry "intangible assets" shows:" then ontolex/example6
>>
>> ontolex/example6 seems to repeat the point and it is not clear why it does, could you revise the text before ontolex/example6?
>>
>>> 14. We need an example showing how we represent abbreviations relative to their full forms (JPM)
>>
>> True, can you add one example...
>>
>>> 15. In the definition of "other form" we should probably not say "non-dictionary" but "non-lemma". (JPM)
>>
>> Yes, agreed.
>
> We would rather say "non-lemmatic form".
>
> I don't think that is actually an English word... it is not in the OED, Wiktionary or my spellchecker.
>
>>> 16. ontolex/example10 is still not good. The "bank" part of the example makes no sense as it is two separate entries with separate meanings, but it is not well explained why "bank" is two entries. The second part of this example uses the word "apothecary", which is a highly unusual word in English and I would not (personally) say is truly synonymous with "pharmacist". I had suggested using "troll" as the example here, but that seems not to have been adopted. Perhaps we also need a separate example explaining "bank" here too? (JPM)
>>
>> I think the example is fine. Why does "bank" make no sense? The example gives guidance to people about how to model multiple meanings of a word.
>>
>> We don't explain why "bank" is two lexical entries and "apothecary"/"troll" is one.
>>
>> The case of bank shows the case where there are two different entries for the word and both the lexical entries and the meanings are unrelated.
>> The case of "apothecary" is the other case in which there is one lexical entry with two meanings.
>>
>> I am fine though if we replace the "apothecary" example by the "troll" example.
>>
>> It seems that both meanings are indeed in DBpedia:
>>
>> http://de.dbpedia.org/page/Troll_(Mythologie)
>> https://de.wikipedia.org/wiki/Troll_(Netzkultur)
>>
>> Ok then.
>
> We think that it would be clearer if we divide the example into two separate examples.
>
> Yes I do too
>
> As for the explanation included below the example, and I quote:
> "In the above example, two lexical entries have been used for /bank/.
> The reason is that in this case both words /bank/ are actually not grammatically related and thus represent two independent lexical entries with meanings that are not related",
> we are of the opinion that the statement "are actually not grammatically related" is unnecessary, since morphologically they have the same sequence of letters and are both nouns. Moreover, in a dictionary the entry would be the same. So we propose to simply remove "are actually not grammatically related and thus".
>
>>> 17. ontolex/example12 is listed in the text as synsem/example12! (JPM)
>>
>> ok.
>>
>>> 18. Terms like 'Lexicon' and 'Lexical Entry' should not be capitalized, they are not proper nouns (JPM)
>>
>> Yes.
>>
>>> 19. The lexical concept can be better explained as follows:
>>> The reference in the ontology primarily gives an interpretation of a word in terms of the identifiers that would be generated by the semantic parsing of the sentence. For example if we were to understand the query "when did John Lennon die?" we may understand the word "die" as generating the URI dbpedia:deathDate within a SPARQL query. In contrast many resources will also wish to record the intentional meaning of the word with the mental lexicon, such as "die" referring to the concept of death, for this reason we introduce the class lexical concept which can be evoked by a lexical entry in place of or as well as a denotation in the ontology, e.g.,
>>>
>>> :die a ontolex:Word ;
>>>   ontolex:denotes dbpedia:deathDate ;
>>>   ontolex:evokes wordnet:Dying .
>>> (JPM)
>>
>> OK, but I would add this in addition to the explanation we have as an elaboration. I like the way you have phrased this.
>
> We agree with adding this as an explanation, but not modifying the definition.
>
> I don't think I was proposing to modify the definition.
>
>>> 20. Capitalization in definition of OntoMap is wrong. (JPM)
>>
>> Why is it wrong?
>>
>>> 21. I don't like the paragraph 'An OntoMap resembles the SynSemCorrespondence...' as
>>> - The OntoMap does not really resemble synsemcorrespondence
>>> - I don't think we should compare to a closed standard like LMF that is unfamiliar to most of our audience
>>> - Talking about semantic arguments will only create more confusion
>>
>> Well, this is a major issue that I will bring up soon. I indeed see the OntoMap as the ontolex counterpart to the SynSemCorrespondence. In fact, I will argue not to regard OntoMap as a subclass of Lexical Sense. But let us not open this box today... ;-)
>>
>> I thought (actually hoped) this was closed too.
>>
>>> 22. All "dbpedia:" URIs should be fixed width (JPM)
>>
>> This point is not clear to me, sorry.
>>
>> Anything starting "dbpedia:" should be in fixed width
>>
>>> 23. Some examples use "dbonto" and some "dbpedia"... inconsistent. (JPM)
>>
>> Well, there are different namespaces in DBpedia as well. Should we be more consistent than DBpedia? We could try to stick to the ontology namespace however...
>>
>> We should be consistent, dbpedia: sometimes is short for http://dbpedia.org/ontology/ and sometimes for http://dbpedia.org/resource/ and sometimes should be short for http://dbpedia.org/property/ but isn't
>>
>>> 24. "The verb (to) launch" needs quotation marks (JPM)
>>
>> OK
>>
>>> 25. "Complex ontology mappings / submappings" talks about semantic arguments but this is confusing
>>
>> Not sure why this is confusing. I still see the subject and object position of a triple as arguments of the triple. Maybe the term "semantic" is confusing here?
>>
>> I want to remove any discussion of "semantic arguments" from this spec, these will be introduced in a future module (per the most recent agreement).
>>
>>> 26. Indentation of synsem/example8 needs to be fixed (JPM)
>>
>> OK
>>
>>> 27. "If element x decides if x"... this is not a maths paper, use English. (JPM)
>>
>> This comes from me. I thought this makes it clear that with isA we refer to the lambda-abstracted variable of a lambda expression or to the argument of a function that characterizes the set. I find this quite clear and think that it is understandable as such. But we can add an English sentence that clarifies this a bit.
>>
>>> 28. condition is defined as a subproperty of usage (JPM, see point 2)
>>>
>>> 29. "not found in many other languages" => "not found in some other languages and more importantly in some ontologies" (JPM)
>>
>> ok
>>
>>> 30. I am not sure from a linguistic point of view that it is correct to say that "otitis" is composed of the affix "itis" in decomp/example3. In particular there is no Spanish word "ot" and "-itis" is a Greek inflection not a true suffix. An easier example would be with a normal prefix such as "un-", "re-" or "dis-"...
>>
>> Well, it is. It is clearer if we use the term "apendicitis", in which "itis" again means inflammation. "apendic" stands for appendix. Is that better?
>>
>> "Appendic" is still not a word... we could choose an example which is clearer, e.g., "un-"...
>
> Appendic is not a word, but appendix + itis, or apéndice + itis (and the e is dropped)
> The suffix is added to the root, as in the case of "ot" + "itis"
> otos means "related to the ear" (referido al oído)
>
> 'otos' are birds (bustards to be precise) according to my Spanish dictionary...
>
> My point was to say that there are better examples
>
>>> 31. It appears that order information has been added to decomp/example6... this is not necessary if we know the order of the words from the main entry and this representation actually saves a triple (ergo IMHO is superior!)
>>>
>>> :AfricanSwineFever a ontolex:MultiwordExpression ;
>>>   rdf:_1 :African_node ;
>>>   rdf:_2 :Swine_node ;
>>>   rdf:_3 :Fever_node .
>>
>> It does not hurt to add this information.
>> Because the order is only implicit in the lexical entry. One would need to tokenize the lexical entry to get the order... Saving triples is not always good if one loses information that needs to be recovered...
>>
>> Either way you have to recover some information. If you keep the example as is then the tokenization of the lexical entry needs to be recovered, if you switch to my model the parse order needs to be recovered, but tokenization is more useful and efficient to represent.
>
> Why would it be more useful and efficient? Could you explain this?
>
> Efficient can be easily explained:
>
> :AfricanSwineFever a ontolex:MultiwordExpression .
>
> :AfricanSwineFever_root a decomp:Component ;
>   decomp:correspondsTo :AfricanSwineFever ;
>   decomp:constituent :African_node, :SwineFever_node ;
>   rdf:_1 :African_node ;
>   rdf:_2 :SwineFever_node ;
>   olia:hasTag penn:NP .
>
> :African_node a decomp:Component ;
>   decomp:correspondsTo :African ;
>   olia:hasTag penn:JJ .
>
> :SwineFever_node a decomp:Component ;
>   decomp:constituent :Swine_node, :Fever_node ;
>   rdf:_1 :Swine_node ;
>   rdf:_2 :Fever_node ;
>   olia:hasTag penn:NP .
>
> :Swine_node a decomp:Component ;
>   decomp:correspondsTo :Swine ;
>   olia:hasTag penn:NN .
>
> :Fever_node a decomp:Component ;
>   decomp:correspondsTo :Fever ;
>   olia:hasTag penn:NN .
>
> Triple count: 23
>
> :AfricanSwineFever a ontolex:MultiwordExpression ;
>   rdf:_1 :African_node ;
>   rdf:_2 :Swine_node ;
>   rdf:_3 :Fever_node .
>
> :AfricanSwineFever_root a decomp:Component ;
>   decomp:correspondsTo :AfricanSwineFever ;
>   decomp:constituent :African_node, :SwineFever_node ;
>   olia:hasTag penn:NP .
>
> :African_node a decomp:Component ;
>   decomp:correspondsTo :African ;
>   olia:hasTag penn:JJ .
>
> :SwineFever_node a decomp:Component ;
>   decomp:constituent :Swine_node, :Fever_node ;
>   olia:hasTag penn:NP .
>
> :Swine_node a decomp:Component ;
>   decomp:correspondsTo :Swine ;
>   olia:hasTag penn:NN .
>
> :Fever_node a decomp:Component ;
>   decomp:correspondsTo :Fever ;
>   olia:hasTag penn:NN .
>
> Triple count: 22
>
> 'Useful' is a bit harder, but in my career I have tokenized lots of things but not parsed anything like as many, ergo I think tokenization is more useful.
>
>>> 32. "adjective -> adverb variation" not sure what "minus greater than" means here. (JPM change to arrow)
>>>
>>> 33. "Translation" section lists the "following ways [of representing translation] of increasing ontological strength"... but they are clearly not increasing! I am not really sure what ontological strength means.
>>
>> This comes from me. I will revise it.
>>
>>> 34. The diagram for lime metadata needs to be updated. (JPM)
>>>
>>> 35. lime/example2 "jnp" => "jpn" (JPM)
>>>
>>> 36. I have a comment on "Verb form mood" that appears to never have been answered. I assume that my merge has no objections. (JPM)
>>>
>>> Regards,
>>> John
>
> Some more spotted misspellings and stylistic nuances:
>
> *Domain:* LexicalSense
>
> *Range:* rdfs:Ressource
>
> The combined usage of the properties denotes, sense, evokes, concept and lexicalized sense is demonstrated in the example below for the case of a lexical resource such as WordNet.
>
> OntoLex/Lemon has a much simpler usage, removing many elements that were in LMF
>
> The following example gives an example of a sense relation:
>
> Proposal:
>
> The following example illustrates a sense relation:
>
> The following example shows how to model the relation between "Food and Agriculture Organisation" and its initialism "FAO" as one example of a lexical relation
>
> Proposal:
>
> The following example shows how _to model the lexical relation between_ "Food and Agriculture Organisation" and its initialism "FAO"
>
> In the introductory paragraph to Syntactic Frames, we think it should be: stand *on* their own, and not *by* their own
> In the definition of Syntactic Frame, the definite article "the" is missing in "in terms of the (syntactic) arguments"
>
> A comma is missing in the sentence below "... the preposition in*,* ..."
> The following example shows how to specify that the intransitive verb /operate/, subcategorizing a prepositional phrase introduced by the preposition /in/ can be used to denote the property regionServed <http://dbpedia.org/ontology/regionServed> in DBpedia
>
> In the next sentence, examples should be in singular:
> The following example*s* shows how to use the submap <http://www.w3.org/community/ontolex/wiki/Final_Model_Specification#Submap> property to indicate that the meaning of the phrase "X launched Y in Z" is a composition of the properties dbpedia:product <http://dbpedia.org/ontology/product> and dbpedia:productionStartYear <http://dbpedia.org/ontology/productionStartYear>, which together express the meaning of the syntactic frame
>
> In the following sentence below example 6, quantifier should be quantifying
> "Indicating that an argument is optional means that it does not have to be realized syntactically in which case from a semantic point of view the corresponding semantic argument is existentially quantifier over."
>
> In the definition of Optional, we would avoid the use of "optional" in the explanation, and say instead: The optional property indicates that a syntactic argument can be omitted.
> The *optional* property indicates whether a syntactic argument is optional, that is, it can be syntactically omitted.
>
> In example 7 (Optional): a slash is missing, see:
>
> ontolex:reference <http:/ontology.org/giving> ;
>
> BTW, is http://ontology.org/giving correct???
>
> In example 9 there is a misspelling in Transportation, see:
>
> :methodOfTransporation a rdf:Property ;
>
> Is this example complete? Shouldn't it be pointing to an ontology??
>
> Below example decomp/example 2
> Revise the following sentence (verb is at the end...)
> "It is important to note that the subterm property does not indicate the position or even which words a subterm is."
>
> Finally, we see that sometimes the names of classes or properties have hyperlinks, but not always. Which should be the criterion to follow?
> See for example the paragraph below in which regionServed is sometimes hyperlinked, others highlighted in bold, or not highlighted at all (dbpedia:regionServed).
>
> "The following example shows how to specify that the intransitive verb /operate/, subcategorizing a prepositional phrase introduced by the preposition /in/ can be used to denote the property regionServed <http://dbpedia.org/ontology/regionServed> in DBpedia. The entry specifies that in a construction such as `X operates in Y', the X refers to the subject of the property dbpedia:regionServed, and the Y refers to the object of the property *regionServed*. Again, we use the LexInfo <http://www.lexinfo.net/> ontology in our example to provide linguistic information:"
>>
>> --
>> Prof. Dr. Philipp Cimiano
>> AG Semantic Computing
>> Exzellenzcluster für Cognitive Interaction Technology (CITEC)
>> Universität Bielefeld
>>
>> Tel: +49 521 106 12249
>> Fax: +49 521 106 6560
>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>
>> Office CITEC-2.307
>> Universitätsstr. 21-25
>> 33615 Bielefeld, NRW
>> Germany
>
> --
> Elena Montiel-Ponsoda
> Ontology Engineering Group (OEG)
> Departamento de Inteligencia Artificial
> ETS de Ingenieros Informáticos
> Campus de Montegancedo s/n
> Boadilla del Monte-28660 Madrid, España
> www.oeg-upm.net
> Tel. (+34) 91 336 36 70
> Fax (+34) 91 352 48 19

--
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
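
The recomputation mentioned in point 2 at the top of this message (deriving the order of non-terminal nodes from the order of the tokens) can be sketched roughly as follows. This is only an illustrative Python sketch, not part of the specification: the dictionaries and function names are hypothetical, the node names follow the African swine fever example quoted above, and it assumes that every constituent covers a contiguous span of tokens.

```python
# Token order, as it would be stated with rdf:_1, rdf:_2, rdf:_3
# on the lexical entry (hypothetical in-memory stand-in for the RDF graph).
TOKEN_ORDER = ["African_node", "Swine_node", "Fever_node"]

# decomp:constituent links, kept as unordered sets because an RDF graph
# does not record any order between the objects of the same property.
CONSTITUENTS = {
    "AfricanSwineFever_root": {"SwineFever_node", "African_node"},
    "SwineFever_node": {"Fever_node", "Swine_node"},
}


def first_token_position(node):
    """Return the position of the leftmost token covered by a node."""
    children = CONSTITUENTS.get(node)
    if not children:
        # Terminal node: its position is given directly by the token order.
        return TOKEN_ORDER.index(node)
    # Non-terminal node: recurse and take the leftmost covered token.
    return min(first_token_position(child) for child in children)


def ordered_children(node):
    """Return the children of a node sorted from left to right."""
    return sorted(CONSTITUENTS.get(node, ()), key=first_token_position)
```

With this data, ordered_children("AfricanSwineFever_root") puts African_node before SwineFever_node, and ordered_children("SwineFever_node") puts Swine_node before Fever_node. Stating rdf:_1 and rdf:_2 on the non-terminal nodes would make this traversal unnecessary, at the cost of the extra triples counted above.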
Received on Tuesday, 8 September 2015 06:35:52 UTC