Re: [GRAYMAIL] Re: regular date for telecons

On morphology, +1 on OLiA, http://www.acoli.informatik.uni-frankfurt.de/resources/olia/#olia


Also, attention should be given to MAF, the Morpho-syntactic Annotation Framework (ISO 24611).


I have been using OLiA for XML-based work across different languages (English, Greek, Latin, Coptic, and Syriac), and have encountered numerous lacunae, or places where one could/should question the taxonomy or the definition of a particular grammatical feature. But I believe that's OLiA's strength -- the enormous vocabulary provides a basis for distinguishing between commonly agreed concepts and ones that aren't, and for registering disagreement. Linguists don't agree even on the taxonomy of English. Witness OLiA's reluctance to synthesize all of the Brown Corpus tag set, where Brown's conjunction is not matched with olia:conjunction, I think because the Brown Corpus guidelines allowed some words that are only now widely regarded as prepositions (e.g., "because") to be tagged as conjunctions.


Best wishes,


jk

--
Joel Kalvesmaki
Editor in Byzantine Studies
Dumbarton Oaks
1703 32nd St. NW
Washington, DC 20007
(202) 339-6435
________________________________
From: Gilles Sérasset <gilles.serasset@imag.fr>
Sent: Tuesday, January 3, 2017 8:02:59 AM
To: John McCrae
Cc: Jorge Gracia; Philipp Cimiano; public-ontolex@w3.org
Subject: [GRAYMAIL] Re: regular date for telecons

Hi all, and Happy New Year,

I must say that I am also in favour of these topics (even if I do not have much time to work on this).

Concerning Lexico-syntactic categories, I went to OLiA for DBnary, but found some (very few) gaps when encoding some entries (missing cases/genders, etc.), especially when encoding the extensive morphology (set of alternative forms, annotated with morpho-syntactic features). But this may be due to my personal lack of knowledge of linguistic theory.

Concerning the morphology module, we indeed need to identify one or maybe several morphology ontologies that could be used to intensionally describe the morphology of some languages.

Concerning the Diachronic Module, I worked with Ester Pantaleo (under a MediaWiki grant) to extract etymology from Wiktionary, and we really lack an ontology of etymological relations and a set of best practices for etymology encoding. I did not see any published ontology for this, but I recently read some articles/work on it (https://hal.inria.fr/hal-01296498/). We also had problems identifying languages (ancient languages and variants) that Lexvo does not contain. We may extract the set of all languages mentioned in etymology sections (at least in the English-language edition).

Regards,

Gilles,


On 03 Jan 2017, at 11:44, John McCrae <john@mccr.ae> wrote:

Hi all,

I think the most requested features for OntoLex Lemon (at least to my knowledge) are as follows:


  1.  Lexico-syntactic categories: That is, all categories such as part-of-speech, gender, case, etc. should be standardised by some procedure agreed by the community. Currently this is done by LexInfo, but the process for proposing and correcting changes should be more open.
  2.  Morphology Module: The Monnet Lemon model had a morphology module, but it was seen as quite insufficient and was not widely adopted. A better system for modelling morphology, perhaps based on Bettina Klimek's MoOn ontology, could help in some use cases.
  3.  Diachronic (historical) Module: The OntoLex Lemon model has no real way of representing etymology or historical usages of terms. I believe there were some suggestions in this direction from both Fahad Khan and Christian Chiarcos.

Regards,
John

On Thu, Dec 15, 2016 at 9:53 AM, Jorge Gracia <jgracia@fi.upm.es> wrote:
Dear Philipp,

> I propose we move the teleconference to the new year to January 12th, 16:00 CET

Fine with me

> Meanwhile, I would like to prepare the teleconference by asking you all to share some thought on modules or
> extensions to the ontolex model you would like to see discussed in the group next year.

As suggested in a previous email, I'd like to propose a module for lexicography [1]. We could analyse how to proceed with this in the next telco.

Best regards, and happy Christmas holidays,

Jorge

[1] https://lists.w3.org/Archives/Public/public-ontolex/2016Nov/0014.html



--
Jorge Gracia, PhD
Ontology Engineering Group
Artificial Intelligence Department
Universidad Politécnica de Madrid
http://jogracia.url.ph/web/

Received on Tuesday, 3 January 2017 15:11:33 UTC