- From: Nicola Carboni <nicola.carboni@uzh.ch>
- Date: Thu, 9 May 2019 08:54:45 +0200
- To: Julia Bosque Gil <jbosque@fi.upm.es>, Frances Gillis-Webber <fran@fynbosch.com>
- Cc: public-ontolex@w3.org
- Message-Id: <F689964E-0042-4276-BBF2-58192131520B@uzh.ch>
> On 7 May 2019, at 15:52, Julia Bosque Gil <jbosque@fi.upm.es> wrote:
>
> You can include information about a specific region, script or variant using language tags and subtags [1, 2, 3] with the ontolex:writtenRep. In that respect, the OntoLex Spec provides more links in this part:
>
> Furthermore, we require that instances of the model adhere to the RDF 1.1 specification <http://www.w3.org/TR/rdf11-concepts/> and follow the appropriate guidelines. In particular, we require that language tags adhere to Best Common Practice 47 <http://www.rfc-editor.org/rfc/bcp/bcp47.txt>, where tags are made up of a language code (based on ISO 639 codes part 1, 2, 3 or 5 <http://www.iso.org/iso/home/standards/language_codes.htm>), optionally followed by a hyphen and a ISO 3166-1 <http://www.iso.org/iso/iso-3166-1_decoding_table.html> country code. Language tags may also contain further subtags <https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry> expressing e.g. the region, script or further variants.
>
> In the Lexicography page [4] (Issue 5) we discussed that ontolex:usage could be applied to cases in which you want to specify that a sense of an entry is attested in (only) a particular region. Another option is the use of lexvo:usedIn (with range lexvo:GeographicRegion) [5], but the property is described as "The property of a language or writing system [emphasis added] being used somewhat extensively in a particular geographical region at some point in time" (although the domain is not restricted). In our work with K Dictionaries in 2016 [6] we decided to create a custom property kd:geographicalUsage for this, but I would say that opting for the language tag option, whenever possible, would be preferable.
>
> Hope this helps :)

It definitely helps! Thank you, Julia.
While language tags are quite useful, and I will employ them, they are a bit limiting with respect to very small communities of speakers (a normal limitation for a finite list).
Having to work on a model that also covers such cases, I was looking to ground the information with respect to geographical regions. I was not aware of the lexvo ontology, so thank you very much for it. It seems to partially resolve my problem (however, it does imply declaring an instance of a language in order to use the property, which is not quite what I want).
On another note, I searched a bit for the KD vocabulary extension you mentioned in the article, but I could not find any links. Do you have one?

> On 7 May 2019, at 17:23, Frances Gillis-Webber <fran@fynbosch.com> wrote:
>
> Hi Nicola
>
> My colleagues and I have approached it in two different ways:
>
> (1) Encoding the spatial data in the language tag, when language-tagging a string literal
>
> Each latlon coordinate can be converted to a geohash (can be low precision), and then the region can be represented as a polygon. For a polygon, the first and last coordinate is the same, so we have excluded the last coordinate from the string, separating each geohash with a "--". This string can be included in the privateuse portion of the language tag.
>
> We have described the solution in detail in the paper [1].
>
> (2) Modelling the language data
>
> In the Ontolex-Lemon specification, a language is modelled as follows:
>
> <lexical entry> dct:language <to a language code URI>
>
> However, in place of the language code URI, you could use your own URI, and then model the geographic data from there.
>
> We have created a lightweight ontology for language annotation called MoLA, and have accounted for both custom language tags and regions in the model. The solution is described in [2].
>
> The ontology is here: http://ontology.londisizwe.org/mola
>
> I'm currently working on the specification so I can supply proposed modelling using MoLA, if you like?

Hi Frances, this is definitely interesting. Would you mind sending me the article about approach 1?
I would like to use WKT, but your solution seems pretty straightforward and very useful at the application level.
Regarding solution 2, it seems to work almost perfectly for me because I can use it to describe language information in time and space (thank you for the link with wgs84! :-) ), and it does describe the several layers of variance I need. Thank you for it; it seems pretty straightforward, so no need for the documentation!

Best,
Nicola

> Kind regards,
> Frances
>
> [1] Accepted at LDK 2019: The Shortcomings of Language Tags for Linked Data when Modeling Lesser-Known Languages (F. Gillis-Webber & S. Tittel) (I can send you a PDF)
> [2] Accepted at KGSWC 2019: A Model for Language Annotations on the Web (F. Gillis-Webber, S. Tittel and C.M. Keet). PDF: http://www.meteck.org/files/KGSWC19mola.pdf
>
> On Tue, 7 May 2019 at 15:54, Nicola Carboni <nicola.carboni@uzh.ch> wrote:
> > Dear ontolex community,
> >
> > I am currently modelling some data using ontolex and lexinfo.
> > However, I have some doubts on how to relate a lexical entry to a specific spatial area. My intent is to declare that an entry is being used in a specific region in a country or in a limited spatial area (a valley, or in a small island for example).
> >
> > I was wondering if anyone had faced the same challenge and which are the adopted solutions to the problem.
> >
> > Best,
> >
> > Nicola
> >
> > --
> > Nicola Carboni
> > Research Fellow
> > University of Zurich Post Box 23
> > Ramistrasse 71 8006 Zurich
> > Switzerland
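
For readers following the thread, approach (1) from Frances's message can be sketched roughly as below. This is only an illustration under stated assumptions, not the implementation from the paper [1]: the function names (geohash_encode, region_private_use, tag_with_region), the precision of 5, the language code "es" and the sample coordinates are all hypothetical. The "--" separator and the dropping of the closing polygon coordinate follow the description in the thread; the "x-" prefix is the BCP 47 private-use singleton. The exact tag layout used in the paper may differ.

```python
# Standard geohash base32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 5) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # geohash bit interleaving starts with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits yield one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def region_private_use(ring, precision: int = 5, separator: str = "--") -> str:
    """Join the geohashes of a polygon ring given as (lat, lon) pairs.

    The closing coordinate is dropped when it repeats the first one,
    as described in the thread. The separator is an assumption taken
    from the email text.
    """
    points = list(ring)
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    return separator.join(geohash_encode(lat, lon, precision) for lat, lon in points)


def tag_with_region(language: str, ring, precision: int = 5) -> str:
    """Append the region string to a language code as a private-use part."""
    return f"{language}-x-{region_private_use(ring, precision)}"


if __name__ == "__main__":
    # Hypothetical triangle; coordinates are made up for illustration.
    ring = [(42.6, -5.6), (57.64911, 10.40744), (46.5, 9.8), (42.6, -5.6)]
    print(tag_with_region("es", ring))
```

The resulting string would then be used as the language tag of an ontolex:writtenRep literal. A lower precision gives shorter geohashes that cover coarser cells, which may be enough for a valley or a small island.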
Received on Thursday, 9 May 2019 06:55:20 UTC