
Re: [open-linguistics] Question: replacing language codes in a SPARQL BIND statement?

From: Christian Chiarcos <chiarcos@informatik.uni-frankfurt.de>
Date: Sun, 13 Mar 2016 18:49:06 +0100
To: "A list for those interested in open data in linguistics." <open-linguistics@lists.okfn.org>, "John McCrae" <john@mccr.ae>
Cc: public-ontolex <public-ontolex@w3.org>
Message-ID: <op.yd9yf4hd89jat0@kitaba.sitecomwlr2100>

> As far as I know there is no provision in SPARQL for querying while
> ignoring the language tag. In RDF at least "cat", "cat"@en and
> "cat"@en-GB are all different values. Perhaps you could ask this
> question on a list like public-lod@w3.org or semantic-web@w3.org?

Of course, but there, this is probably considered a pathological case of  
marginal relevance. Before I suggest extending the semantics of BIND to  
cover this particular problem, I'd like to be sure I didn't miss anything  
;)
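
For reference, SPARQL 1.1 already defines STR() and STRLANG(), which together can re-tag a literal inside BIND without changing the semantics of BIND itself. A minimal sketch, assuming a hypothetical ex: namespace and the hypothetical properties ex:gloss (Chalkan dictionary) and ex:writtenForm (DBnary); the real vocabularies will differ:

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace, for illustration only

SELECT ?chalkan ?ru ?ruNorm ?entry
WHERE {
  # Russian gloss from the Chalkan dictionary, e.g. "кошка"@rus
  ?chalkan ex:gloss ?ru .

  # STR() drops the language tag; STRLANG() attaches a new one.
  # The result is a literal such as "кошка"@ru.
  BIND(STRLANG(STR(?ru), "ru") AS ?ruNorm)

  # Look up the re-tagged literal in the second dictionary.
  ?entry ex:writtenForm ?ruNorm .
}
```

This only helps if the target tag is known in advance (here "ru"); whether the lookup after the BIND is evaluated efficiently depends on the SPARQL engine.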

Best,
Christian

> On Sun, Mar 13, 2016 at 11:09 AM, Christian Chiarcos  
> <chiarcos@informatik.uni-frankfurt.de> wrote:
>> Dear all,
>>
>> this is a general technical question, albeit one specific to working
>> with multilinguality issues in multiple lemon/ontolex dictionaries,
>> hence I'm asking here in the first place.
>>
>> Imagine the following situation: I use the Russian DBnary (provided in
>> a slightly extended variant of the old lemon) and an ontolex dictionary
>> for Chalkan (with Russian glosses). Both are provided by third parties,
>> and I do not want to manipulate the data prior to querying. Now, I want
>> to use DBnary to retrieve an English gloss for the Chalkan words in a
>> single SPARQL query.
>>
>> If both dictionaries use the same xml:lang representation, this works
>> rather well (I skip the query for reasons of brevity): I bind the
>> Russian gloss from the Chalkan dictionary to variable ?ru and start
>> searching DBnary for a data property that assigns ?ru as literal.
>>
>> It is more complicated, though, if the two files use different language
>> codes, e.g., ISO-639-3 (rus) and ISO-639-1 (ru) for Russian, or if a
>> language code with a region subtag is used (e.g., ru-RU). Is there any
>> way to use, say, BIND to bind the string value of ?ru to a new variable
>> that uses ISO-639-1 codes instead of the original ISO-639-3 (resp.
>> ISO-639-1+ISO-3166) code?
>>
>> At the moment, I see only one way to solve this problem, i.e., using
>> FILTER, str() and a string comparison of both variables. This should be
>> fairly inefficient, though, as I presume the FILTER is applied only
>> after all potential bindings for both variables for Russian terms have
>> been determined.
>>
>> Am I overlooking anything?
>>
>> Best,
>> Christian
>> --
>> Prof. Dr. Christian Chiarcos
>> Applied Computational Linguistics
>> Johann Wolfgang Goethe Universität Frankfurt a. M.
>> 60054 Frankfurt am Main, Germany
>>
>> office: Robert-Mayer-Str. 10, #401b
>> mail: chiarcos@informatik.uni-frankfurt.de
>> web: http://acoli.cs.uni-frankfurt.de
>> tel: +49-(0)69-798-22463
>> fax: +49-(0)69-798-28931
>
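
The FILTER-based workaround described in the quoted question can be sketched as follows. The ex: namespace and the properties ex:gloss, ex:writtenForm and ex:translation are hypothetical placeholders, not the actual lemon/ontolex or DBnary vocabulary:

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace, for illustration only

SELECT ?chalkan ?ru ?en
WHERE {
  # Russian gloss from the Chalkan dictionary, with whatever tag it carries
  ?chalkan ex:gloss ?ru .

  # Candidate entries from the second dictionary
  ?entry ex:writtenForm ?form ;
         ex:translation ?en .

  # Compare the lexical forms only, ignoring the language tags
  FILTER(STR(?form) = STR(?ru))

  # langMatches() accepts "ru" and "ru-RU" for the range "ru",
  # but not "rus", which is a different tag
  FILTER(langMatches(LANG(?form), "ru"))
  FILTER(langMatches(LANG(?en), "en"))
}
```

Because the join is expressed through STR() rather than a shared variable, an engine may have to enumerate candidate pairs before filtering; how much of this is optimised away is implementation-dependent.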



-- 
Prof. Dr. Christian Chiarcos
Applied Computational Linguistics
Johann Wolfgang Goethe Universität Frankfurt a. M.
60054 Frankfurt am Main, Germany

office: Robert-Mayer-Str. 10, #401b
mail: chiarcos@informatik.uni-frankfurt.de
web: http://acoli.cs.uni-frankfurt.de
tel: +49-(0)69-798-22463
fax: +49-(0)69-798-28931
Received on Monday, 14 March 2016 20:16:22 UTC
