
Re: I18N issues and OWL2

From: Axel Polleres <axel.polleres@deri.org>
Date: Mon, 14 Jul 2008 10:47:32 +0100
Message-ID: <487B20B4.1080907@deri.org>
To: Felix Sasaki <fsasaki@w3.org>
CC: Ivan Herman <ivan@w3.org>, "Phillips, Addison" <addison@amazon.com>, Jie Bao <baojie@cs.rpi.edu>, "public-owl-wg@w3.org" <public-owl-wg@w3.org>, "public-i18n-core-comments@w3.org" <public-i18n-core@w3.org>, "public-rif-comments@w3.org" <public-rif-comments@w3.org>, Boris Motik <boris.motik@comlab.ox.ac.uk>

Felix Sasaki wrote:
> Ivan Herman wrote:
>>
>>
>> Axel Polleres wrote:
>> [snip]
>>
>>> Sure!
>>>
>>> As for the namespace, I personally prefer rdf:, sharing Jos'
>>> arguments that doing so is, in my opinion, NOT problematic.
>>> Several rdf:-namespaced properties already have no specified
>>> formal semantics (reification has been mentioned already, so
>>> what).
>>>
>>
>> Yes, that is indeed a good point.
>>
>> [snip]
>>
>>>
>>> A probably more feasible solution would be a real type hierarchy
>>> for language tags: instead of a datatype
>>> owl:internationalizedString or rif:text which has pairs of strings
>>> and language tags as lexical space, define separate datatypes
>>> (and subtypes) for each lang-tag, i.e.
>>>
>>> use:
>>>
>>> message("Hello"^^lang:en-US)
>>>
>>> where e.g. lang:en-US is a subtype of lang:en, i.e.
>>> that would also imply
>>>
>>> message("Hello"^^lang:en)
>>>
>>> (just as xsd:integer is a subtype of xsd:decimal in
>>> the XML Schema type hierarchy, see
>>> http://www.w3.org/TR/xmlschema-2/#built-in-datatypes)
>>>
>>> Anything wrong with that? To me this seems much cleaner than this 
>>> fiddling around with pairs of strings and lang-tags.
>>>
>> [snip]
>>
>> This is indeed quite nice, I must say. Addison already referred to one 
>> caveat that I intended to raise, namely the possibly high number of 
>> language tags (by the way, [1] gives a fairly readable overview of 
>> those). Let us see where that discussion goes...
> 
> 
> This caveat might be a severe problem of this approach. The BCP 47
> language tags rely on a generative approach using the ABNF in BCP
> 47 (so-called "well-formed" language tags) and, in addition, on the
> registry of subtags. I'm not sure if it will be feasible to relate
> these two types of conformance to the planned OWL2 datatype hierarchy,
> though I think it would be highly desirable ...

The question then is rather whether we still want to take the somewhat
crooked detour of having language tags outside the datatypes. I mean, in
what sense does the generic datatype rif:text or
owl:internationalizedText*) solve the problem instead of just hiding it?

BCP 47 says: "Subtags are distinguished and separated from one another 
by a hyphen ("-", ABNF [RFC4234] %x2D)."

So, why could a lang: datatype hierarchy not simply state that the
hierarchy is defined *implicitly*? We don't need to list this hierarchy
explicitly, but could just define:

   <i>lang:tag1</i> is a supertype of <i>lang:tag2</i> if and only if
   <i>tag1</i> is a prefix of <i>tag2</i>, where <i>tag1</i> and
   <i>tag2</i> are both valid language tags, following [BCP 47].
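
A minimal sketch of this implicit rule in Python. It assumes that
"prefix" is meant at subtag boundaries (so that "en" is not treated as
a prefix of "eng") and that tags are compared case-insensitively. The
function names are hypothetical, and the check that both tags are valid
per [BCP 47] is left out:

```python
def subtags(tag: str) -> list[str]:
    """Split a language tag into lowercase subtags at the hyphen separator."""
    return tag.lower().split("-")


def is_supertype(tag1: str, tag2: str) -> bool:
    """True if lang:tag1 would be a supertype of lang:tag2 under the prefix rule.

    Assumption: the prefix test is applied to whole subtags, not characters.
    Tag validity against BCP 47 is NOT checked here.
    """
    a, b = subtags(tag1), subtags(tag2)
    return len(a) <= len(b) and b[:len(a)] == a


print(is_supertype("en", "en-US"))     # True
print(is_supertype("en", "en-GB"))     # True
print(is_supertype("en-US", "en-GB"))  # False
print(is_supertype("en", "eng"))       # False
```

Under this reading every tag is also (trivially) a supertype of itself,
since a sequence of subtags is a prefix of itself.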

Maybe I am oversimplifying things here, but I really don't understand
the deep problem with this approach. There probably is one, but I'd
appreciate it if someone could point it out to me explicitly. Would it
be a problem if all these datatypes had the same lexical space?

Thanks for clarification,

Axel

*) no objection about coinflipping as suggested by Ian here, btw, if we 
want to stick with it

>>
>> Another issue we have to look at is how well this works with the
>> OWL design (I have explicitly added Boris on the cc list to draw his
>> attention:-). My understanding of the current datatype restriction
>> design[2] is that one can define facets for a specific datatype, but
>> not across several datatypes; on the other hand, in this proposal the
>> datatypes for 'en-us' and 'en-gb' would be different, and both would be
>> different from 'en' (although 'en-us' and 'en-gb' would both be
>> subtypes of 'en'). How could I define facets that involve all these?
>> Would that work well with the OWL design? I actually hope we can find
>> a way, because the usage of these URI-s looks quite elegant...
>>
>> Cheers
>>
>> Ivan
>>
>>
>> [1] http://www.w3.org/International/articles/language-tags/
>> [2] http://www.w3.org/2007/OWL/wiki/Syntax#Datatype_Restrictions

-- 
Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
email: axel.polleres@deri.org  url: http://www.polleres.net/

Everything is possible:
rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource.
rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf.
rdf:type rdfs:subPropertyOf rdfs:subClassOf.
rdfs:subClassOf rdf:type owl:SymmetricProperty.
Received on Monday, 14 July 2008 09:48:23 GMT
