- From: Boris Motik <boris.motik@comlab.ox.ac.uk>
- Date: Thu, 17 Jul 2008 09:54:22 +0100
- To: <public-owl-wg@w3.org>
Hello,

At yesterday's teleconf, Ivan drew my attention to the proposal by Axel Polleres for owl:internationalizedString: http://lists.w3.org/Archives/Public/public-owl-wg/2008Jul/0223.html

If I understood the idea correctly, Axel is proposing to have a datatype per language tag. Thus, you'd have something like the lang:en datatype, which would contain all strings in English (lang: is a namespace prefix yet to be defined). Furthermore, you might have the lang:en-US datatype, which would contain all strings in the US variant of English. The datatype lang:en-US would be a subdatatype of lang:en; hence, if you asked for all strings in English, you would also obtain all strings in the US variant. Please correct me if I summarized the proposal incorrectly -- I apologize in advance.

I'm not really sure what the value space of all these datatypes would be. If you want to make literals of the form "aaa"@en and "aaa"@en-US be different things (i.e., if you want to give them different identity), then you need to have different objects in the value space. Axel's e-mail is silent about the value spaces; however, I assume that each literal with a language tag is still mapped to a pair of the form (text,langTag). If this were not the case -- for example, if you mapped "aaa"@en and "aaa"@en-US to the same object "aaa" -- then there would be no way to distinguish different values in the interpretation of lang:en and lang:en-US. Hence, it seems reasonable to me to assume that the value space of the datatypes in Axel's proposal is identical to the value space of my proposal in (http://lists.w3.org/Archives/Public/public-owl-wg/2008Jul/0306.html).

Furthermore, Axel's proposal is silent about the treatment of xsd:string. Since the value spaces in my proposal and his are the same, however, I don't see any problem in mapping literals of the form "aaa"^^xsd:string into ("aaa","") -- that is, into pairs with the empty language tag.
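To make the assumed value space concrete, here is a minimal Python sketch of the (text,langTag) pairs described above. The names IString and interpret are hypothetical and exist only for illustration; they are not part of either proposal or of any specification.

```python
from typing import NamedTuple, Optional


class IString(NamedTuple):
    """A value in the assumed value space: a (text, langTag) pair."""
    text: str
    lang_tag: str  # "" stands for "no language tag"


def interpret(lexical: str, lang_tag: Optional[str] = None) -> IString:
    """Map a literal to its value (hypothetical helper).

    "aaa"@en           -> ("aaa", "en")
    "aaa"^^xsd:string  -> ("aaa", "")
    """
    return IString(lexical, lang_tag or "")


en = interpret("aaa", "en")
en_us = interpret("aaa", "en-US")
plain = interpret("aaa")

# The same text with different language tags yields different values,
# which is what lets lang:en and lang:en-US be told apart.
print(en == en_us)
print(plain)
```

If both literals were instead mapped to the bare string "aaa", the first comparison would hold and the two datatypes could no longer be distinguished by their values.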
In fact, it seems to me that Axel's proposal is more related to ISSUE-71, which asks for a mechanism for identifying all strings in a particular language. My proposal hasn't so far addressed this issue at all. In fact, I believe that ISSUE-71 is orthogonal to the problem of structuring the value space of internationalized strings (which is the main goal of ISSUE-126). To be more precise, I believe that, if we addressed ISSUE-126 in the way I outlined earlier, there would be nothing preventing us from employing Axel's proposal for addressing ISSUE-71. The only thing we need to do is define the value space for each of the different lang:* datatypes. For example, the value space of lang:en would be defined as the set of pairs of the form ("*","en[-*]") (I hope you understand my pidgin regular expressions).

To summarize, I believe we can go forward with ISSUE-126 and come back to ISSUE-71 later.

Regarding Axel's proposal for addressing ISSUE-71, it seems quite reasonable. I would like, however, to point out that ISSUE-71 can be addressed in a rather simple way by adding another facet, langTagPattern. This facet would take a regular expression and would restrict the value space of owl:internationalizedString to the set of pairs in which the language tag matches the regular expression. For example, the datatype restriction

    DatatypeRestriction( owl:internationalizedString langTagPattern "en[-*]" )

would have as the value space the set of pairs of the form ("*","en[-*]") and would thus select all strings written in some variant of English. In contrast,

    DatatypeRestriction( owl:internationalizedString langTagPattern "en" )

would have as the value space the set of pairs of the form ("*","en") and would select only the strings that have no sublanguage specified.
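The effect of such a facet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function lang_tag_pattern and the sample data are hypothetical, the pidgin pattern "en[-*]" is rendered as the ordinary regular expression en(-.*)?, and the pattern is required to match the whole language tag.

```python
import re
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (text, langTag); "" means no language tag


def lang_tag_pattern(values: Iterable[Pair], pattern: str) -> List[Pair]:
    """Keep the pairs whose language tag matches `pattern` in full.

    Hypothetical stand-in for the proposed langTagPattern facet.
    """
    rx = re.compile(pattern)
    return [v for v in values if rx.fullmatch(v[1])]


sample = [("colour", "en"), ("color", "en-US"), ("Farbe", "de"), ("aaa", "")]

# "en[-*]" in the pidgin notation: English with or without a sublanguage.
english_any = lang_tag_pattern(sample, r"en(-.*)?")

# Plain "en": only strings with no sublanguage specified.
english_only = lang_tag_pattern(sample, r"en")

# Empty pattern: only values with no language tag at all.
untagged = lang_tag_pattern(sample, r"")

print(english_any)
print(english_only)
print(untagged)
```

Under these assumptions the value space selected by "en" is a subset of the one selected by the broader pattern, which mirrors the subdatatype relationship between lang:en-US and lang:en in Axel's proposal.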
The regular expressions would thus provide us with quite a bit of flexibility; in particular, they would allow us to explicitly distinguish between values with no language tag, with only the language tag, with a language+sublanguage tag, and so on. I also believe the proposal would be really simple to implement: the extensions to the datatype reasoning algorithm from my ISWC 2008 paper are rather trivial.

Regards,

Boris
Received on Thursday, 17 July 2008 08:56:00 UTC