Re: Proposal for ISSUE-12, string literals

On 05/18/2011 10:37 AM, Steve Harris wrote:
> On 2011-05-17, at 21:01, Pierre-Antoine Champin wrote:
>> sorry, some second thoughts
>> On 05/17/2011 09:03 PM, Pierre-Antoine Champin wrote:
>>> On 05/17/2011 11:06 AM, Steve Harris wrote:
>> <snip/>
>>>> So, I'm guessing as a formulation that rdflang:en would be a subtype
>>>> of xsd:string,
>> as far as I understand, currently "chat"^^xsd:string ≠ "chat"@en and
>> "chat" ≠ "chat"@en, and more generally no xsd:string or simple literal
>> is equal to a plain literal with language tag. So the respective
>> datatypes should have disjoint value spaces, hence no subtype relation.
>>>> and rdflang:en-GB would be a subtype of rdflang:en, and
>>>> so on?
>> I'm not even sure "en-GB" is a valid language tag, reading [1]:
> It's a region subtag, see
> So, yes it is a valid language tag.

I meant "valid language tag *in RDF*" of course. But I guess the URL you
refer to can apply to RDF as well (as language tags in RDF are obviously
inherited from xml:lang).

>>  Note: When using the language tag, care must be taken not to confuse
>>  language with locale. The language tag relates only to human language
>>  text. Presentational issues should be addressed in end-user
>>  applications.
>> [1]
>> but if it is, literals with @en-GB are disjoint from literals with @en
>> and so the respective datatypes should be disjoint as well.
> It's not quite that simple. @en matches @en-GB, but they're not equal c.f.
> and

Do they *match* in the sense of the model theory? In other words, does

  :a :b "chat"@en-GB .

entail

  :a :b "chat"@en .

in any entailment regime defined by the RDF semantics?

I don't think so, which does not mean that it is not an interesting
thing to consider, although it looks like a tricky can of worms...

In any case, I don't think that this entailment would mean that
rdflang:en would be a supertype of rdflang:en-GB, as their value spaces
would still be disjoint, in my view.
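
To make the distinction between "matching" and "being equal" concrete,
here is a minimal Python sketch. It assumes that "matches" means
prefix-based language-range matching in the style of RFC 4647 basic
filtering; that reading is an assumption, as is the helper name
lang_matches, which is hypothetical and not taken from any RDF
specification. Values are modelled as (text, language-tag) pairs.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic-filtering style match (assumed to follow RFC 4647):
    the range matches a tag if both are equal, or if the range is a
    prefix of the tag immediately followed by "-". Case-insensitive."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


# Values modelled as <text, language-tag> pairs (tags lowercased).
chat_en_gb = ("chat", "en-gb")
chat_en = ("chat", "en")

# The range "en" matches the tag "en-GB", but not the other way round...
range_matches = lang_matches("en-GB", "en")
reverse_matches = lang_matches("en", "en-GB")

# ...while the two values themselves remain distinct.
values_equal = chat_en_gb == chat_en
```

Under these assumptions, range_matches is true while values_equal is
false: matching is a filtering relation over tags, not an identity (or
entailment) between literal values.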


>>>> A few practical considerations:
>>>> 1) ISO language codes are not case sensitive, IRIs are. "foo"@fr =
>>>> "foo"@FR, "foo"^^rdflang:fr != "foo"^^rdflang:FR. We'd need to define a
>>>> canonical case for the datatype form.
>>> I hadn't thought of that either, but yes, canonical case sounds like the
>>> right thing to do.
>> and according to [1] again, the language tag is normalized to lowercase
>> in the abstract syntax.
> OK, that's easy.
>> <snip />
>>>> 4) Is the value space all UTF-8 strings? If not, is it a type error
>>>> to write "מחשב"^^rdflang:en?
>>> well, currently I guess any UTF-8 string is valid. So yes, the value
>>> space of all those datatypes would be all UTF-8 strings, if only
>>> for the sake of BC (and because I sure don't want to walk down that path...)
>> sorry, I was reading "lexical space".
>> The value space would be isomorphic to the set of UTF-8 strings, but
>> different for each "language datatype". Defining it as the set of pairs
>> <text, language-tag> as in RDF Semantics seems like a good option.
> Sounds reasonable.
> - Steve
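
As a rough illustration of the points agreed above (canonical lowercase
tag, value space as <text, language-tag> pairs), here is a small Python
sketch. The names LangString, lang_value and datatype_iri are
hypothetical, and "rdflang:" is only the prefix used informally in this
thread, not an existing namespace.

```python
from typing import NamedTuple


class LangString(NamedTuple):
    """A value in a hypothetical "language datatype": a <text, tag> pair."""
    text: str
    tag: str


def lang_value(text: str, tag: str) -> LangString:
    # Normalize the tag to lowercase, as the abstract syntax is said to do.
    return LangString(text, tag.lower())


def datatype_iri(tag: str) -> str:
    # Canonical-case datatype name; "rdflang:" is an assumed prefix.
    return "rdflang:" + tag.lower()


# "foo"@fr and "foo"@FR denote the same value once the tag is normalized.
same_value = lang_value("foo", "fr") == lang_value("foo", "FR")

# Without a canonical case, the two datatype IRIs would differ as strings.
raw_iris_equal = "rdflang:fr" == "rdflang:FR"
canonical_iris_equal = datatype_iri("fr") == datatype_iri("FR")

# Different tags give disjoint value spaces, and neither overlaps with
# plain strings (standing in here for xsd:string values).
disjoint_tags = lang_value("chat", "en") != lang_value("chat", "en-GB")
disjoint_from_string = lang_value("chat", "en") != "chat"
```

The pair representation keeps every language datatype isomorphic to the
set of strings while ensuring that no value is shared between two tags
or with a plain string.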

Received on Wednesday, 18 May 2011 09:07:35 UTC