Re: xml:lang and XML infoset: two new datatypes from Patrick Stickler on 2002-09-23 (w3c-rdfcore-wg@w3.org from September 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 23 Sep 2002 20:20:11 +0300
To: "ext Sergey Melnik" <melnik@db.stanford.edu>
Cc: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "RDF Core" <w3c-rdfcore-wg@w3.org>
Message-ID: <001301c26325$78f13a50$2280720a@NOE.Nokia.com>

[Patrick Stickler, Nokia/Finland, (+358 50) 483 9453, patrick.stickler@nokia.com]

----- Original Message ----- 
From: "ext Sergey Melnik" <melnik@db.stanford.edu>
To: "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>; "RDF Core" <w3c-rdfcore-wg@w3.org>
Sent: 23 September, 2002 18:33
Subject: Re: xml:lang and XML infoset: two new datatypes

> Patrick Stickler wrote:
> 
> > I've tried to make the following point before, and will try again.
> > 
> > The datatype of a literal is disjunct from any xml:lang 
> > attribution, and a literal can be specified for both. E.g.
> > 
> >    xsd:string"This string is not a valid token."-en
> >    xsd:token"moi"-fi
> 
> 
> Language-tagged strings in RDF are not subtypes of xsd:strings, and 
> aren't subtypes of xsd:tokens either. Is that what you are claiming? 

Yes. Datatypes are formal languages with very precise interpretations
of their lexical representations. And we often mix formal and natural
language interpretations for the same lexical representation. See below.

> In 
> mathematical terms, there is no total injective function from 
> language-tagged strings to xsd:string...

I'll take your word for it ;-)

>  
> > Thus, it is not always the case that the datatype for language
> > qualified literals is xsd:string. It may be some subtype of
> > xsd:string or other string type, and the specific datatype
> > is of course significant.
> 
> 
> No, it cannot be a subtype of string. It can only be a derived type 
> defined over a cross-product xsd:string x xsd:string.

Ummm, isn't a derived type a subtype? Perhaps we need a mini-glossary
for this discussion.

> 
> > And although the xml:lang does not affect the L2V mapping
> > and is ignored by the datatyping machinery, it still is
> > relevant to applications.
> 
> 
> Of course, therefore language-tagged strings should be a separate 
> datatype. I don't see any utility of making the language attribution 
> orthogonal to datatyping.

I've already given one example of how one may wish to constrain
property values to datatypes other than xsd:string and still
specify the language in question.

We may wish to say that the datatype for a given property (or literal)
is a token list, or an XML name, or some other string-derived
datatype, yet also state that values of that property (or the
particular literal) have meaning according to a particular language.

These two types of attribute, datatype and language, are disjunct, even
though there are similarities in the machinery of datatyping and
natural language.
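
As an illustration only, the separation can be sketched in Python. The
names Literal and L2V_MAPPINGS, and the simplified whitespace handling
for xsd:token, are assumptions made for this sketch and are not part of
any RDF or XML Schema specification:

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


def string_l2v(lexical: str) -> str:
    # xsd:string: the value is the lexical form itself.
    return lexical


def token_l2v(lexical: str) -> str:
    # Simplified stand-in for xsd:token: collapse runs of XML whitespace
    # (space, tab, CR, LF) and trim leading/trailing spaces.
    return re.sub(r"[ \t\r\n]+", " ", lexical).strip(" ")


# Hypothetical registry of lexical-to-value (L2V) mappings.
L2V_MAPPINGS: Dict[str, Callable[[str], str]] = {
    "xsd:string": string_l2v,
    "xsd:token": token_l2v,
}


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # e.g. "xsd:token"
    lang: Optional[str] = None      # e.g. "fi"; independent of datatype

    def value(self) -> str:
        # The L2V mapping looks only at the datatype and lexical form;
        # the language tag is never consulted here.
        if self.datatype is None:
            return self.lexical
        return L2V_MAPPINGS[self.datatype](self.lexical)


fi = Literal("  moi ", "xsd:token", "fi")
en = Literal("  moi ", "xsd:token", "en")
print(fi.value() == en.value())  # compares the datatype values only
print(fi == en)                  # compares lexical form, datatype and language
```

In this sketch the two literals map to the same value under the
datatype, yet an application can still tell them apart by language tag.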

And though there is not presently any notable interest (that I can
tell) in datatyping XML literals, should we do so, the disjunct
relationship of complex (structured) typing and language attribution
becomes even clearer.

Patrick

Received on Monday, 23 September 2002 13:23:01 UTC