Re: Rethinking ISSUE-12 with lang datatypes from Ivan Herman on 2011-05-27 (public-rdf-wg@w3.org from May 2011)

From: Ivan Herman <ivan@w3.org>
Date: Fri, 27 May 2011 11:49:24 +0200
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
Message-Id: <E3CF319C-1582-46BE-9EFC-FAB8F91F541B@w3.org>

On May 27, 2011, at 11:23 , Andy Seaborne wrote:

> 
> 
> On 25/05/11 17:50, Antoine Zimmermann wrote:
>> All,
>> 
>> 
>> [disclaimer: I am not vehemently in favour of that proposal, just expressing my thoughts aloud.]
> 
> In the same spirit: just thinking aloud.

Ditto

> 
> One of the limitations of datatypes is that the lexical space is 1D: the set of sequences of characters. If we generalise datatypes for RDF to a "representation space" which can be multi-dimensional, we can formulate and relate language-tagged datatypes quite simply.
> 
> Restricting the representation space to the 1D space of strings, we get back to lexical space and compatibility with XSD etc.
> 
> rdf:String is a datatype where the rep space is
>    (unicode strings) union (unicode strings, validLangTags)
> The value space is <string> union <string,validLangTags>
> 
> rdf:LangTaggedString is a derived datatype of rdf:String, restricting the representation space to (unicode strings, validLangTags).
> 
> rdf:lang{langTag} is a derived datatype of rdf:LangTaggedString, restricting the representation space to (unicode strings, {langTag})

But, I believe, the alternative idea was slightly different. If we remove rdf:LangTaggedString from the equation altogether and keep only rdf:lang-{langtag} as a series of datatypes, then the representation space is simply Unicode strings plus a specific datatype. I.e., just like we have

"1"^^xsd:integer
"1"^^xsd:double

that are (afaik) different, because the two datatypes are disjoint, we would have

"a"^^rdf:lang-en
"a"^^xsd:string

as different literals.

"a" is a shortcut for "a"^^xsd:string
"a"@en is a shortcut for "a"^^rdf:lang-en

There is a question whether we would define rdf:lang-en as a subtype (subclass) of xsd:string; it seems safer not to do that.
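
To make the shortcut rule concrete, here is a minimal Python sketch, not a definitive model. The names Literal and expand are illustrative assumptions, and rdf:lang-en is only the datatype family floated in this thread, not something defined in any spec. It treats a literal as a (lexical form, datatype) pair and compares datatypes by name only, so no subtype relation to xsd:string is assumed.

```python
from typing import NamedTuple, Optional

XSD_STRING = "xsd:string"
LANG_PREFIX = "rdf:lang-"  # hypothetical datatype family discussed in this thread


class Literal(NamedTuple):
    lexical: str   # the Unicode string
    datatype: str  # datatype name, compared as an opaque string


def expand(lexical: str, lang: Optional[str] = None) -> Literal:
    """Expand the two shortcut forms into explicit typed literals."""
    if lang is None:
        # "a" is a shortcut for "a"^^xsd:string
        return Literal(lexical, XSD_STRING)
    # "a"@en is a shortcut for "a"^^rdf:lang-en
    return Literal(lexical, LANG_PREFIX + lang)


plain = expand("a")
tagged = expand("a", "en")
# Same lexical form, different datatype: the two literals are not equal.
print(plain != tagged)
```

Under this reading, "a" and "a"@en differ for the same reason "1"^^xsd:integer and "1"^^xsd:double do: the datatype is part of the literal's identity.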

SPARQL str() returns the Unicode string and drops the datatype for all combinations.
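
As a rough illustration of that str() behaviour (a sketch only: Literal and sparql_str are hypothetical names, and rdf:lang-en is the datatype proposed in this thread rather than an existing one), the function simply projects out the string and ignores whatever datatype is attached:

```python
from typing import NamedTuple


class Literal(NamedTuple):
    lexical: str   # the Unicode string
    datatype: str  # datatype name, e.g. "xsd:string" or the proposed "rdf:lang-en"


def sparql_str(lit: Literal) -> str:
    """Return only the Unicode string, dropping the datatype."""
    return lit.lexical


# The result is the same string whichever datatype the literal carries.
print(sparql_str(Literal("a", "xsd:string")) == sparql_str(Literal("a", "rdf:lang-en")))
```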

Ivan

> 
> "foo"@en is special syntax for ("foo", "en").
> (cf. 123 for "123"^^xsd:integer)
> 
> SPARQL str() is defined to return the first element of a tuple.
> 
> Then rdf:PlainLiteral is a datatype with a 1D lexical space, encoded using "@" as a separator.
> 
> (Does it say anywhere in RDF that derived datatypes must be subclasses?)
> 
> 	Andy
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Friday, 27 May 2011 09:47:15 UTC