Re: Proposal for ISSUE-12, string literals

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 17 May 2011 10:06:43 +0100
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <89190029-FC9C-42F2-B0A4-A0B2B1B3F611@garlik.com>
To: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
This idea seems to have some merit to me.

It strikes me as a little confused semantically - I'm not sure that integer / byte has a similar relationship to French / English - but, as a self-confessed "scruffy", I see that it's less gubbins to express the same information, which is a win.

So, I'm guessing as a formulation that rdflang:en would be a subtype of xsd:string, and rdflang:en-GB would be a subtype of rdflang:en, and so on?
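
To make that guess concrete, here is a minimal sketch of the hierarchy, assuming each datatype's supertype is found by dropping the last subtag and that the chain ends at xsd:string. The rdflang: prefix is only the placeholder used in this thread, and supertype_chain is a hypothetical helper, not part of any spec or library.

```python
def supertype_chain(tag: str) -> list[str]:
    """Return the (hypothetical) datatype for a language tag, followed by
    its assumed supertypes, most specific first."""
    subtags = tag.split("-")
    chain = []
    while subtags:
        # e.g. "en-GB" gives rdflang:en-GB, then rdflang:en
        chain.append("rdflang:" + "-".join(subtags))
        subtags.pop()
    # Assumption from the message: every language datatype sits under xsd:string.
    chain.append("xsd:string")
    return chain


print(supertype_chain("en-GB"))
# ['rdflang:en-GB', 'rdflang:en', 'xsd:string']
```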

A few practical considerations:

1) Language tags are not case sensitive, but IRIs are: "foo"@fr = "foo"@FR, whereas "foo"^^rdflang:fr != "foo"^^rdflang:FR. We'd need to define a canonical case for the datatype form.

2) Should systems prefer language tags or datatypes in external data? i.e. is "kludge"@en-GB the canonical form, or is it "kludge"^^rdflang:en-GB? This affects RDF serialisations and, e.g., SPARQL results. ^^ seems the most obvious choice in one sense, but it's more bytes, so less obvious in another.

3) What about rdf:PlainLiteral? Would this proposal make it obsolete?

4) Is the value space all Unicode strings? If not, is it a type error to write "מחשב"^^rdflang:en?
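
To illustrate point 1, a rough sketch of the case problem, assuming we simply pick all-lowercase as the canonical form. That is only one possible choice, and lang_equal and datatype_for are hypothetical helpers; rdflang: is again just the placeholder prefix from this thread.

```python
def lang_equal(a: str, b: str) -> bool:
    """Language tags compare case-insensitively."""
    return a.lower() == b.lower()


def datatype_for(tag: str, canonicalize: bool = True) -> str:
    """Build the (hypothetical) datatype name for a language tag.

    With canonicalize=True the tag is lowercased first, so tags that
    differ only in case map to the same datatype name.
    """
    return "rdflang:" + (tag.lower() if canonicalize else tag)


# As tags, fr and FR are the same language.
print(lang_equal("fr", "FR"))
# As raw datatype names they are different strings ...
print(datatype_for("fr", canonicalize=False) == datatype_for("FR", canonicalize=False))
# ... unless a canonical case is applied first.
print(datatype_for("en-GB") == datatype_for("EN-gb"))
```

Whatever case is chosen, the point is that it has to be fixed somewhere, otherwise literals that are equal under @ stop being equal under ^^.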

- Steve

On 2011-05-17, at 07:53, Pierre-Antoine Champin wrote:

> Hi all,
> 
> here's another idea:
> 
> why not consider language tags as special datatypes?
> In other words,
> 
>  "chat"@en
> 
> would be a shortcut for something like
> 
>  "chat"^^rdflang:en
> 
> (even if the above notation could be forbidden in serialization
> syntaxes, à la rdf:PlainLiteral)
> 
> this would
> * make everything much more regular
> * while matching the current behaviour (a literal could not possibly
> have a "language" datatype and another datatype)
> * and make it more natural (in my view) to unify language-less literals
> with xsd:string.
> 
> Also, it seems to me that upper layers (SPARQL, programming APIs) could
> continue working as they do (their current behaviour can easily be
> emulated on top of this new model) and smoothly evolve to align to the
> new model.
> 
>  pa
> 
> 
> On 05/14/2011 03:34 PM, Pat Hayes wrote:
>> 
>> On May 13, 2011, at 4:47 PM, Steve Harris wrote:
>> 
>>> On 2011-05-13, at 21:49, Pat Hayes wrote:
>>> ...
>>>> Advantages: Gives a type to plain literals; preserves rdf:PlainLiteral specs (extending them, but not contradicting them); allows people to use plain literals without getting involved with trailing @; and allows xsd:string to be deprecated in favor of plain literal syntax (or the reverse, of course.)
>>>> 
>>>> Disadvantages: might be thought too complicated; takes the notion of type slightly outside the current RDF datatype specs.  
>>>> 
>>>> Thoughts?
>>> 
>>> A lot of this complexity seems to stem from trying to make "foo" be an xsd:string. Instead, why not go with Plan A and make "foo"^^xsd:string a plain literal?
>> 
>> I prefer that also. But there are still some issues remaining with this step: (1) people want a 'type' for plain literals, and (2) plain literals can have language tags, which breaks current RDF datatyping. The proposal is more an attempt to deal with this while keeping faithful to existing RDF syntax and also the rdf:PlainLiteral work.
>> 
>> Pat
>> 
>>> 
>>> xsd:strings are significantly rarer than plain literals in real-world RDF data (in my experience), so it's less weird overall to de-type xsd:strings than to try to add a type to every plain literal.
>>> 
>>> It's not the prettiest solution but probably RDF shouldn't have had explicit xsd:strings in the first place.
>>> 
>>> - Steve
>>> 
>>> -- 
>>> Steve Harris, CTO, Garlik Limited
>>> 1-3 Halford Road, Richmond, TW10 6AW, UK
>>> +44 20 8439 8203  http://www.garlik.com/
>>> Registered in England and Wales 535 7233 VAT # 849 0517 11
>>> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>>> 
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Tuesday, 17 May 2011 09:07:25 GMT
