
Re: RDF-ISSUE-12 (String Literals): Reconcile various forms of string literals (time permitting) [Cleanup tasks]

From: Steve Harris <steve.harris@garlik.com>
Date: Sun, 6 Mar 2011 07:59:52 +0000
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <BB26DD38-3D65-4F8A-9F7E-98418127C5E8@garlik.com>
To: Pat Hayes <phayes@ihmc.us>

On 2011-03-06, at 06:11, Pat Hayes wrote:
> On Mar 5, 2011, at 5:26 PM, Steve Harris wrote:
>> On 2011-03-05, at 15:24, Pat Hayes wrote:
>>> On Mar 5, 2011, at 5:35 AM, RDF Working Group Issue Tracker wrote:
>>>> RDF-ISSUE-12 (String Literals): Reconcile various forms of string literals (time permitting) [Cleanup tasks]
>>>> http://www.w3.org/2011/rdf-wg/track/issues/12
>>>> Raised by: Ivan Herman
>>>> On product: Cleanup tasks
>>>> At the moment we have plain literals, rdf:plainLiteral, and xsd:string literals. They are very very close to one another but they are officially different. In practice this means that, eg, SPARQL queries have to have a three branch UNION to handle all of these. Worth looking at some sort of a reconciliation of these.
>>> +100 
>>> We really should clean up this mess. I suggest a draconian solution: deprecate all but xsd:string. Untyped literals were just a mistake, it seems clear from hindsight. rdf:plainLiteral was a brave attempt to clean up the mess, but it is a crock because it had to work within the existing specs. We have a chance to put all this right. 
>>> We can allow language tags on xsd:string literals and we can even allow the plain literal syntax to stay, but treat it as syntactic sugar for an xsd:string literal. And we can incorporate xsd:string datatyping into plain RDF entailment. All of this is inelegant at the theoretical level (but no more than having XMLLiteral in there) but supremely practical, since the entire world knows what xsd:string means and uses xsd typing.
>> I like the idea of merging plain literals and xsd:string in some way.
>> For reasons of brevity I'd like the plain literal syntax to be kept. Possibly plain literal syntax should even be the canonical form, as plain literals are found far more often than xsd:string-s in current RDF documents.
> This was the reasoning behind plain literals in the first place, and it is what got us into this mess. The brevity point is a textbook example of premature optimization. The problem is that for many purposes, it is MUCH better if all literals have a type. Plain literals don't have a type, which breaks many interfaces (OWL2, RIF among them.) So, rdf:plainLiteral was invented to be the type of these things that don't have a type, so that they would have a type. But they already are identical to xsd:string typed literals, which have a type but a different type. So now we have THREE ways of saying the same thing, with TWO different types, and strange rules about which is preferred and how some of them are treated by some engines as syntactic sugar for others but not by other engines, and some inference regimes sanction inferences from one to the other, and so on. If there was ever a clear case of a mess that needs cleaning up, this is surely one of them. Reverting to RDF 1.0 is not a solution.

I think maybe I wasn't making myself clear: I was only talking about the syntax. If a system sees the form on the left, it would be interpreted as the form on the right:

   "foo"^^xsd:string  ->  "foo"^^xsd:string
   "foo"  ->  "foo"^^xsd:string
   "foo"@de  ->  "foo"^^xsd:string @de

This would effectively deprecate plain literals, as there would no longer be any way to write one down. It would also change the results of some existing data / SPARQL combinations, so it's not without implications.
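
To make that mapping concrete, here is a rough Python sketch of the parse-time rewrite. It is only an illustration under simplifying assumptions: the `Literal` tuple, the `normalize` function and the regex are made up for this example (they are not from any spec or library), and the pattern ignores escapes and long-string syntax.

```python
import re
from typing import NamedTuple, Optional

XSD_STRING = "xsd:string"


class Literal(NamedTuple):
    """Hypothetical in-memory form: every literal carries a datatype."""
    lexical: str
    datatype: str
    lang: Optional[str] = None


# Simplified pattern: "lexical", optionally followed by ^^datatype or @lang.
# Assumption: no escaped quotes inside the lexical form.
_LITERAL_RE = re.compile(
    r'^"(?P<lex>[^"]*)"'
    r'(?:\^\^(?P<dt>\S+)|@(?P<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*))?$'
)


def normalize(token: str) -> Literal:
    """Read a literal token, treating the plain forms as sugar for xsd:string."""
    m = _LITERAL_RE.match(token.strip())
    if m is None:
        raise ValueError(f"not a literal in the simplified syntax: {token!r}")
    # No explicit datatype (plain or language-tagged) means xsd:string.
    datatype = m.group("dt") or XSD_STRING
    return Literal(m.group("lex"), datatype, m.group("lang"))


if __name__ == "__main__":
    for tok in ['"foo"^^xsd:string', '"foo"', '"foo"@de']:
        print(tok, "->", normalize(tok))
```

Under this sketch the first two forms normalise to the same value, while the language-tagged form keeps its tag alongside the xsd:string type.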

- Steve

Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Sunday, 6 March 2011 08:00:33 UTC
