Subject: Re: R2RML - SQL string representations

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 21 Nov 2011 21:43:53 +0000
To: David McNeil <dmcneil@revelytix.com>
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-Id: <122537E7-123D-4E05-8C9D-0A95CFB61434@cyganiak.de>

On 21 Nov 2011, at 19:48, David McNeil wrote:
> On Mon, Nov 21, 2011 at 12:03 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
>> On 21 Nov 2011, at 17:24, David McNeil wrote:
>>> The R2RML spec [1] has language in section 10.3 that prescribes the mapping of SQL string values to plain literals. This surprises me; I would expect SQL strings to map to XSD strings. My understanding is that the RDF group is moving away from plain literals. What do you think of changing this to map SQL strings to XSD strings?
>> 
>> The current behaviour should remain IMO. In the wild, plain literals are more common than xsd:string typed literals by an order of magnitude or so (I have the numbers somewhere). We should optimize for the common case.
> 
> I don't find this a compelling argument.

Well, then how about these:

- SPARQLing for plain literals is easier on users
- Plain literals save significant bandwidth
- Plain literals support i18n, xsd:string doesn't
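
As an illustration of the bandwidth and i18n points, here is a minimal Python sketch, assuming N-Triples serialization and pre-1.1 RDF. The helper `ntriples_literal` is hypothetical, written only for this example, and it skips string escaping. Serializations with prefix support (e.g. Turtle's `^^xsd:string`) would show a smaller overhead.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def ntriples_literal(lexical, datatype=None, lang=None):
    """Serialize a literal in N-Triples syntax (hypothetical helper, no escaping)."""
    if datatype and lang:
        # In pre-1.1 RDF a literal carries a language tag or a datatype, not both.
        raise ValueError("a literal cannot have both a language tag and a datatype")
    if lang:
        return f'"{lexical}"@{lang}'
    if datatype:
        return f'"{lexical}"^^<{datatype}>'
    return f'"{lexical}"'


plain = ntriples_literal("Berlin")                        # plain literal
typed = ntriples_literal("Berlin", datatype=XSD_STRING)   # xsd:string typed literal
tagged = ntriples_literal("Berlin", lang="de")            # plain literal with language tag

# Extra characters per literal caused by spelling out the datatype IRI.
overhead = len(typed) - len(plain)
print(plain, typed, tagged, overhead, sep="\n")
```

The language tag is only available on the plain form, which is the i18n point: a typed `xsd:string` literal has no slot for it.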

>> And it's already easy to generate xsd:strings simply by specifying the rr:datatype.
> 
> I would rather not require the user to add this to define the obvious, natural mapping of a column.

What's obvious and natural is in the eye of the beholder. I, for one, never understood why RDF supported two different redundant ways of encoding strings, and would have been very happy if xsd:string had been forbidden in RDF.

>> In RDF 1.1, there is no difference between plain literals and xsd:string typed literals. They are the same thing. So it's not a move away from plain literals, it's just abolishing a distinction. If R2RML were to target RDF 1.1, then it wouldn't matter if we said “generate a plain literal” or “generate an xsd:string typed literal” – it would result in the same graph. But as long as we target the old RDF, the distinction matters.
> 
> My understanding is that it is something more like: the direction is to use typed literals rather than plain literals.

No, I can say with some authority that this is not the case. The decision was to make them *the same* in RDF 1.1 so that the silly distinction doesn't matter any more. There is *no* implication in this decision that one of the forms should be preferred over the other in pre-RDF-1.1 systems where the distinction still exists.
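
The difference between the two RDF versions can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any library's API: the `Literal` tuple and `normalize_rdf11` are hypothetical names, and language-tagged literals are left untouched for simplicity.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Toy RDF literal: lexical form plus optional datatype IRI and language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def normalize_rdf11(lit: Literal) -> Literal:
    """View a literal the RDF 1.1 way: a literal with no datatype and no
    language tag is treated as an xsd:string literal."""
    if lit.datatype is None and lit.lang is None:
        return lit._replace(datatype=XSD_STRING)
    return lit


plain = Literal("Berlin")
typed = Literal("Berlin", datatype=XSD_STRING)

# Pre-1.1 RDF: the two are distinct terms, so the choice of form matters.
same_in_old_rdf = plain == typed

# RDF 1.1: both forms denote the same term once normalized.
same_in_rdf11 = normalize_rdf11(plain) == normalize_rdf11(typed)

print(same_in_old_rdf, same_in_rdf11)
```

Neither form is preferred by the normalization; it only removes the distinction, which is why it says nothing about what pre-1.1 systems should emit.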

> To be in keeping with this spirit R2RML would produce strings as xsd:string typed literals.

If we were targeting RDF 1.1, then this is what we should be doing; but we aren't, so decisions of the RDF WG do not affect us. (Unless we paint ourselves into a forward compatibility corner, but this isn't the case here.)

Best,
Richard

Received on Monday, 21 November 2011 21:44:26 UTC