Re: Plain vs. xsd:string literals in RDFa

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 23 Aug 2011 11:57:46 +0100
Cc: W3C RDFWA WG <public-rdfa-wg@w3.org>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <E03491C1-5BCF-46B1-ADAC-31923F3242BB@cyganiak.de>
To: Gregg Kellogg <gregg@kellogg-assoc.com>

Hi Gregg,

I can speak only for myself, so please consider this a personal response only. If you would like to bring this officially to the attention of the RDF WG, then please advise one of the RDF WG's chairs, who can raise this as an issue for the RDF WG and put it on the agenda.

On 22 Aug 2011, at 23:18, Gregg Kellogg wrote:
> Richard, on today's RDF Web Apps WG call we discussed the implications of the RDF WG's decision to collapse Plain Literals with xsd:string typed literals, and the implications for RDFa. Concerns were raised about backwards compatibility issues and the effect on the RDF(a) API when providing triples through a call-back, which is nominally a representation of the abstract syntax. This is expressed in ISSUE-101 [1]

First let me say this. The RDF 2004 abstract syntax and the RDF 1.1 abstract syntax are two different data models. They are obviously extremely similar, but they differ in details.

So, the question arises for all specifications that use or build on RDF 1.1: Are they compatible with RDF 2004 only? RDF 1.1 only? Both? Do they need adaptations or modifications to make them usable with the RDF 1.1 data model?

One design goal for us was that any *syntax* designed for RDF 2004 should be usable with RDF 1.1 without change. More on that below.

Note there's an Editor's Draft of RDF 1.1 Concepts:
http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-Graph-Literal

RDF 1.1 Concepts is in early draft state, and I think it would be inappropriate to treat it as anything else before it becomes a W3C Recommendation. In terms of process, I think it would be unwise to build any other spec *exclusively* on RDF 1.1 before that date.

> These are the specific issues we're considering:
> 	• Should RDFa Processors generate "foo"^^xsd:string from now on, or just the plain literal?

The following sentence from the RDF Concepts Editor's Draft is supposed to explain what's going on with old syntaxes that are not written with RDF 1.1 in mind:

[[
Concrete syntaxes may support *simple literals*, consisting of only a lexical form without any language tag or datatype IRI. Simple literals only exist in concrete syntaxes, and are treated as syntactic sugar for abstract syntax typed literals with the datatype IRI http://www.w3.org/2001/XMLSchema#string. Simple literals and language-tagged literals are collectively known as *plain literals*.
]]

I believe this explains everything one needs to know in order to use RDFa as currently specified with the RDF 1.1 data model.

If the RDFWA WG prefers to add some language that explains the RDF 1.1 changes directly, rather than relying on the mechanism of the paragraph above, then you could add something in this spirit, and personally I would encourage you to do it, at least informatively:

[[
RDFa is compatible with RDF 2004 and RDF 1.1. RDF 1.1 states that a plain literal without language tag is syntactic sugar for an xsd:string typed literal. Thus, an RDF 1.1 compatible RDFa processor MUST generate an xsd:string typed literal wherever the specification calls for the generation of a plain literal without language tag.
]]

(This “MUST” doesn't really change anything, because in RDF 1.1, a plain literal without language tag and an xsd:string typed literal are considered the same thing anyways.)
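
To make the mapping concrete, here is a minimal Python sketch. It assumes a made-up `Literal` record and a made-up `to_rdf11` helper, neither of which is taken from any RDFa library or from the RDF 1.1 draft. It only illustrates how a plain literal without language tag and an xsd:string typed literal collapse into the same thing:

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal record: lexical form plus optional datatype/language."""
    lexical: str
    datatype: Optional[str] = None
    language: Optional[str] = None


def to_rdf11(lit: Literal) -> Literal:
    """Hypothetical helper: view an RDF 2004 style literal under RDF 1.1."""
    # A literal with neither datatype nor language tag is treated as
    # syntactic sugar for an xsd:string typed literal.
    if lit.datatype is None and lit.language is None:
        return Literal(lit.lexical, XSD_STRING)
    # Language-tagged and already-typed literals are left unchanged.
    return lit


plain = Literal("foo")
typed = Literal("foo", XSD_STRING)
tagged = Literal("foo", language="en")

assert plain != typed                      # distinct when compared as RDF 2004 literals
assert to_rdf11(plain) == to_rdf11(typed)  # the same literal under RDF 1.1
assert to_rdf11(tagged) == tagged          # language-tagged literals are unaffected
```

Code that compares literals only after such a normalization step would not notice whether a processor emitted the plain or the typed form.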

I don't know if there is such a thing as an RDFa serializer, but if that is to be covered in the spec, then it could say:

[[
When serializing an RDF 1.1 graph as RDFa, a serializer MAY use either the @datatype form or the plain literal form when serializing xsd:string literals.
]]

The MAY could be a SHOULD for the plain form. It's really up to the concrete syntax to decide if either form is ok, or one form should be preferred, or only one form should be allowed. That really depends on the environment that the concrete syntax is designed to work in.

> 	• Should the RDFa API callback produce a "foo" property with an "xsd:string" datatype, or just a "foo" plain literal property? Note that this is not a serialization, so the rules about not using xsd:string markup might not apply

Personally I think that RDF APIs are very similar to RDF serialization formats, because both provide concrete representations of the abstract RDF syntax.

So this is for the RDFWA WG to decide. The question seems to be if an xsd:string typed literal in the abstract syntax ought to be “serialized” as xsd:string typed, or with “syntactic sugar” that turns it into a plain native string.

I will just mention how this plays out in Turtle:

123 in Turtle has always been syntactic sugar for "123"^^xsd:integer. 123.45 has always been syntactic sugar for "123.45"^^xsd:decimal. true has always been syntactic sugar for "true"^^xsd:boolean. In the same way, "foo" will be syntactic sugar for "foo"^^xsd:string.

So, if xsd:integer numbers come out of the RDFa API as native numbers without an RDF datatype, then xsd:strings should maybe come out as native strings without an xsd:string datatype too.
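
As a rough illustration of that analogy, here is a Python sketch of how an API might hand typed literals to a callback as native values. The `to_native` function is purely hypothetical and is not based on the RDFa API drafts; the set of handled datatypes is my own choice for the example:

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"


def to_native(lexical, datatype):
    """Hypothetical mapping from a typed literal to a native Python value."""
    if datatype == XSD + "string":
        return lexical              # native string, no datatype attached
    if datatype == XSD + "integer":
        return int(lexical)         # native number, no datatype attached
    if datatype == XSD + "decimal":
        return Decimal(lexical)
    if datatype == XSD + "boolean":
        return lexical in ("true", "1")
    # Anything else stays an explicit (lexical form, datatype IRI) pair.
    return (lexical, datatype)


assert to_native("123", XSD + "integer") == 123
assert to_native("foo", XSD + "string") == "foo"
```

Under this design a consumer never sees the xsd:string datatype at all, just as it never sees xsd:integer on a native number.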

(Caveat: I have not followed the RDF API work at all. I'm working under the assumption that it's something like the Jena API but for Javascript.)

> 	• If we must produce an "xsd:string", how should we handle the case where something like Raptor is using a library that uses the new "foo"^^xsd:string form, with a library that thinks there is a difference between xsd:string and a plain literal?
> Case in point, Raptor uses librdfa and could output to RDF/XML. Something that used to be output as a plain literal will now be output as an xsd:string, which will inevitably break code. Is this fine?
> From my understanding, we could choose either to represent both plain literals and xsd:string typed literals using either an un-datatyped literal, or a typed literal [2].
> 
> One of the concerns is the behavior of existing libraries, which currently do differentiate between these different forms.

I have trouble making sense of the paragraphs above; I find it hard to follow what you mean by “producing” and “outputting”. Could you perhaps re-state this with an example?

> Also, for other languages (e.g. RDF/XML) which are not likely to be updated with this requirement. This could lead to inconsistent behavior between languages and implementations.

Again, the paragraph quoted above from RDF Concepts should explain how RDF/XML works when used with the RDF 1.1 data model.

If you or the RDFWA WG feel that RDF/XML ought to be updated, rather than relying on the generic paragraph quoted above, then this would be useful feedback to the RDF WG.

> Changing RDFa to generate xsd:string typed literals would be a backwards-incompatible change,

Building RDFa exclusively on RDF 1.1 instead of RDF 2004 would be a backwards-incompatible change. Ideally it would support both once RDF 1.1 is finished.

> and backends or APIs which were previously expecting a literal without a datatype would now be faced with those literals having a datatype.

Yes -- that's the major incompatibility when switching from RDF 2004 to 1.1.

Hope that helps. Again, this is just a personal response.

Best,
Richard



> The group would like some guidance on this issue, and wondered if you or the RDF WG might address some of these issues and provide some guidance.
> 
> Gregg
> 
> [1] http://www.w3.org/2010/02/rdfa/track/issues/101
> [2] http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/AbolishUntaggedPlain
Received on Tuesday, 23 August 2011 10:58:11 GMT
