Re: subtypes of xsd:string

From: David Wood <david.wood@talis.com>
Date: Wed, 25 May 2011 20:25:47 -0400
Cc: Richard Cyganiak <richard@cyganiak.de>, Jeremy Carroll <jeremy@topquadrant.com>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <2F7328C3-7C85-4A62-8D37-499DE67B5F77@talis.com>
To: Ivan Herman <ivan@w3.org>
Hi all,

I'm still noodling on the language tags on literals.  Earlier today I proposed on IRC that language-tagged strings could be a subtype of xsd:string, but we didn't get a chance to address it.

It seems to me that RDF literals, literals with language tags and xsd:strings have always been messy.  If all three became xsd:strings of one form or another, that would clarify the situation nicely.  The problem would appear to be the amount of work involved in making such a deep change.

The proposal would be to:

- remove plain literals from the abstract syntax, as Richard suggested; all plain literals would parse as xsd:strings.

  "foo" -> "foo"^^xsd:string

- xsd:strings themselves would remain untouched.

- add a subtype of xsd:string for language-tagged strings:

  "foo"@en -> "foo"^^xsd:LanguageTaggedString@en or some such.
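To make the three rules concrete, here is a rough Python sketch of the normalization a parser might perform. It is only an illustration under assumptions: the Literal class and parse_literal function are made up for this message (this is not RDFLib's API), and the IRI for the language-tagged datatype is a placeholder, since the name itself is still "or some such".

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
# Placeholder IRI (assumption): the proposal has not settled on a name.
LANG_TAGGED_STRING = "http://www.w3.org/2001/XMLSchema#LanguageTaggedString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical abstract-syntax literal: every literal carries a datatype."""
    lexical: str
    datatype: str
    lang: Optional[str] = None


def parse_literal(lexical, datatype=None, lang=None):
    """Map the concrete forms onto the proposed abstract syntax."""
    if lang is not None:
        # "foo"@en -> language-tagged subtype of xsd:string
        if datatype not in (None, LANG_TAGGED_STRING):
            raise ValueError("a language tag cannot be combined with another datatype")
        return Literal(lexical, LANG_TAGGED_STRING, lang)
    if datatype is None:
        # "foo" -> "foo"^^xsd:string (no plain literals in the abstract syntax)
        return Literal(lexical, XSD_STRING)
    # Explicitly typed literals, including xsd:string, are left untouched.
    return Literal(lexical, datatype)
```

With this, parse_literal("foo") and parse_literal("foo", datatype=XSD_STRING) produce equal values, which is the point of removing plain literals from the abstract syntax.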



On May 25, 2011, at 05:50, Ivan Herman wrote:

> There will be some effects on implementations; not serious but non-zero. I, sort of, ran through in my head what RDFLib should do (being a Python user):
> 1. Literal objects, when initialized with a language: the datatype attribute should be set to rdf:Text (or whatever that will be) (currently set to None)
> 2. Literal objects, when initialized without a language: the datatype attribute should be set to xsd:string (currently set to None)
> 3. Some checks have to be changed (the current implementation sets the lang attribute to none if a datatype is provided, but that should not happen if the object initialization gets rdf:Text)
> 4. The comparison operator should be modified
> 5. All serializers should modify their behaviour: at the moment, when they see a datatype, they simply produce "..."^^datatype. This would still be fine with xsd:string if that is allowed but, according to the proposal, this should _not_ happen with rdf:Text. I.e., all serializers must have an extra branch to handle this case.
> Nothing hugely complicated, but does require some work. I foresee #5 to be the most complicated in practice, due to the plugin architecture RDFLib has (meaning that applications may have their own serializers). I guess all other RDF environments will have to perform roughly the same steps.
> I.
> On May 25, 2011, at 11:26 , Richard Cyganiak wrote:
>> On 25 May 2011, at 00:46, Jeremy Carroll wrote:
>>> it may be odd if we effectively deprecate xsd:string for RDF (surface syntax ...) but still have subtypes lying around ...
>> The proposal does not call for the deprecation (effective or not) of xsd:string.
>> The proposal is to remove plain literals from the abstract syntax.
>> In concrete syntaxes, both "foo" and "foo"^^xsd:string forms would still be allowed. The former would be syntactic sugar, just like 1 and "1"^^xsd:integer in Turtle.
>> There should be some language to the effect that "foo" is preferred, simply for ergonomic reasons. I phrased this as a SHOULD in the proposal. Weaker language might be sufficient in the general case. Or maybe expressing this preference is altogether unnecessary.
>> Some syntaxes have use cases that are hampered by the variability introduced by syntactic sugar. N-Triples and SPARQL Results XML/JSON, mostly. I think these syntaxes should make a stronger statement in their respective syntax spec. Perhaps forbid one of the forms when serializing. Which one doesn't really matter.
>> Best,
>> Richard
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
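As a rough illustration of the extra serializer branch Ivan describes in item 5, combined with Richard's point that "foo" and "foo"^^xsd:string are both legal in concrete syntaxes, something like the following Python sketch might do. The function name, the prefer_sugar flag and the datatype IRI for language-tagged strings are all assumptions made for this example; it does not reflect RDFLib's actual serializer plugin interface, and the escaping is deliberately minimal.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
# Placeholder IRI (assumption) standing in for rdf:Text or whatever it becomes.
LANG_TAGGED_STRING = "http://www.w3.org/2001/XMLSchema#LanguageTaggedString"


def _quote(lexical):
    # Minimal escaping for the sketch: backslashes and double quotes only.
    return '"' + lexical.replace("\\", "\\\\").replace('"', '\\"') + '"'


def serialize_literal(lexical, datatype, lang=None, prefer_sugar=True):
    """Hypothetical Turtle-like literal serializer with the extra branch."""
    if datatype == LANG_TAGGED_STRING:
        # Extra branch: never emit "..."^^datatype for language-tagged strings.
        if lang is None:
            raise ValueError("language-tagged string without a language tag")
        return _quote(lexical) + "@" + lang
    if datatype == XSD_STRING and prefer_sugar:
        # Syntactic sugar: "foo" stands for "foo"^^xsd:string.
        return _quote(lexical)
    # Every other datatype keeps the explicit typed form.
    return _quote(lexical) + "^^<" + datatype + ">"
```

A syntax that wants a single canonical form, such as N-Triples, could fix prefer_sugar to one value when serializing; which value it picks does not matter much, as long as it is consistent.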
Received on Thursday, 26 May 2011 00:26:20 UTC
