Re: Proposal for ISSUE-12, string literals from Alex Hall on 2011-05-13 (public-rdf-wg@w3.org from May 2011)

From: Alex Hall <alexhall@revelytix.com>
Date: Fri, 13 May 2011 14:20:57 -0400
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <BANLkTikgS+_g8-dPhrMqrzyOxHZOB1EZow@mail.gmail.com>

On Fri, May 13, 2011 at 1:42 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> On 13 May 2011, at 16:52, Alex Hall wrote:
> >> I think the sensible way would be:
> >> 1) every literal has *both* a datatype and a (possibly empty) language tag;
> >> 2) of the built-in datatypes, only xsd:string can have non-empty language tags;
> >> 3) plain literals and rdf:PlainLiterals don't exist;
> >> 4) "foo" in concrete syntaxes is syntactic sugar for "foo"^^xsd:string.
> >> 5) "foo"@en in concrete syntaxes is syntactic sugar for "foo"^^xsd:string@en.
> >>
> ...
> > The main roadblock that I can see is that a datatype maps a single
> > lexical string to a value; you'd have to define a special notion of
> > datatyping for xsd:string which is essentially an identity mapping of
> > <lexical, lang> pairs.  Otherwise you'd have "chat"^^xsd:string@en and
> > "chat"^^xsd:string@fr with the same value, which won't fly.
>
> Yes, that's right, RDF Semantics would have to be adapted to ensure that
> "foo"@en and "foo"@fr (which are now syntactic sugar for
> "foo"^^xsd:string@en and "foo"^^xsd:string@fr) are still different. But I
> think that's doable:
>
> Let's write "xxx"^^yyy for a typed literal with *empty* language tag. Its
> interpretation is L2V("xxx"), where L2V is the lexical-to-value mapping of
> datatype yyy.
>
> Let's write "xxx"^^yyy@zzz for a typed literal with *non-empty* language
> tag. Its interpretation is <L2V("xxx"), zzz>.
>
> How exactly to distribute that logic between Simple Entailment and
> D-Entailment requires some thought. You can't remove plain literals from RDF
> without changing a couple lines of RDF Semantics ...
>

Looks reasonable enough to me.  I would actually prefer this approach to the
one I suggested previously if it can be made to work without too many
painful contortions.  My suggestion of either datatype or lang-tag was an
attempt to change the abstract syntax in a way that would require no (or at
least minimal) changes to RDF Semantics.
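
To make the quoted interpretation rule concrete, here is a minimal sketch in
Java. Everything in it is an assumption for illustration only: the class name
LiteralInterpretation, the Tagged pair, and the identity L2V for xsd:string
are hypothetical and do not come from RDF Semantics or from any existing
library. It only shows that pairing the value with the tag keeps
"chat"^^xsd:string@en and "chat"^^xsd:string@fr distinct.

```java
import java.util.function.Function;

public class LiteralInterpretation {
    // Hypothetical value of a language-tagged literal: the pair <L2V(lexical), tag>.
    public record Tagged(Object value, String lang) {}

    // Assumption: the lexical-to-value mapping of xsd:string is the identity.
    public static final Function<String, Object> XSD_STRING_L2V = s -> s;

    // Empty tag: the interpretation is L2V(lexical).
    // Non-empty tag: the interpretation is <L2V(lexical), tag>.
    public static Object interpret(String lexical, Function<String, Object> l2v, String lang) {
        Object value = l2v.apply(lexical);
        return (lang == null || lang.isEmpty()) ? value : new Tagged(value, lang);
    }

    public static void main(String[] args) {
        Object en = interpret("chat", XSD_STRING_L2V, "en");
        Object fr = interpret("chat", XSD_STRING_L2V, "fr");
        Object untagged = interpret("chat", XSD_STRING_L2V, "");
        System.out.println(en.equals(fr));            // prints false
        System.out.println(untagged.equals("chat"));  // prints true
    }
}
```

How this logic would actually be split between Simple Entailment and
D-Entailment is not addressed by the sketch.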


>
> This entire proposal breaks backwards compatibility in two ways:
>
> 1. The following Turtle file would now contain only one triple instead of
> two:
>
>   <a> <b> "foo", "foo"^^xsd:string .
>
> This obviously has some serious knock-on effects, for example SPARQL stores
> that have already loaded this file now need to drop a triple, which changes
> the results of many queries.
>
> 2. In SPARQL, datatype("foo"@en) would now report xsd:string instead of ø.
> That seems like a good thing to me (it's explainable by saying that the
> language tag is “attached” to the “outside” of the typed literal). I believe
> this is *fairly* unlikely to cause interoperability issues with existing
> queries.
>

3. There are probably countless instances of the following logic deployed in
existing systems (I cut-and-pasted this directly from my own code):

public Literal(String lexicalValue, String language, URI datatype) {
  ...
  // Current abstract syntax: a literal has a language tag or a datatype, never both.
  if (language != null && datatype != null) {
    throw new IllegalArgumentException(
        "An RDF literal may not have both a language and a datatype.");
  }
  ...
}

I'm willing to change my code, but I can't speak for everybody...
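
For illustration, a rough sketch of what such a constructor might look like
under the proposal. This is not my actual code: the class name
ProposedLiteral, the use of java.net.URI, the defaulting of a missing
datatype to xsd:string, and the lowercasing of language tags are all
assumptions, and rejecting a language tag on every non-xsd:string datatype
is a simplification of point 2 (which only speaks about built-in datatypes).

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;

public final class ProposedLiteral {
    public static final URI XSD_STRING =
        URI.create("http://www.w3.org/2001/XMLSchema#string");

    private final String lexicalValue;
    private final String language;  // "" means no language tag
    private final URI datatype;     // never null under the proposal

    public ProposedLiteral(String lexicalValue, String language, URI datatype) {
        this.lexicalValue = Objects.requireNonNull(lexicalValue, "lexicalValue");
        // Assumption: a missing tag is the empty tag; tags are compared in lowercase.
        this.language = (language == null) ? "" : language.toLowerCase(Locale.ROOT);
        // Assumption: "foo" and "foo"@en are sugar for xsd:string literals.
        this.datatype = (datatype == null) ? XSD_STRING : datatype;
        // Simplification: only xsd:string may carry a non-empty language tag.
        if (!this.language.isEmpty() && !XSD_STRING.equals(this.datatype)) {
            throw new IllegalArgumentException(
                "Only xsd:string literals may have a non-empty language tag.");
        }
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ProposedLiteral)) return false;
        ProposedLiteral other = (ProposedLiteral) o;
        return lexicalValue.equals(other.lexicalValue)
            && language.equals(other.language)
            && datatype.equals(other.datatype);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lexicalValue, language, datatype);
    }

    public static void main(String[] args) {
        // The objects of: <a> <b> "foo", "foo"^^xsd:string .
        Set<ProposedLiteral> objects = new HashSet<>();
        objects.add(new ProposedLiteral("foo", null, null));
        objects.add(new ProposedLiteral("foo", null, XSD_STRING));
        System.out.println(objects.size());  // prints 1
    }
}
```

The main method also shows Richard's first compatibility point: the two
objects in his Turtle example become the same literal, so only one survives
in a set.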

-Alex



>
> Best,
> Richard

Received on Friday, 13 May 2011 18:21:25 UTC