Re: Simplified proposal for string literals

* Steve Harris <steve.harris@garlik.com> [2011-05-18 09:32+0100]
> On 2011-05-18, at 02:18, Eric Prud'hommeaux wrote:
> 
> > * Pat Hayes <phayes@ihmc.us> [2011-05-17 17:57-0500]
> >> As my proposed extension to rdf:PlainLiteral seems to have fallen on deaf ears, allow me to suggest a simplified version of it which might be more acceptable. There are two versions. In the first, plain literals are no longer strings, so the current equivalence between "string" and "string"^^xsd:string no longer applies. The second keeps this equivalence. 
> > 
> > +1 to Version B though I don't see any reason not to say "RDF *MUST*
> > NOT use xsd:string as a datatype in typed literals."
> 
> RDF concrete syntax, or abstract? 
> 
> Version B still seems to have some of the problems of just forcing all untyped literals to be xsd:string: you still have some issues with internal representation in systems and, e.g., with the SPARQL Results format.

Any system which can currently express "abc" and "abc"^^xsd:string will be able to express "abc". SPARQL systems currently return a datatype of xsd:string for "abc". The pain point is that existing data in databases will have to change any instances of "abc"^^xsd:string and existing parsers will have to parse "abc"^^xsd:string to produce "abc", but those are exactly the steps we need to take in order to solve this problem.
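
To make the parser-side step concrete, here is a minimal sketch of that rewrite in Python. The Literal tuple and the normalize function are hypothetical names invented for illustration, not part of any RDF library, and the sketch assumes a literal is held as a lexical form plus an optional datatype IRI and an optional language tag.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Hypothetical in-memory literal: lexical form, datatype IRI, language tag."""
    lexical: str
    datatype: Optional[str] = None  # None means no explicit datatype (plain literal)
    lang: Optional[str] = None      # None means no language tag


def normalize(lit: Literal) -> Literal:
    """Rewrite "abc"^^xsd:string as the plain literal "abc".

    Every other literal is returned unchanged.
    """
    if lit.datatype == XSD_STRING and lit.lang is None:
        return Literal(lit.lexical)
    return lit
```

A store would presumably apply the same function once over existing data and then on every load, so that only the plain form is ever kept.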

The only effect of MAY instead of MUST is to reduce the number of places the problem gets solved. If we are not ambitious enough to tell people to make this change (and aesthetics make "abc"^^xsd:string pretty rare in the wild), then we can leave the semantics alone and just turn down the flow of potential xsd:strings, i.e. "Creators of RDF data should always use plain literals instead of literals of type xsd:string."


> It's also quite a complex representation, for a relatively simple problem.
> 
> - Steve
> 
> >> Version A
> >> 
> >> 1.  rdf:PlainLiteral is a unique special datatype, built into basic RDF (along with rdf:XMLLiteral) with a special, unique formulation. It applies to plain literal syntax, which is thought of as specifying a pair of a string and a language tag. If no language tag is present, then the language tag of the literal is 'NULL'. The L2V mapping of this datatype takes the pair <string, tag> to itself, i.e. it is the identity mapping on these pairs. 
> >> Put another way, the datatype value of "string" is <string, NULL> and of "string"@tag is <string, tag>. 
> >> Every plain literal in RDF has the datatype rdf:PlainLiteral, even though this name is not used explicitly in the literal syntax. 
> >> 
> >> 2. rdf:PlainLiteral MUST NOT be used as an explicit datatype name in any RDF literal of the form "string"^^datatype. Literals of the form "string@tag"^^rdf:PlainLiteral MUST be rewritten as a plain literal "string"@tag or flagged as an error.
> >> 
> >> 3. "string" is no longer sameAs "string"^^xsd:string (the first has a NULL language tag, the second has no tag at all.) 
> >> 
> >> Version B
> >> 
> >> 1.  rdf:PlainLiteral is a unique special datatype, built into basic RDF (along with rdf:XMLLiteral) with a special, unique formulation. It applies to plain literal syntax, which is thought of as specifying either a character string, or a pair of a string and a language tag.  The L2V mapping of this datatype takes both strings and pairs <string, tag> to themselves, i.e. it is the identity mapping on strings and on pairs. 
> >> Put another way, the datatype value of "string" is  string  and of "string"@tag is <string, tag>. 
> >> Every plain literal in RDF has the datatype rdf:PlainLiteral, even though this name is not used explicitly in the literal syntax. 
> >> 
> >> 2. rdf:PlainLiteral MUST NOT be used as an explicit datatype name in any RDF literal of the form "string"^^datatype. Literals of the form "string@tag"^^rdf:PlainLiteral MUST be rewritten as a plain literal "string"@tag or flagged as an error.
> >> 
> >> 3. "string" and "string"^^xsd:string are equivalent, so to avoid equality reasoning, the datatype xsd:string is deprecated in RDF. RDF SHOULD NOT use xsd:string as a datatype in typed literals, and applications MAY rewrite any literal typed with xsd:string as a plain literal with no language tag. 
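
For comparison, the two L2V mappings described in Versions A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the NULL_TAG constant are made up here, Version A's 'NULL' tag is modelled as a sentinel string, and the xsd:string value of a lexical form is taken to be the string itself.

```python
from typing import Optional, Tuple, Union

NULL_TAG = "NULL"  # hypothetical stand-in for Version A's 'NULL' language tag


def l2v_version_a(string: str, tag: Optional[str] = None) -> Tuple[str, str]:
    """Version A: every plain literal denotes a <string, tag> pair."""
    return (string, NULL_TAG if tag is None else tag)


def l2v_version_b(string: str, tag: Optional[str] = None) -> Union[str, Tuple[str, str]]:
    """Version B: untagged literals denote the string, tagged ones a pair."""
    return string if tag is None else (string, tag)


def l2v_xsd_string(string: str) -> str:
    """Assumed xsd:string mapping: the value is the character string itself."""
    return string
```

Under these assumptions, l2v_version_b("abc") and l2v_xsd_string("abc") give the same value while l2v_version_a("abc") does not, which is the difference between point 3 of each version.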
> >> 
> >> --------
> >> 
> >> Either way, this keeps existing plain literal syntax exactly as it is at present, does not require anyone to rewrite any up-front code, and retains the rdf:PlainLiteral typing without getting involved with the trailing-@ messiness. It requires one exception in the RDF semantics to allow this slightly nonstandard datatype, but I don't think this is of any importance at all, especially as the L2V mapping is so trivial. It will require short changes to Concepts and Semantics, and a quick check over Testcases, but we will be doing this anyway. 
> >> 
> >> FWIW, I marginally prefer Version B, as it settles the xsd:string business once and for all. But only marginally.
> > 
> > I note that Version B means we don't conflict with SPARQL Query, which
> > says that datatype("abc") == xsd:string
> > [[
> > Returns the datatype IRI of typedLit; returns xsd:string if the
> > parameter is a simple literal.
> > ]] — http://www.w3.org/TR/sparql11-query/#func-datatype
> > and eliminates an obstructive arbitrary choice presented to data designers.
> > 
> > 
> >> Pat
> >> 
> >> 
> >> 
> >> ------------------------------------------------------------
> >> IHMC                                     (850)434 8903 or (650)494 3973   
> >> 40 South Alcaniz St.           (850)202 4416   office
> >> Pensacola                            (850)202 4440   fax
> >> FL 32502                              (850)291 0667   mobile
> >> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> > 
> > -- 
> > -ericP
> > 
> 
> -- 
> Steve Harris, CTO, Garlik Limited
> 1-3 Halford Road, Richmond, TW10 6AW, UK
> +44 20 8439 8203  http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
> 

-- 
-ericP

Received on Wednesday, 18 May 2011 13:53:05 UTC