Re: Proposal for ISSUE-12, string literals

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 12 May 2011 11:19:27 +0100
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <27DEFC50-57DB-4F54-9A47-E44459FA1827@cyganiak.de>
To: Ivan Herman <ivan@w3.org>
On 12 May 2011, at 09:22, Ivan Herman wrote:
> - You make the remark on the wiki page on 'extending this to numeric literals', which I would rather say 'extending this to any datatype' (eg, xsd:dateTime, too).

Right -- I changed the section heading on the wiki.

> I have the impression that this is also a consequence of what you write already. You emphasize the 'lexical equality', and you also say "Implementations MAY replace any literal with a canonical form if both are syntactically different, but have the same value." which does not appear to be restricted to string literals.

The way I wrote it, the only literals marked as canonical forms are plain string literals. So the sentence doesn't license replacement of, say, +00013 with 13, because no numeric literals have been marked as canonical forms. That could be easily changed, of course.
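
To make the scope of that sentence concrete, here is a minimal Python sketch of the optional canonicalization step as currently worded. The Literal tuple and the canonicalize function are illustrative names only, not part of the proposal, and literals are compared as plain tuples, which is a simplification.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


class Literal(NamedTuple):
    """Hypothetical in-memory form of an RDF literal (an assumption of this sketch)."""
    lexical: str
    lang: Optional[str] = None      # language tag; plain literals only
    datatype: Optional[str] = None  # datatype IRI; typed literals only


def canonicalize(lit: Literal) -> Literal:
    """Return the same-valued plain literal if one exists, else the literal unchanged."""
    if lit.datatype == XSD_STRING:
        # "aaa"^^xsd:string has the same value as the plain literal "aaa"
        return Literal(lit.lexical)
    if lit.datatype == RDF_PLAIN_LITERAL:
        # The lexical form is "aaa@" or "aaa@xx"; the last "@" is the separator
        text, sep, tag = lit.lexical.rpartition("@")
        if sep:
            return Literal(text, tag or None)
    # No canonical form has been declared for anything else (e.g. numerics)
    return lit
```

With this, "aaa"^^xsd:string and "aaa@"^^rdf:PlainLiteral both become the plain literal "aaa", while "+00013"^^xsd:integer is returned unchanged because no numeric literal has been marked as canonical.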

> Do you think there is anything missing in this document to make that picture complete (except, editorially, to possibly add non-string examples)?

If we only want to address string literals, then I think the proposal is complete.

If we want to address other XSD literals as well, then some bullet points should be added to the list of equalities, and the canonical lexical form of some XSD datatypes (e.g., "13.0"^^xsd:decimal) should be defined to be canonical forms so that other same-valued literals can be replaced with the canonical form. This requires a detailed reading of the XSD spec (which I have not done so far).
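
As a rough illustration of what such an extension might involve for a single datatype, the sketch below handles only xsd:integer, whose canonical lexical form has no leading zeros and no "+" sign. The function names are hypothetical, and the other datatypes would still need the careful reading of the XSD spec mentioned above.

```python
import re

# Lexical space of xsd:integer: an optional sign followed by ASCII digits
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def integer_value(lexical: str) -> int:
    """Map an xsd:integer lexical form to its value; reject ill-formed input."""
    if not _INTEGER_LEXICAL.fullmatch(lexical):
        raise ValueError(f"not an xsd:integer lexical form: {lexical!r}")
    return int(lexical)


def canonical_integer(lexical: str) -> str:
    """Return the canonical lexical form that denotes the same integer value."""
    return str(integer_value(lexical))
```

Under this sketch, "+00013" and "13" are syntactically different but value-equal, and canonical_integer maps both to "13".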

(RDF Concepts should probably contain a paragraph or two introducing the rdf:PlainLiteral datatype and referencing the relevant spec, but let's treat that as a separate issue.)

> - I would also propose to make some tiny changes in the Semantics document.

I'll let the editors of that document comment.

Best,
Richard


> 
> Ivan
> 
> 
> On May 11, 2011, at 23:23 , Richard Cyganiak wrote:
> 
>> I took an action today to draft text for RDF Concepts that resolves ISSUE-12. I put it on the wiki here:
>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>> A plain text copy is attached below.
>> 
>> Best,
>> Richard
>> 
>> 
>> 
>> SHORT SUMMARY
>> 
>> 1. RDF Concepts puts more emphasis on the distinction between (syntactic) “literal equality” and (semantic, important for applications) “value equality”
>> 2. RDF Concepts explicitly points out the specific string value equalities that already arise from RDF Semantics
>> 3. RDF Concepts declares one of the string literal forms as canonical
>> 4. Implementations MAY canonicalize, but don't have to
>> 5. The canonical form is plain literals.
>> 
>> 
>> WHY?
>> 
>> 1. No changes to the abstract syntax required
>> 2. No changes to any concrete syntax or parser required
>> 3. No changes to any implementations of any of the existing entailment regimes required
>> 4. Those who are ok with canonicalization can do that, and don't need to deal with entailment
>> 5. Those who don't want to canonicalize have the option of supporting only string value equality at query time, without RDFS- and D-Entailment
>> 6. “MAY canonicalize” softly discourages the use of xsd:string typed literals, without abolishing them outright or declaring them archaic
>> 7. Standardizing on xsd:string was never an option because of language tags
>> 8. Standardizing on rdf:PlainLiteral was never an option because it MUST NOT be used in serializations that support plain literals
>> 
>> 
>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>> 
>> 
>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and move it ahead of the current 6.5.1 (Literal Equality)
>> 
>> §2 Add to the beginning:
>> “The value of a plain literal without language tag is the same Unicode string as its lexical form.
>> 
>> The value of a plain literal with language tag is a pair consisting of 1. the same Unicode string as its lexical form, and 2. its language tag.
>> 
>> For typed literals, …” (continue with rest of section as is)
>> 
>> §3 Remove the Note at the end of the section
>> 
>> 
>> CHANGES TO 6.5.1 Literal Equality
>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>> 
>> 
>> §4 Rename section to “6.5.2 Literal Equality and Canonical Forms”
>> 
>> §5 Add to the beginning:
>> “Equality of literals can be evaluated based on their syntax, or based on their value.”
>> 
>> §6 Change “Two literals are equal …” to: “Two literals are syntactically equal …” in the current first paragraph.
>> 
>> §7 Add to the end:
>> “In application contexts, comparing the values of literals (see section 6.5.1) is usually more helpful than comparing their syntactic forms. Literals with different lexical forms and with different datatypes can have the same value. In particular:
>> 
>> - A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa and datatype IRI xsd:string
>> - A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa@ and datatype IRI rdf:PlainLiteral
>> - A plain literal with lexical form aaa and language tag xx has the same value as a typed literal with lexical form aaa@xx and datatype IRI rdf:PlainLiteral”
>> 
>> §8 “Some literals are canonical forms. Implementations MAY replace any literal with a canonical form if both are syntactically different, but have the same value. All plain literals, with or without language tag, are canonical forms.”
>> 
>> 
>> CHANGES TO 6.3 Graph Equivalence
>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>> 
>> 
>> §9 Append this leftover sentence, which was removed from 6.5.1:
>> “Note: For comparing RDF Graphs, semantic notions of entailment (see [RDF-SEMANTICS]) are usually more helpful than the syntactic equivalence defined here.”
>> 
>> 
>> EXTENDING THIS TO NUMERIC LITERALS???
>> 
>> (While we're at it, we might also cover equalities between the built-in numeric XSD types, and between different lexical forms of the same built-in XSD datatype.)
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 12 May 2011 10:20:00 GMT