
Re: Proposal for ISSUE-12, string literals

From: Ivan Herman <ivan@w3.org>
Date: Thu, 12 May 2011 14:06:37 +0200
Cc: antoine.zimmermann@insa-lyon.fr, public-rdf-wg@w3.org
Message-Id: <C63B0ED9-E9EA-49A2-AFD9-FE05E80BE522@w3.org>
To: Richard Cyganiak <richard@cyganiak.de>

On May 12, 2011, at 13:52 , Richard Cyganiak wrote:

> On 12 May 2011, at 11:36, Antoine Zimmermann wrote:
>> The XSD specifications define the canonical form of xsd:boolean, xsd:decimal, xsd:float, xsd:double, xsd:dateTime, xsd:time, xsd:date, xsd:hexBinary, xsd:integer, xsd:nonPositiveInteger, xsd:negativeInteger, xsd:long, xsd:int, xsd:short, xsd:byte, xsd:nonNegativeInteger, xsd:unsignedLong, xsd:unsignedInt, xsd:unsignedShort, xsd:unsignedByte, xsd:positiveInteger.
>> 
>> RDF could simply rely on these definitions.
> 
> +1.
> 

Me too.
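
To make "rely on these definitions" concrete, here is a minimal Python sketch of two of the XSD canonical mappings listed above (xsd:integer and xsd:boolean). The function names are hypothetical, not taken from any spec or library, and whitespace handling is deliberately left out.

```python
import re


def canonical_integer(lexical: str) -> str:
    """Canonical xsd:integer form: no '+' sign and no leading zeros."""
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise ValueError(f"not an xsd:integer lexical form: {lexical!r}")
    # int() drops the '+' sign and leading zeros; "-0" becomes "0".
    return str(int(lexical))


def canonical_boolean(lexical: str) -> str:
    """Canonical xsd:boolean form: 'true' or 'false' (never '1' or '0')."""
    if lexical in ("true", "1"):
        return "true"
    if lexical in ("false", "0"):
        return "false"
    raise ValueError(f"not an xsd:boolean lexical form: {lexical!r}")
```

With this, canonical_integer("+00013") and canonical_integer("13") give the same string, so a store that canonicalizes can fall back to plain lexical comparison.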

> But I'd be tempted to go further and make only the primitive types such as xsd:decimal into RDF canonical forms. This would mean that systems MAY canonicalize all numbers to a single numeric datatype.
> 

Do you mean like the 'canonical' forms in Turtle? I may be missing something here.

Ivan



> Best,
> Richard
> 
> 
> 
>> 
>> On 12/05/2011 12:19, Richard Cyganiak wrote:
>>> On 12 May 2011, at 09:22, Ivan Herman wrote:
>>>> - You make the remark on the wiki page on 'extending this to
>>>> numeric literals', which I would rather say 'extending this to any
>>>> datatype' (eg, xsd:dateTime, too).
>>> 
>>> Right -- I changed the section heading on the wiki.
>>> 
>>>> I have the impression that this is also a consequence of what you
>>>> write already. You emphasize the 'lexical equality', and you also
>>>> say "Implementations MAY replace any literal with a canonical form
>>>> if both are syntactically different, but have the same value."
>>>> which does not look like being bound to string literals.
>>> 
>>> The way I wrote it, the only literals marked as canonical forms are
>>> plain string literals. So the sentence doesn't license replacement
>>> of, say, +00013 with 13, because no numeric literals have been marked
>>> as canonical forms. That could be easily changed, of course.
>>> 
>>>> Do you think there is anything missing in this document to make
>>>> that picture complete (except, editorially, to possibly add
>>>> non-string examples)?
>>> 
>>> If we only want to address string literals, then I think the proposal
>>> is complete.
>>> 
>>> If we want to address other XSD literals as well, then some bullet
>>> points should be added to the list of equalities, and the canonical
>>> lexical form of some XSD datatypes (e.g., "13.0"^^xsd:decimal) should
>>> be defined to be canonical forms so that other same-valued literals
>>> can be replaced with the canonical form. This requires a detailed
>>> reading of the XSD spec (which I have not done so far).
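
A minimal sketch of what marking "13.0"^^xsd:decimal as a canonical form could mean in practice, assuming the XSD 1.0 rule for xsd:decimal (no "+" sign, mandatory decimal point, at least one digit on each side, otherwise no leading or trailing zeros). The function name is hypothetical. Because every xsd:integer lexical form is also a valid xsd:decimal lexical form, the same function could also serve to fold integer-typed literals into a single numeric datatype.

```python
import re


def canonical_decimal(lexical: str) -> str:
    """Canonical xsd:decimal lexical form, following the XSD 1.0 rule."""
    m = re.fullmatch(r"([+-]?)([0-9]*)(?:\.([0-9]*))?", lexical)
    # Reject inputs with no digits at all, such as "", "+" or ".".
    if m is None or not (m.group(2) or m.group(3)):
        raise ValueError(f"not an xsd:decimal lexical form: {lexical!r}")
    sign = m.group(1)
    int_part = m.group(2).lstrip("0") or "0"            # at least one digit left
    frac_part = (m.group(3) or "").rstrip("0") or "0"   # at least one digit right
    # Zero has no sign in the value space, so "-0.0" becomes "0.0".
    negative = sign == "-" and (int_part, frac_part) != ("0", "0")
    return ("-" if negative else "") + int_part + "." + frac_part
```

Here "+00013", "13" and "13.00" all map to "13.0", which is the replacement the proposal would license once numeric canonical forms are declared.
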
>>> 
>>> (RDF Concepts should probably contain a paragraph or two introducing
>>> the rdf:PlainLiteral datatype and referencing the relevant spec, but
>>> let's treat that as a separate issue.)
>>> 
>>>> - I would also propose to make some tiny changes in the Semantics
>>>> document.
>>> 
>>> I'll let the editors of that document comment.
>>> 
>>> Best, Richard
>>> 
>>> 
>>>> 
>>>> Ivan
>>>> 
>>>> 
>>>> On May 11, 2011, at 23:23 , Richard Cyganiak wrote:
>>>> 
>>>>> I took an action today to draft text for RDF Concepts that
>>>>> resolves ISSUE-12. I put it on the wiki here:
>>>>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>>>>> 
>>>>> 
>>>>> A plain text copy is attached below.
>>>>> 
>>>>> Best, Richard
>>>>> 
>>>>> 
>>>>> 
>>>>> SHORT SUMMARY
>>>>> 
>>>>> 1. RDF Concepts puts more emphasis on the distinction between
>>>>>    (syntactic) “literal equality” and (semantic, important for
>>>>>    applications) “value equality”
>>>>> 2. RDF Concepts explicitly points out the specific string value
>>>>>    equalities that already arise from RDF Semantics
>>>>> 3. RDF Concepts declares one of the string literal forms as
>>>>>    canonical
>>>>> 4. Implementations MAY canonicalize, but don't have to
>>>>> 5. The canonical form is plain literals.
>>>>> 
>>>>> 
>>>>> WHY?
>>>>> 
>>>>> 1. No changes to the abstract syntax required
>>>>> 2. No changes to any concrete syntax or parser required
>>>>> 3. No changes to any implementations of any of the existing
>>>>>    entailment regimes required
>>>>> 4. Those who are ok with canonicalization can do that, and don't
>>>>>    need to deal with entailment
>>>>> 5. Those who don't want to canonicalize have the option of
>>>>>    supporting only string value equality at query time, without
>>>>>    RDFS- and D-Entailment
>>>>> 6. “MAY canonicalize” softly discourages the use of xsd:string
>>>>>    typed literals, without abolishing them outright or declaring
>>>>>    them archaic
>>>>> 7. Standardizing on xsd:string was never an option because of
>>>>>    language tags
>>>>> 8. Standardizing on rdf:PlainLiteral was never an option because
>>>>>    it MUST NOT be used in serializations that support plain
>>>>>    literals
>>>>> 
>>>>> 
>>>>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>>>>> 
>>>>> 
>>>>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and
>>>>> move it ahead of 6.5.1
>>>>> 
>>>>> §2 Add to the beginning: “The value of a plain literal without
>>>>> language tag is the same Unicode string as its lexical form.
>>>>> 
>>>>> The value of a plain literal with language tag is a pair
>>>>> consisting of 1. the same Unicode string as its lexical form, and
>>>>> 2. its language tag.
>>>>> 
>>>>> For typed literals, …” (continue with rest of section as is)
>>>>> 
>>>>> §3 Remove the Note at the end of the section
>>>>> 
>>>>> 
>>>>> CHANGES TO 6.5.1 Literal Equality
>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>>>>> 
>>>>> 
>>>>> §4 Rename section to “6.5.2 Literal Equality and Canonical
>>>>> Forms”
>>>>> 
>>>>> §5 Add to the beginning: “Equality of literals can be evaluated
>>>>> based on their syntax, or based on their value.”
>>>>> 
>>>>> §6 Change “Two literals are equal …” to: “Two literals are
>>>>> syntactically equal …” in the current first paragraph.
>>>>> 
>>>>> §7 Add to the end: “In application contexts, comparing the values
>>>>> of literals (see section 6.5.1) is usually more helpful than
>>>>> comparing their syntactic forms. Literals with different lexical
>>>>> forms and with different datatypes can have the same value. In
>>>>> particular:
>>>>> 
>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>   the same value as a typed literal with lexical form aaa and
>>>>>   datatype IRI xsd:string
>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>   the same value as a typed literal with lexical form aaa@ and
>>>>>   datatype IRI rdf:PlainLiteral
>>>>> - A plain literal with lexical form aaa and language tag xx has
>>>>>   the same value as a typed literal with lexical form aaa@xx and
>>>>>   datatype IRI rdf:PlainLiteral”
>>>>> 
>>>>> §8 “Some literals are canonical forms. Implementations MAY
>>>>> replace any literal with a canonical form if both are
>>>>> syntactically different, but have the same value. All plain
>>>>> literals, with or without language tag, are canonical forms.”
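
The value equalities in §7 and the MAY-canonicalize rule in §8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Literal class and the value_of and canonicalize functions are hypothetical names, only the string-valued cases are covered, and language tags are lowercased as in RDF Concepts 2004.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


@dataclass(frozen=True)
class Literal:
    """Syntactic literal; dataclass equality is syntactic equality."""
    lexical: str
    language: Optional[str] = None  # plain literals only
    datatype: Optional[str] = None  # typed literals only


def value_of(lit: Literal) -> Union[str, Tuple[str, str]]:
    """Return a string, or a (string, language tag) pair."""
    if lit.datatype is None:
        if lit.language is None:
            return lit.lexical
        return (lit.lexical, lit.language.lower())
    if lit.datatype == XSD_STRING:
        return lit.lexical
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Lexical form is "text@tag"; the tag is empty when there is no language.
        text, sep, tag = lit.lexical.rpartition("@")
        if not sep:
            raise ValueError("rdf:PlainLiteral lexical form needs an '@'")
        return (text, tag.lower()) if tag else text
    raise NotImplementedError("only string-valued literals are covered")


def canonicalize(lit: Literal) -> Literal:
    """Replace xsd:string and rdf:PlainLiteral literals with plain literals."""
    if lit.datatype not in (XSD_STRING, RDF_PLAIN_LITERAL):
        return lit  # already plain, or outside the scope of this sketch
    val = value_of(lit)
    if isinstance(val, tuple):
        return Literal(val[0], language=val[1])
    return Literal(val)
```

Literal("aaa") and Literal("aaa", datatype=XSD_STRING) compare unequal as dataclasses (syntactic equality), yet value_of returns the same string for both, and canonicalize maps the typed one onto the plain one.
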
>>>>> 
>>>>> 
>>>>> CHANGES TO 6.3 Graph Equivalence
>>>>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>>>>> 
>>>>> 
>>>>> §9 Append this leftover sentence, which was removed from 6.5.1:
>>>>> “Note: For comparing RDF Graphs, semantic notions of entailment
>>>>> (see [RDF-SEMANTICS]) are usually more helpful than the syntactic
>>>>> equivalence defined here.”
>>>>> 
>>>>> 
>>>>> EXTENDING THIS TO NUMERIC LITERALS???
>>>>> 
>>>>> (While we're at it, we might also cover equalities between the
>>>>> built-in numeric XSD types, and between different lexical forms
>>>>> of the same built-in XSD datatype.)
>>>> 
>>>> 
>>>> ----
>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> -- 
>> Antoine Zimmermann
>> Researcher at:
>> Laboratoire d'InfoRmatique en Image et Systèmes d'information
>> Database Group
>> 7 Avenue Jean Capelle
>> 69621 Villeurbanne Cedex
>> France
>> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
>> Lecturer at:
>> Institut National des Sciences Appliquées de Lyon
>> 20 Avenue Albert Einstein
>> 69621 Villeurbanne Cedex
>> France
>> antoine.zimmermann@insa-lyon.fr
>> http://zimmer.aprilfoolsreview.com/
>> 
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 12 May 2011 12:05:31 GMT
