- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Thu, 12 May 2011 12:52:51 +0100
- To: antoine.zimmermann@insa-lyon.fr
- Cc: public-rdf-wg@w3.org
On 12 May 2011, at 11:36, Antoine Zimmermann wrote:
> The XSD specifications define the canonical form of xsd:boolean,
> xsd:decimal, xsd:float, xsd:double, xsd:dateTime, xsd:time, xsd:date,
> xsd:hexBinary, xsd:integer, xsd:nonPositiveInteger,
> xsd:negativeInteger, xsd:long, xsd:int, xsd:short, xsd:byte,
> xsd:nonNegativeInteger, xsd:unsignedLong, xsd:unsignedInt,
> xsd:unsignedShort, xsd:unsignedByte, xsd:positiveInteger.
>
> RDF could simply rely on these definitions.

+1. But I'd be tempted to go further and make only the primitive types
such as xsd:decimal into RDF canonical forms. This would mean that
systems MAY canonicalize all numbers to a single numeric datatype.

Best,
Richard

>
> On 12/05/2011 12:19, Richard Cyganiak wrote:
>> On 12 May 2011, at 09:22, Ivan Herman wrote:
>>> - You make the remark on the wiki page on 'extending this to
>>> numeric literals', which I would rather say 'extending this to any
>>> datatype' (eg, xsd:dateTime, too).
>>
>> Right -- I changed the section heading on the wiki.
>>
>>> I have the impression that this is also a consequence of what you
>>> write already. You emphasize the 'lexical equality', and you also
>>> say "Implementations MAY replace any literal with a canonical form
>>> if both are syntactically different, but have the same value."
>>> which does not look like being bound to string literals.
>>
>> The way I wrote it, the only literals marked as canonical forms are
>> plain string literals. So the sentence doesn't license replacement
>> of, say, +00013 with 13, because no numeric literals have been marked
>> as canonical forms. That could be easily changed, of course.
>>
>>> Do you think there is anything missing in this document to make
>>> that picture complete (except, editorially, to possibly add
>>> non-string examples)?
>>
>> If we only want to address string literals, then I think the proposal
>> is complete.
>>
>> If we want to address other XSD literals as well, then some bullet
>> points should be added to the list of equalities, and the canonical
>> lexical form of some XSD datatypes (e.g., "13.0"^^xsd:decimal) should
>> be defined to be canonical forms so that other same-valued literals
>> can be replaced with the canonical form. This requires a detailed
>> reading of the XSD spec (which I have not done so far).
>>
>> (RDF Concepts should probably contain a paragraph or two introducing
>> the rdf:PlainLiteral datatype and referencing the relevant spec, but
>> let's treat that as a separate issue.)
>>
>>> - I would also propose to make some tiny changes in the Semantics
>>> document.
>>
>> I'll let the editors of that document comment.
>>
>> Best, Richard
>>
>>>
>>> Ivan
>>>
>>>
>>> On May 11, 2011, at 23:23 , Richard Cyganiak wrote:
>>>
>>>> I took an action today to draft text for RDF Concepts that
>>>> resolves ISSUE-12. I put it on the wiki here:
>>>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>>>>
>>>> A plain text copy is attached below.
>>>>
>>>> Best, Richard
>>>>
>>>>
>>>> SHORT SUMMARY
>>>>
>>>> 1. RDF Concepts puts more emphasis on the distinction between
>>>>    (syntactic) “literal equality” and (semantic, important for
>>>>    applications) “value equality”
>>>> 2. RDF Concepts explicitly points out the specific string value
>>>>    equalities that already arise from RDF Semantics
>>>> 3. RDF Concepts declares one of the string literal forms as
>>>>    canonical
>>>> 4. Implementations MAY canonicalize, but don't have to
>>>> 5. The canonical form is plain literals.
>>>>
>>>>
>>>> WHY?
>>>>
>>>> 1. No changes to the abstract syntax required
>>>> 2. No changes to any concrete syntax or parser required
>>>> 3. No changes to any implementations of any of the existing
>>>>    entailment regimes required
>>>> 4. Those who are ok with canonicalization can do that, and don't
>>>>    need to deal with entailment
>>>> 5. Those who don't want to canonicalize, have the option of
>>>>    supporting only string value equality at query time, without
>>>>    RDFS- and D-Entailment
>>>> 6. “MAY canonicalize” softly discourages the use of xsd:string
>>>>    typed literals, without abolishing them outright or declaring
>>>>    them archaic
>>>> 7. Standardizing on xsd:string was never an option because of
>>>>    language tags
>>>> 8. Standardizing on rdf:PlainLiteral was never an option because
>>>>    it MUST NOT be used in serializations that support plain
>>>>    literals
>>>>
>>>>
>>>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>>>>
>>>>
>>>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and
>>>> move it ahead of 6.5.1
>>>>
>>>> §2 Add to the beginning: “The value of a plain literal without
>>>> language tag is the same Unicode string as its lexical form.
>>>>
>>>> The value of a plain literal with language tag is a pair
>>>> consisting of 1. the same Unicode string as its lexical form, and
>>>> 2. its language tag.
>>>>
>>>> For typed literals, …” (continue with rest of section as is)
>>>>
>>>> §3 Remove the Note at the end of the section
>>>>
>>>>
>>>> CHANGES TO 6.5.1 Literal Equality
>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>>>>
>>>>
>>>> §4 Rename section to “6.5.2 Literal Equality and Canonical
>>>> Forms”
>>>>
>>>> §5 Add to the beginning: “Equality of literals can be evaluated
>>>> based on their syntax, or based on their value.”
>>>>
>>>> §6 Change “Two literals are equal …” to: “Two literals are
>>>> syntactically equal …” in the current first paragraph.
>>>>
>>>> §7 Add to the end: “In application contexts, comparing the values
>>>> of literals (see section 6.5.1) is usually more helpful than
>>>> comparing their syntactic forms. Literals with different lexical
>>>> forms and with different datatypes can have the same value. In
>>>> particular:
>>>>
>>>> - A plain literal with lexical form aaa and no language tag has
>>>>   the same value as a typed literal with lexical form aaa and
>>>>   datatype IRI xsd:string
>>>> - A plain literal with lexical form aaa and no language tag has
>>>>   the same value as a typed literal with lexical form aaa@ and
>>>>   datatype IRI rdf:PlainLiteral
>>>> - A plain literal with lexical form aaa and language tag xx has
>>>>   the same value as a typed literal with lexical form aaa@xx and
>>>>   datatype IRI rdf:PlainLiteral”
>>>>
>>>> §8 “Some literals are canonical forms. Implementations MAY
>>>> replace any literal with a canonical form if both are
>>>> syntactically different, but have the same value. All plain
>>>> literals, with or without language tag, are canonical forms.”
>>>>
>>>>
>>>> CHANGES TO 6.3 Graph Equivalence
>>>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>>>>
>>>>
>>>> §9 Append this leftover sentence, which was removed from 6.5.1:
>>>> “Note: For comparing RDF Graphs, semantic notions of entailment
>>>> (see [RDF-SEMANTICS]) are usually more helpful than the syntactic
>>>> equivalence defined here.”
>>>>
>>>>
>>>> EXTENDING THIS TO NUMERIC LITERALS???
>>>>
>>>> (While we're at it, we might also cover equalities between the
>>>> built-in numeric XSD types, and between different lexical forms
>>>> of the same built-in XSD datatype.)
>>>
>>>
>>> ----
>>> Ivan Herman, W3C Semantic Web Activity Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>
>>
>
> --
> Antoine Zimmermann
> Researcher at:
> Laboratoire d'InfoRmatique en Image et Systèmes d'information
> Database Group
> 7 Avenue Jean Capelle
> 69621 Villeurbanne Cedex
> France
> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
> Lecturer at:
> Institut National des Sciences Appliquées de Lyon
> 20 Avenue Albert Einstein
> 69621 Villeurbanne Cedex
> France
> antoine.zimmermann@insa-lyon.fr
> http://zimmer.aprilfoolsreview.com/
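
The string-literal rules in the quoted proposal (§2, §7 and §8) can be
sketched in Python roughly as follows. This is only an illustration
under stated assumptions: the Literal class and the functions value_of,
value_equal and canonicalize are hypothetical names, not taken from the
proposal or from any RDF library; language tags are compared as-is; and
the numeric extension discussed in this thread is not covered.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"

# Hypothetical model of a literal. The dataclass-generated == compares
# all three fields, which stands in for syntactic equality here.
@dataclass(frozen=True)
class Literal:
    lexical: str
    lang: Optional[str] = None      # only for plain literals with a language tag
    datatype: Optional[str] = None  # only for typed literals

Value = Union[str, Tuple[str, str]]

def value_of(lit: Literal) -> Optional[Value]:
    """Value of a literal per §2/§7; None for datatypes this sketch ignores."""
    if lit.datatype is None:
        # Plain literal: the string itself, or a (string, language tag) pair.
        return lit.lexical if lit.lang is None else (lit.lexical, lit.lang)
    if lit.datatype == XSD_STRING:
        return lit.lexical
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Lexical form is "text@lang"; an empty lang means no language tag.
        text, sep, lang = lit.lexical.rpartition("@")
        if not sep:
            return None  # no "@" at all: treated as ill-formed in this sketch
        return text if lang == "" else (text, lang)
    return None

def value_equal(a: Literal, b: Literal) -> bool:
    """Value equality; falls back to syntactic equality for unknown datatypes."""
    va = value_of(a)
    return a == b or (va is not None and va == value_of(b))

def canonicalize(lit: Literal) -> Literal:
    """§8: replace a literal by the same-valued plain literal, if there is one."""
    v = value_of(lit)
    if v is None:
        return lit
    if isinstance(v, tuple):
        return Literal(v[0], lang=v[1])
    return Literal(v)
```

With these definitions, Literal("aaa") and Literal("aaa",
datatype=XSD_STRING) are syntactically different but value-equal, and
canonicalize maps the typed form to the plain one. An implementation
that does not canonicalize could still call value_equal at query time.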
Received on Thursday, 12 May 2011 11:53:23 UTC