- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Thu, 12 May 2011 14:27:06 +0100
- To: Ivan Herman <ivan@w3.org>
- Cc: antoine.zimmermann@insa-lyon.fr, public-rdf-wg@w3.org
On 12 May 2011, at 13:06, Ivan Herman wrote:
>> I'd be tempted to go further and make only the primitive types such as xsd:decimal into RDF canonical forms. This would mean that systems MAY canonicalize all numbers to a single numeric datatype.
>
> Do you mean like the 'canonical' forms in Turtle? I may miss something here.

No. Turtle has syntactic sugar for certain numeric literals; this has nothing to do with canonicalization. (This all goes way beyond ISSUE-12 anyways...)

I was suggesting that perhaps, instead of this:

"+0013"^^xsd:byte => "13"^^xsd:byte

I'd like to say that implementations MAY do this:

"+0013"^^xsd:byte => "13.0"^^xsd:decimal

They'd end up with all numbers represented in a single data type, with a single canonical representation. This makes comparisons quite a bit easier.

Best,
Richard

>
> Ivan
>
>
>> Best,
>> Richard
>>
>>
>>>
>>> On 12/05/2011 12:19, Richard Cyganiak wrote:
>>>> On 12 May 2011, at 09:22, Ivan Herman wrote:
>>>>> - You make the remark on the wiki page on 'extending this to
>>>>> numeric literals', which I would rather say 'extending this to any
>>>>> datatype' (eg, xsd:dateTime, too).
>>>>
>>>> Right -- I changed the section heading on the wiki.
>>>>
>>>>> I have the impression that this is also a consequence of what you
>>>>> write already. You emphasize the 'lexical equality', and you also
>>>>> say "Implementations MAY replace any literal with a canonical form
>>>>> if both are syntactically different, but have the same value."
>>>>> which does not look like being bound to string literals.
>>>>
>>>> The way I wrote it, the only literals marked as canonical forms are
>>>> plain string literals. So the sentence doesn't license replacement
>>>> of, say, +00013 with 13, because no numeric literals have been marked
>>>> as canonical forms. That could be easily changed, of course.
>>>>
>>>>> Do you think there is anything missing in this document to make
>>>>> that picture complete (except, editorially, to possibly add
>>>>> non-string examples)?
>>>>
>>>> If we only want to address string literals, then I think the proposal
>>>> is complete.
>>>>
>>>> If we want to address other XSD literals as well, then some bullet
>>>> points should be added to the list of equalities, and the canonical
>>>> lexical form of some XSD datatypes (e.g., "13.0"^^xsd:decimal) should
>>>> be defined to be canonical forms so that other same-valued literals
>>>> can be replaced with the canonical form. This requires a detailed
>>>> reading of the XSD spec (which I have not done so far).
>>>>
>>>> (RDF Concepts should probably contain a paragraph or two introducing
>>>> the rdf:PlainLiteral datatype and referencing the relevant spec, but
>>>> let's treat that as a separate issue.)
>>>>
>>>>> - I would also propose to make some tiny changes in the Semantics
>>>>> document.
>>>>
>>>> I'll let the editors of that document comment.
>>>>
>>>> Best, Richard
>>>>
>>>>
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>> On May 11, 2011, at 23:23 , Richard Cyganiak wrote:
>>>>>
>>>>>> I took an action today to draft text for RDF Concepts that
>>>>>> resolves ISSUE-12. I put it on the wiki here:
>>>>>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>>>>>>
>>>>>>
>>>>>> A plain text copy is attached below.
>>>>>>
>>>>>> Best, Richard
>>>>>>
>>>>>>
>>>>>>
>>>>>> SHORT SUMMARY
>>>>>>
>>>>>> 1. RDF Concepts puts more emphasis on the distinction between
>>>>>>    (syntactic) “literal equality” and (semantic, important for
>>>>>>    applications) “value equality”
>>>>>> 2. RDF Concepts explicitly points out the specific string value
>>>>>>    equalities that already arise from RDF Semantics
>>>>>> 3. RDF Concepts declares one of the string literal forms as
>>>>>>    canonical
>>>>>> 4. Implementations MAY canonicalize, but don't have to
>>>>>> 5. The canonical form is plain literals.
>>>>>>
>>>>>>
>>>>>> WHY?
>>>>>>
>>>>>> 1. No changes to the abstract syntax required
>>>>>> 2. No changes to any concrete syntax or parser required
>>>>>> 3. No changes to any implementations of any of the existing
>>>>>>    entailment regimes required
>>>>>> 4. Those who are ok with canonicalization can do that, and don't
>>>>>>    need to deal with entailment
>>>>>> 5. Those who don't want to canonicalize, have the option of
>>>>>>    supporting only string value equality at query time, without
>>>>>>    RDFS- and D-Entailment
>>>>>> 6. “MAY canonicalize” softly discourages the use of xsd:string
>>>>>>    typed literals, without abolishing them outright or declaring
>>>>>>    them archaic
>>>>>> 7. Standardizing on xsd:string was never an option because of
>>>>>>    language tags
>>>>>> 8. Standardizing on rdf:PlainLiteral was never an option because
>>>>>>    it MUST NOT be used in serializations that support plain
>>>>>>    literals
>>>>>>
>>>>>>
>>>>>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>>>>>>
>>>>>>
>>>>>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and
>>>>>> move it ahead of 6.5.1
>>>>>>
>>>>>> §2 Add to the beginning: “The value of a plain literal without
>>>>>> language tag is the same Unicode string as its lexical form.
>>>>>>
>>>>>> The value of a plain literal with language tag is a pair
>>>>>> consisting of 1. the same Unicode string as its lexical form, and
>>>>>> 2. its language tag.
>>>>>>
>>>>>> For typed literals, …” (continue with rest of section as is)
>>>>>>
>>>>>> §3 Remove the Note at the end of the section
>>>>>>
>>>>>>
>>>>>> CHANGES TO 6.5.1 Literal Equality
>>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>>>>>>
>>>>>>
>>>>>> §4 Rename section to “6.5.2 Literal Equality and Canonical
>>>>>> Forms”
>>>>>>
>>>>>> §5 Add to the beginning: “Equality of literals can be evaluated
>>>>>> based on their syntax, or based on their value.”
>>>>>>
>>>>>> §6 Change “Two literals are equal …” to: “Two literals are
>>>>>> syntactically equal …” in the current first paragraph.
>>>>>>
>>>>>> §7 Add to the end: “In application contexts, comparing the values
>>>>>> of literals (see section 6.5.1) is usually more helpful than
>>>>>> comparing their syntactic forms. Literals with different lexical
>>>>>> forms and with different datatypes can have the same value. In
>>>>>> particular:
>>>>>>
>>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>>   the same value as a typed literal with lexical form aaa and
>>>>>>   datatype IRI xsd:string
>>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>>   the same value as a typed literal with lexical form aaa@ and
>>>>>>   datatype IRI rdf:PlainLiteral
>>>>>> - A plain literal with lexical form aaa and language tag xx has
>>>>>>   the same value as a typed literal with lexical form aaa@xx and
>>>>>>   datatype IRI rdf:PlainLiteral”
>>>>>>
>>>>>> §8 “Some literals are canonical forms. Implementations MAY
>>>>>> replace any literal with a canonical form if both are
>>>>>> syntactically different, but have the same value.
>>>>>> All plain literals, with or without language tag, are canonical
>>>>>> forms.”
>>>>>>
>>>>>>
>>>>>> CHANGES TO 6.3 Graph Equivalence
>>>>>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>>>>>>
>>>>>>
>>>>>> §9 Append this leftover sentence, which was removed from 6.5.1:
>>>>>> “Note: For comparing RDF Graphs, semantic notions of entailment
>>>>>> (see [RDF-SEMANTICS]) are usually more helpful than the syntactic
>>>>>> equivalence defined here.”
>>>>>>
>>>>>>
>>>>>> EXTENDING THIS TO NUMERIC LITERALS???
>>>>>>
>>>>>> (While we're at it, we might also cover equalities between the
>>>>>> built-in numeric XSD types, and between different lexical forms
>>>>>> of the same built-in XSD datatype.)
>>>>>
>>>>>
>>>>> ----
>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>
>>>>
>>>
>>> --
>>> Antoine Zimmermann
>>> Researcher at:
>>> Laboratoire d'InfoRmatique en Image et Systèmes d'information
>>> Database Group
>>> 7 Avenue Jean Capelle
>>> 69621 Villeurbanne Cedex
>>> France
>>> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
>>> Lecturer at:
>>> Institut National des Sciences Appliquées de Lyon
>>> 20 Avenue Albert Einstein
>>> 69621 Villeurbanne Cedex
>>> France
>>> antoine.zimmermann@insa-lyon.fr
>>> http://zimmer.aprilfoolsreview.com/
>>>
>>
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
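
As a rough illustration of the canonicalization suggested at the top of this message ("+0013"^^xsd:byte => "13.0"^^xsd:decimal), a minimal Python sketch could look like the following. The function names, the (lexical form, datatype) tuple representation, and the short list of decimal-derived types are assumptions made for the example, not anything defined by RDF Concepts or the wiki proposal. Range checks (e.g. xsd:byte being limited to -128..127) and whitespace handling are deliberately left out.

```python
import re
from decimal import Decimal

# Assumption: an illustrative subset of the XSD types derived from xsd:decimal.
DECIMAL_DERIVED = {
    "xsd:decimal", "xsd:integer", "xsd:long",
    "xsd:int", "xsd:short", "xsd:byte",
}

# Lexical patterns: optional sign, digits, and (for xsd:decimal only) a point.
DECIMAL_RE = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")
INTEGER_RE = re.compile(r"[+-]?[0-9]+")


def canonical_decimal(value: Decimal) -> str:
    """Format a Decimal in the canonical lexical form of xsd:decimal."""
    if value == 0:
        return "0.0"  # no negative zero in the decimal value space
    text = format(value, "f")  # fixed-point notation, never an exponent
    if "." not in text:
        return text + ".0"
    text = text.rstrip("0")  # drop trailing fractional zeros
    if text.endswith("."):
        text += "0"  # keep at least one digit after the point
    return text


def canonicalize(lexical: str, datatype: str) -> tuple[str, str]:
    """Map a decimal-derived literal to its xsd:decimal canonical form.

    Literals of other datatypes, and literals whose lexical form does not
    match the expected pattern, are returned unchanged. Value-range
    constraints of the derived types are NOT checked in this sketch.
    """
    if datatype not in DECIMAL_DERIVED:
        return (lexical, datatype)
    pattern = DECIMAL_RE if datatype == "xsd:decimal" else INTEGER_RE
    if not pattern.fullmatch(lexical):
        return (lexical, datatype)
    return (canonical_decimal(Decimal(lexical)), "xsd:decimal")


print(canonicalize("+0013", "xsd:byte"))  # ('13.0', 'xsd:decimal')
```

With every such literal mapped into one datatype and one lexical form, value comparison reduces to plain syntactic comparison of the canonicalized pairs, which is the simplification the message argues for.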
Received on Thursday, 12 May 2011 14:46:24 UTC