- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 12 May 2011 17:49:37 +0200
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: antoine.zimmermann@insa-lyon.fr, public-rdf-wg@w3.org
On May 12, 2011, at 15:27, Richard Cyganiak wrote:

> On 12 May 2011, at 13:06, Ivan Herman wrote:
>>> I'd be tempted to go further and make only the primitive types such as xsd:decimal into RDF canonical forms. This would mean that systems MAY canonicalize all numbers to a single numeric datatype.
>>
>> Do you mean like the 'canonical' forms in Turtle? I may miss something here.
>
> No. Turtle has syntactic sugar for certain numeric literals; this has nothing to do with canonicalization.
>
> (This all goes way beyond ISSUE-12 anyways...)
>
> I was suggesting that perhaps, instead of this:
> "+0013"^^xsd:byte => "13"^^xsd:byte
>
> I'd like to say that implementations MAY do this:
> "+0013"^^xsd:byte => "13.0"^^xsd:decimal
>

I have not made up my mind on this, just thinking out 'loud': in many programming environments I would like to have access to the fact that something is a byte and not a decimal because the implementation of the latter might be way more complex and slow than the former. In other words, I am not sure RDF should be too 'smart' about it. If the user decided to define something as a byte, we should keep it as a byte...

Ivan

> They'd end up with all numbers represented in a single data type, with a single canonical representation. This makes comparisons quite a bit easier.
>
> Best,
> Richard
>
>>
>> Ivan
>>
>>> Best,
>>> Richard
>>>
>>>>
>>>> On 12/05/2011 12:19, Richard Cyganiak wrote:
>>>>> On 12 May 2011, at 09:22, Ivan Herman wrote:
>>>>>> - You make the remark on the wiki page on 'extending this to
>>>>>> numeric literals', which I would rather say 'extending this to any
>>>>>> datatype' (eg, xsd:dateTime, too).
>>>>>
>>>>> Right -- I changed the section heading on the wiki.
>>>>>
>>>>>> I have the impression that this is also a consequence of what you
>>>>>> write already. You emphasize the 'lexical equality', and you also
>>>>>> say "Implementations MAY replace any literal with a canonical form
>>>>>> if both are syntactically different, but have the same value."
>>>>>> which does not look like being bound to string literals.
>>>>>
>>>>> The way I wrote it, the only literals marked as canonical forms are
>>>>> plain string literals. So the sentence doesn't license replacement
>>>>> of, say, +00013 with 13, because no numeric literals have been marked
>>>>> as canonical forms. That could be easily changed, of course.
>>>>>
>>>>>> Do you think there is anything missing in this document to make
>>>>>> that picture complete (except, editorially, to possibly add
>>>>>> non-string examples)?
>>>>>
>>>>> If we only want to address string literals, then I think the proposal
>>>>> is complete.
>>>>>
>>>>> If we want to address other XSD literals as well, then some bullet
>>>>> points should be added to the list of equalities, and the canonical
>>>>> lexical form of some XSD datatypes (e.g., "13.0"^^xsd:decimal) should
>>>>> be defined to be canonical forms so that other same-valued literals
>>>>> can be replaced with the canonical form. This requires a detailed
>>>>> reading of the XSD spec (which I have not done so far).
>>>>>
>>>>> (RDF Concepts should probably contain a paragraph or two introducing
>>>>> the rdf:PlainLiteral datatype and referencing the relevant spec, but
>>>>> let's treat that as a separate issue.)
>>>>>
>>>>>> - I would also propose to make some tiny changes in the Semantics
>>>>>> document.
>>>>>
>>>>> I'll let the editors of that document comment.
>>>>>
>>>>> Best, Richard
>>>>>
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>> On May 11, 2011, at 23:23 , Richard Cyganiak wrote:
>>>>>>
>>>>>>> I took an action today to draft text for RDF Concepts that
>>>>>>> resolves ISSUE-12. I put it on the wiki here:
>>>>>>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>>>>>>>
>>>>>>> A plain text copy is attached below.
>>>>>>>
>>>>>>> Best, Richard
>>>>>>>
>>>>>>>
>>>>>>> SHORT SUMMARY
>>>>>>>
>>>>>>> 1. RDF Concepts puts more emphasis on the distinction between
>>>>>>>    (syntactic) “literal equality” and (semantic, important for
>>>>>>>    applications) “value equality”
>>>>>>> 2. RDF Concepts explicitly points out the specific string value
>>>>>>>    equalities that already arise from RDF Semantics
>>>>>>> 3. RDF Concepts declares one of the string literal forms as
>>>>>>>    canonical
>>>>>>> 4. Implementations MAY canonicalize, but don't have to
>>>>>>> 5. The canonical form is plain literals.
>>>>>>>
>>>>>>>
>>>>>>> WHY?
>>>>>>>
>>>>>>> 1. No changes to the abstract syntax required
>>>>>>> 2. No changes to any concrete syntax or parser required
>>>>>>> 3. No changes to any implementations of any of the existing
>>>>>>>    entailment regimes required
>>>>>>> 4. Those who are ok with canonicalization can do that, and don't
>>>>>>>    need to deal with entailment
>>>>>>> 5. Those who don't want to canonicalize, have the option of
>>>>>>>    supporting only string value equality at query time, without
>>>>>>>    RDFS- and D-Entailment
>>>>>>> 6. “MAY canonicalize” softly discourages the use of xsd:string
>>>>>>>    typed literals, without abolishing them outright or declaring
>>>>>>>    them archaic
>>>>>>> 7. Standardizing on xsd:string was never an option because of
>>>>>>>    language tags
>>>>>>> 8. Standardizing on rdf:PlainLiteral was never an option because
>>>>>>>    it MUST NOT be used in serializations that support plain
>>>>>>>    literals
>>>>>>>
>>>>>>>
>>>>>>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>>>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>>>>>>>
>>>>>>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and
>>>>>>> move it ahead of 6.5.1
>>>>>>>
>>>>>>> §2 Add to the beginning: “The value of a plain literal without
>>>>>>> language tag is the same Unicode string as its lexical form.
>>>>>>>
>>>>>>> The value of a plain literal with language tag is a pair
>>>>>>> consisting of 1. the same Unicode string as its lexical form, and
>>>>>>> 2. its language tag.
>>>>>>>
>>>>>>> For typed literals, …” (continue with rest of section as is)
>>>>>>>
>>>>>>> §3 Remove the Note at the end of the section
>>>>>>>
>>>>>>>
>>>>>>> CHANGES TO 6.5.1 Literal Equality
>>>>>>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>>>>>>>
>>>>>>> §4 Rename section to “6.5.2 Literal Equality and Canonical
>>>>>>> Forms”
>>>>>>>
>>>>>>> §5 Add to the beginning: “Equality of literals can be evaluated
>>>>>>> based on their syntax, or based on their value.”
>>>>>>>
>>>>>>> §6 Change “Two literals are equal …” to: “Two literals are
>>>>>>> syntactically equal …” in the current first paragraph.
>>>>>>>
>>>>>>> §7 Add to the end: “In application contexts, comparing the values
>>>>>>> of literals (see section 6.5.1) is usually more helpful than
>>>>>>> comparing their syntactic forms. Literals with different lexical
>>>>>>> forms and with different datatypes can have the same value. In
>>>>>>> particular:
>>>>>>>
>>>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>>>   the same value as a typed literal with lexical form aaa and
>>>>>>>   datatype IRI xsd:string
>>>>>>> - A plain literal with lexical form aaa and no language tag has
>>>>>>>   the same value as a typed literal with lexical form aaa@ and
>>>>>>>   datatype IRI rdf:PlainLiteral
>>>>>>> - A plain literal with lexical form aaa and language tag xx has
>>>>>>>   the same value as a typed literal with lexical form aaa@xx and
>>>>>>>   datatype IRI rdf:PlainLiteral”
>>>>>>>
>>>>>>> §8 “Some literals are canonical forms. Implementations MAY
>>>>>>> replace any literal with a canonical form if both are
>>>>>>> syntactically different, but have the same value. All plain
>>>>>>> literals, with or without language tag, are canonical forms.”
>>>>>>>
>>>>>>>
>>>>>>> CHANGES TO 6.3 Graph Equivalence
>>>>>>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>>>>>>>
>>>>>>> §9 Append this leftover sentence, which was removed from 6.5.1:
>>>>>>> “Note: For comparing RDF Graphs, semantic notions of entailment
>>>>>>> (see [RDF-SEMANTICS]) are usually more helpful than the syntactic
>>>>>>> equivalence defined here.”
>>>>>>>
>>>>>>>
>>>>>>> EXTENDING THIS TO NUMERIC LITERALS???
>>>>>>>
>>>>>>> (While we're at it, we might also cover equalities between the
>>>>>>> built-in numeric XSD types, and between different lexical forms
>>>>>>> of the same built-in XSD datatype.)
>>>>>>
>>>>
>>>> --
>>>> Antoine Zimmermann
>>>> Researcher at:
>>>> Laboratoire d'InfoRmatique en Image et Systèmes d'information
>>>> Database Group
>>>> 7 Avenue Jean Capelle
>>>> 69621 Villeurbanne Cedex
>>>> France
>>>> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
>>>> Lecturer at:
>>>> Institut National des Sciences Appliquées de Lyon
>>>> 20 Avenue Albert Einstein
>>>> 69621 Villeurbanne Cedex
>>>> France
>>>> antoine.zimmermann@insa-lyon.fr
>>>> http://zimmer.aprilfoolsreview.com/
>>>>

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 12 May 2011 15:50:29 UTC
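
The two canonicalization options weighed in this thread can be illustrated with a minimal Python sketch. It is not taken from any RDF library: the function names, the `INTEGER_TYPES` set, and the choice to model a literal as a `(lexical form, datatype IRI)` pair are all assumptions made for illustration, and range checking (for example, that an xsd:byte fits in -128..127) is left out.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical subset of integer-valued XSD datatypes handled by this sketch.
INTEGER_TYPES = {XSD + name for name in ("byte", "short", "int", "long", "integer")}


def canonical_same_type(lexical, datatype):
    """Option 1: normalize the lexical form but keep the original datatype."""
    if datatype in INTEGER_TYPES:
        # int() accepts a leading sign and leading zeros, e.g. "+0013".
        return (str(int(lexical)), datatype)
    return (lexical, datatype)


def decimal_lexical(value):
    """Format a Decimal with a decimal point and at least one fraction digit."""
    text = format(value, "f")  # fixed-point notation, never an exponent
    if "." not in text:
        text += ".0"
    int_part, frac_part = text.split(".")
    frac_part = frac_part.rstrip("0") or "0"
    return int_part + "." + frac_part


def canonical_decimal(lexical, datatype):
    """Option 2: collapse every supported numeric literal to xsd:decimal."""
    if datatype in INTEGER_TYPES or datatype == XSD + "decimal":
        return (decimal_lexical(Decimal(lexical)), XSD + "decimal")
    return (lexical, datatype)


# Option 1 keeps the byte; option 2 discards that information.
print(canonical_same_type("+0013", XSD + "byte"))
print(canonical_decimal("+0013", XSD + "byte"))
```

Under option 2, two literals with the same numeric value compare equal after canonicalization by plain tuple equality, which is the convenience Richard points to; the cost Ivan raises is that the result no longer records that the value was declared as a byte.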