
Encouraging canonical serializations of datatypes in RDF

From: David Booth <david@dbooth.org>
Date: Tue, 31 Jul 2012 14:31:41 -0400
To: public-rdf-comments <public-rdf-comments@w3.org>
Message-ID: <1343759501.2725.76105.camel@dbooth-laptop>

To enable RDF from one system to be more easily compared with RDF from
another system, it would be helpful if the serialization of datatyped
literals were encouraged to be in a canonical form that would enable
simple string comparison to be used instead of requiring a comparator
that understands the semantics of each datatype.

A particular case in point: xsd:dateTime.

  "2012-07-31T17:16:00+01:00"^^xsd:dateTime

represents the same point in time as

  "2012-07-31T16:16:00Z"^^xsd:dateTime

but the strings are not the same.  This could be avoided by encouraging
a canonical serialization, such as a canonical form of dateTimeStamp
http://www.w3.org/TR/xmlschema11-2/#dateTimeStamp
in which the timezoneFrag is required to be "Z".  (I've just filed a
Bugzilla report on XML Schema Datatypes to ask for such a canonicalization
https://www.w3.org/Bugs/Public/show_bug.cgi?id=18452
because there doesn't seem to be one defined currently.)
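
To illustrate the idea, here is a minimal Python sketch of such a
canonicalization.  The function name canonical_datetime is hypothetical,
and the sketch assumes the lexical form carries an explicit timezone,
uses a four-digit year, and has no fractional seconds.  It is not taken
from any spec.

```python
from datetime import datetime, timezone


def canonical_datetime(lexical: str) -> str:
    """Normalize an xsd:dateTime lexical form to UTC with a "Z" suffix.

    Hypothetical helper: assumes an explicit timezone, a four-digit
    year, and no fractional seconds.
    """
    # Older Python versions do not accept a trailing "Z" in
    # datetime.fromisoformat(), so rewrite it as an explicit offset.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        raise ValueError("an explicit timezone is required")
    # Shift to UTC and emit the fixed-width form ending in "Z".
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


a = canonical_datetime("2012-07-31T17:16:00+01:00")
b = canonical_datetime("2012-07-31T16:16:00Z")
print(a == b)
```

With both literals normalized this way, equality can be tested with a
plain string comparison.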

How forcefully such canonicalization should be encouraged is a matter
for debate.  I do not think it should be a "MUST".  "SHOULD" would be
fine, as there are good reasons why someone may want to generate
non-canonical literals.  But it may also be good enough to just put an
editorial note in the spec saying that "RDF generators are encouraged to
generate literals in a standard, canonical form that allows simple
string comparison to test for equality and greater-than/less-than when
possible".  
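
As a rough illustration of the greater-than/less-than point, the Python
sketch below compares two literals as plain strings before and after
normalizing them to UTC.  The helper name utc_lexical and the second
timestamp are made up for the example, and the sketch assumes explicit
timezones, four-digit years, and no fractional seconds.

```python
from datetime import datetime, timezone


def utc_lexical(lexical: str) -> str:
    """Hypothetical helper: rewrite a timezoned dateTime string in UTC."""
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    parsed = datetime.fromisoformat(lexical)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


earlier = "2012-07-31T17:16:00+01:00"  # 16:16 UTC
later = "2012-07-31T16:30:00Z"         # 16:30 UTC (made-up value)

# As raw strings the order is wrong, because "17" sorts after "16".
raw_order_ok = earlier < later

# After normalization, string order matches chronological order.
canonical_order_ok = utc_lexical(earlier) < utc_lexical(later)

print(raw_order_ok, canonical_order_ok)
```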


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Tuesday, 31 July 2012 18:32:14 UTC
