- From: David Booth <david@dbooth.org>
- Date: Wed, 01 Aug 2012 15:12:30 -0400
- To: Gavin Carothers <gavin@carothers.name>
- Cc: public-rdf-comments <public-rdf-comments@w3.org>
Hi Gavin,

On Tue, 2012-07-31 at 12:57 -0700, Gavin Carothers wrote:
> On Tue, Jul 31, 2012 at 11:31 AM, David Booth <david@dbooth.org> wrote:
[ . . . ]
> > A particular case in point: xsd:dateTime.
> >
> > "2012-07-31T17:16:00+01:00"^^xsd:dateTime
> >
> > represents the same point in time as
> >
> > "2012-07-31T16:16:00Z"^^xsd:dateTime
>
> No, it doesn't. This is a common misunderstanding regarding date
> times. The time zone is NOT a meaningless value. xsd:dateTime happily
> gets this right in the timezoneCanonicalFragmentMap
> http://www.w3.org/TR/xmlschema11-2/#f-tzCanFragMap

Can you explain? I just tested the above example using the Perl
DateTime::Format::XSD library (to be sure I hadn't made a silly typo),
and it says that they represent the exact same point in time. If you
think that library is wrong, I'd like to know why.

> >
> > but the strings are not the same. This could be avoided by encouraging
> > a canonical serialization such as dateTimeStamp
> > http://www.w3.org/TR/xmlschema11-2/#dateTimeStamp
> > in which the timezoneFrag is required to be "Z". (I've just filed a
> > bugzilla report on XML Datatypes to ask for such a canonicalization
> > https://www.w3.org/Bugs/Public/show_bug.cgi?id=18452
> > because there doesn't seem to be one defined currently.)
> >
> > How forcefully such canonicalization should be encouraged is a matter
> > for debate. I do not think it should be a "MUST". "SHOULD" would be
> > fine, as there are good reasons why someone may want to generate
> > non-canonical literals. But it may also be good enough to just put an
> > editorial note in the spec saying that "RDF generators are encouraged to
> > generate literals in a standard, canonical form that allows simple
> > string comparison to test for equality and greater-than/less-than when
> > possible".
>
> I would object to either MUST or SHOULD. In many systems preserving the
> original lexical form is an important feature.

I agree that preserving the lexical form is important for many
applications, and those should not perform canonicalization. The
RFC2119 definition of "SHOULD" specifically allows deviation for good
reason:
http://www.ietf.org/rfc/rfc2119.txt
[[
3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.
]]

Given this definition, why do you think "SHOULD" would be too strong?

> RDF does this well
> today and clearly defines lexical space as separate from value space.
>
> The current working group direction is to try and specify a canonical
> serialization of both a single triple and possibly of a graph as
> a specific form of N-Triples.

Excellent! I was not aware of this, but I strongly support the idea.

> Canonicalization doesn't stop with just
> datatypes.

Agreed. Datatypes just seemed like the most obvious place to start.

> This should serve the use cases that require
> canonicalization well. If there is a specific use case the current WG
> direction won't serve please send it along.

Okay.

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
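
P.S. For anyone who wants to reproduce the comparison without Perl, here
is a minimal Python sketch of the same check. It is only an illustration
under stated assumptions: the helper names parse_xsd_datetime and
canonical_utc are made up for this example, it handles only
timezone-qualified values, and it drops fractional seconds. It shows
that the two literals compare equal as instants while differing as
strings; it says nothing about whether the original offset is worth
preserving.

```python
from datetime import datetime, timezone


def parse_xsd_datetime(lexical: str) -> datetime:
    """Parse a timezone-qualified xsd:dateTime lexical form (hypothetical helper)."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        # Assumption of this sketch: values without a timezone are rejected.
        raise ValueError("timezone required: " + lexical)
    return value


def canonical_utc(lexical: str) -> str:
    """Rewrite a lexical form with a "Z" timezone (hypothetical helper)."""
    value = parse_xsd_datetime(lexical).astimezone(timezone.utc)
    # Fractional seconds are dropped here to keep the sketch short.
    return value.strftime("%Y-%m-%dT%H:%M:%S") + "Z"


a = "2012-07-31T17:16:00+01:00"
b = "2012-07-31T16:16:00Z"

same_string = a == b                                       # lexical comparison
same_instant = parse_xsd_datetime(a) == parse_xsd_datetime(b)  # value comparison
canon_a = canonical_utc(a)
canon_b = canonical_utc(b)

print(same_string, same_instant, canon_a, canon_b)
```

Once both literals are rewritten with a "Z" timezone, plain string
comparison agrees with the value comparison, which is the property the
suggested editorial note is after.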
Received on Wednesday, 1 August 2012 19:13:04 UTC