- From: Nathan <nathan@webr3.org>
- Date: Wed, 01 Aug 2012 15:15:03 +0100
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- CC: public-rdf-comments@w3.org, David Booth <david@dbooth.org>
Although one use case (non-Z) can surely also be handled by including a location in the data, is there any use case where the timezone MUST be included?

Peter F. Patel-Schneider wrote:
> OK, both use cases are acknowledged.
>
> Given that there is a use case where the time zone is important, how can
> suggesting only using Z when timezone is not important be any help to
> application writers?
>
> peter
>
> On 08/01/2012 10:00 AM, David Booth wrote:
>> Yes, preserving the timezone is important in use cases like that, in
>> which the xsd:dateTime not only represents a point in time, but also
>> encodes timezone provenance in the literal form, i.e., it encodes both
>> when and *where* (in what timezone) the event occurred.
>>
>> But in other use cases (such as looking at the sequence of events in a
>> medical history) the xsd:dateTime is used purely as a point in time.
>> Comparisons are vital, and timezone provenance is unimportant (and gets
>> in the way).
>>
>> So yes, there are clearly two different classes of use cases for
>> xsd:dateTime. But this does not mean that we should throw out the baby
>> with the bath. Both use cases can be acknowledged.
>>
>> David
>>
>> On Wed, 2012-08-01 at 10:37 +0100, Steve Harris wrote:
>>> +1
>>>
>>> We have apps that operate in different timezones, so preserving the
>>> timezone in results is important.
>>>
>>> - Steve
>>>
>>> On 2012-08-01, at 10:07, Andy Seaborne wrote:
>>>
>>>> The majority of use for RDF for apps I'm involved in at the moment
>>>> are all in the same time place.
>>>> Changing the data as it goes through the system, and hence breaking
>>>> the display aspect of the data, is a complete non-starter. I want to
>>>> know what timezone the dateTime started as.
>>>> Display is more important than comparison.
>>>>
>>>> And, from time spent doing support, the first user expectation is
>>>> that stuff that comes out looks like what went in. Not changing the
>>>> date part sometimes.
>>>> "2012-12-31T22:00:00-05:00"^^xsd:dateTime
>>>>
>>>> is the same time point as
>>>>
>>>> "2013-01-01T03:00:00Z"^^xsd:dateTime
>>>>
>>>> It's a different year.
>>>>
>>>> Andy
>>>>
>>>> On 01/08/12 03:57, Peter F. Patel-Schneider wrote:
>>>>> On 07/31/2012 10:48 PM, David Booth wrote:
>>>>>> On Tue, 2012-07-31 at 16:24 -0400, Peter F. Patel-Schneider wrote:
>>>>>>> On 07/31/2012 03:59 PM, David Booth wrote:
>>>>>>>> Hi Peter,
>>>>>>>>
>>>>>>>> On Tue, 2012-07-31 at 15:36 -0400, Peter F. Patel-Schneider wrote:
>>>>>>>>> Hmm.
>>>>>>>>>
>>>>>>>>> Your two examples have different canonical forms in XML. I do not
>>>>>>>>> believe that going beyond XML canonicalization is a good idea.
>>>>>>>> What downside do you see?
>>>>>>> If RDF goes beyond XML canonicalization, it is doing something to XML
>>>>>>> datatypes that is not part of the XML specification. This appears to be
>>>>>>> driving a further wedge between RDF and XML data.
>>>>>> I guess I'm not following what you mean. For example, the
>>>>>> xsd:dateTimeStamp datatype already requires a timezoneFrag to be
>>>>>> specified, and one permissible timezoneFrag is "Z" (meaning UTC). If
>>>>>> RDF canonicalization suggested that the timezoneFrag always be "Z", what
>>>>>> wedge would that drive between RDF and XML data?
>>>>> It would say that as far as RDF is concerned, XML data that doesn't use
>>>>> Z is somehow second class.
>>>>>>> [...]
>>>>>>>
>>>>>>>>> In any case, I don't see the point here. If equality-unique
>>>>>>>>> canonical forms are only encouraged, then applications will still
>>>>>>>>> have to do datatype-aware comparisons.
>>>>>>>> Only if they need to handle all possible data serializations. If 90%
>>>>>>>> of the available datasets use the canonical forms then many apps will
>>>>>>>> not need to do datatype-aware comparisons, though the ones that need
>>>>>>>> to cover 100% will.
>>>>>>> If even 99.99% of available datasets use the canonical forms then all
>>>>>>> apps should still be prepared for non-canonical forms. To do otherwise
>>>>>>> is to be wrong.
>>>>>> It would be wrong for *some* apps, but by no means all. You can't paint
>>>>>> all apps with the same brush. For example, if there are 100 datasets
>>>>>> available, and 100 apps, and 90 of the datasets use the canonical forms,
>>>>>> and 40 of the apps only need the datasets that use the canonical forms,
>>>>>> then that substantially lowers the implementation barrier for those 40
>>>>>> apps.
>>>>> As long as these apps only use the 90, and stay away from the 10. This
>>>>> appears to break one of the prime motivations of RDF, that all data can
>>>>> be used by anyone.
>>>>>>> That is not to say that being wrong is not useful on occasion, but I
>>>>>>> don't see that there is any good to be had here in the WG suggesting
>>>>>>> canonical forms be used exclusively.
>>>>>> I just described some substantial good. I'm not suggesting that
>>>>>> canonicalization be used *exclusively*, but merely that it be
>>>>>> *encouraged*, because it does significantly simplify processing when it
>>>>>> can be used.
>>>>> I don't see the "significantly" here at all.
>>>>>>>> I think it is important to keep the RDF entry barrier as low as possible
>>>>>>>> whenever possible, in order to support scruffy apps that are good enough
>>>>>>>> for many purposes, even if they don't handle every case.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>> It is important that apps should do the right thing. For example,
>>>>>>> should apps ignore character encoding? How hard is doing datatype-aware
>>>>>>> processing of literals, compared with all the rest of the stuff that is
>>>>>>> required to handle RDF?
>>>>>> It depends entirely on the application.
>>>>>> In the case of xsd:dateTime, for example, it means the literal must be
>>>>>> completely parsed into its year, month, day, hour, minute, seconds and
>>>>>> timezone offset, and then datetime arithmetic -- which is *not* simple --
>>>>>> must be used to properly add the timezone offset in order to compare two
>>>>>> values. All this instead of a simple string comparison! But the worst
>>>>>> part is that the application has to *understand* the different datatypes,
>>>>>> and this means that the code either has to special case every datatype,
>>>>>> or it has to implement some kind of general datatype-handling framework.
>>>>>> Suddenly, an app that could have been a one-off, three-line Perl script
>>>>>> blows up into something that requires significantly more development
>>>>>> effort.
>>>>>>
>>>>>> The RDF model is so simple. It would be nice if it could be processed
>>>>>> very simply whenever possible. "Make the simple cases simple", etc.
>>>>> The simplicity of the RDF model is, in my mind, tied up with its
>>>>> uniformity. Your proposal severely breaks that uniformity, which is a
>>>>> major lossage.
>>>>>>> peter
>>>>>>>
>>>>>>> PS: Yes, I do use text processors to handle RDF, and quite often, even
>>>>>>> analysing the 2011 Billion Triple Challenge triples using sed and grep.
>>>>>>> However, I check to ensure that the right thing happens.
>>>>>> Right, that's exactly the kind of simplified processing that I think we
>>>>>> should facilitate as often as possible.
>>>>>>
>>>>> Sure, as long as it is only in one-off hacks, controlled by experts, who
>>>>> can adjust the processing according to the peculiarities of the input.
>>>>> As soon as direct expert control goes away, then the app needs to be
>>>>> able to consume all RDF, which I see as counter to your proposal.
>>>>>
>>>>> peter
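
For readers following the thread, here is a minimal sketch (not anything proposed by the participants) of the trade-off being argued, using Andy's two literals. It assumes Python's standard library only; the helper names `parse_xsd_datetime` and `to_utc_z` are hypothetical, and the sketch deliberately ignores timezone-less values, fractional seconds in the output, and other corner cases of the xsd:dateTime lexical space.

```python
from datetime import datetime, timezone


def parse_xsd_datetime(lexical: str) -> datetime:
    """Parse a timezoned xsd:dateTime lexical form (simplified, hypothetical helper)."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit zero offset first.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        # Assumption of this sketch: only timezoned values are handled.
        raise ValueError("timezone offset required in this sketch")
    return value


def to_utc_z(lexical: str) -> str:
    """Rewrite a lexical form with the "Z" timezone (whole seconds only)."""
    utc_value = parse_xsd_datetime(lexical).astimezone(timezone.utc)
    return utc_value.strftime("%Y-%m-%dT%H:%M:%SZ")


local_form = "2012-12-31T22:00:00-05:00"
utc_form = "2013-01-01T03:00:00Z"

# Plain string comparison treats the two literals as different.
strings_equal = local_form == utc_form

# Datatype-aware comparison treats them as the same point in time.
instants_equal = parse_xsd_datetime(local_form) == parse_xsd_datetime(utc_form)

# Normalizing to "Z" makes string comparison work, but the original
# offset (and the local calendar year) is no longer visible.
normalized = to_utc_z(local_form)
```

Here `strings_equal` is false while `instants_equal` is true, which is David's point about string comparison; `normalized` equals the "Z" form and reads as 2013 rather than 2012, which is Andy's point about display.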
Received on Wednesday, 1 August 2012 14:16:23 UTC