- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Wed, 01 Aug 2012 10:10:25 -0400
- To: public-rdf-comments@w3.org
- CC: David Booth <david@dbooth.org>
OK, both use cases are acknowledged. Given that there is a use case where the time zone is important, how can suggesting only using Z when timezone is not important be any help to application writers?

peter

On 08/01/2012 10:00 AM, David Booth wrote:
> Yes, preserving the timezone is important in use cases like that, in
> which the xsd:datetime not only represents a point in time, but also
> encodes timezone provenance in the literal form, i.e., it encodes both
> when and *where* (in what timezone) the event occurred.
>
> But in other use cases (such as looking at the sequence of events in a
> medical history) the xsd:datetime is used purely as a point in time.
> Comparisons are vital, and timezone provenance is unimportant (and gets
> in the way).
>
> So yes, there are clearly two different classes of use cases for
> xsd:datetime. But this does not mean that we should throw out the baby
> with the bath. Both use cases can be acknowledged.
>
> David
>
> On Wed, 2012-08-01 at 10:37 +0100, Steve Harris wrote:
>> +1
>>
>> We have apps that operate in different timezones, so preserving the
>> timezone in results is import.
>>
>> - Steve
>>
>> On 2012-08-01, at 10:07, Andy Seaborne wrote:
>>
>>> The majority of use for RDF for apps I'm involved in at the moment
>>> are all in the same time place.
>>> Changing the data as it goes through the system, and hence breaking
>>> the display aspect of the data, is a complete non-starter. I want to
>>> know what timezone the dateTime started as.
>>> Display is more important than comparison.
>>>
>>> And, from time spent doing support, the first user expectation is
>>> that stuff that comes out looks like what went in. Not changing the
>>> date part sometimes.
>>> "2012-12-31T22:00:00-05:00"^^xsd:dateTime
>>>
>>> is the same time point as
>>>
>>> "2013-01-01T03:00:00Z"^^xsd:dateTime
>>>
>>> It's a different year.
>>>
>>> Andy
>>>
>>> On 01/08/12 03:57, Peter F. Patel-Schneider wrote:
>>>> On 07/31/2012 10:48 PM, David Booth wrote:
>>>>> On Tue, 2012-07-31 at 16:24 -0400, Peter F. Patel-Schneider wrote:
>>>>>> On 07/31/2012 03:59 PM, David Booth wrote:
>>>>>>> Hi Peter,
>>>>>>>
>>>>>>> On Tue, 2012-07-31 at 15:36 -0400, Peter F. Patel-Schneider wrote:
>>>>>>>> Hmm.
>>>>>>>>
>>>>>>>> Your two examples have different canonical forms in XML. I do not
>>>>>>>> believe that going beyond XML canonicalization is a good idea.
>>>>>>> What downside do you see?
>>>>>> If RDF goes beyond XML canonicalization is it doing something to XML
>>>>>> datatypes that is not part of the XML specification. This appears to
>>>>>> be driving a further wedge between RDF and XML data.
>>>>> I guess I'm not following what you mean. For example, the
>>>>> xsd:datetimeStamp datatype already requires a timezoneFrag to be
>>>>> specified, and one permissible timezoneFrag is "Z" (meaning UTC). If
>>>>> RDF canonicalization suggested that the timezoneFrag always be "Z", what
>>>>> wedge would that drive between RDF and XML data?
>>>> It would say that as far as RDF is concerned, XML data that doesn't use
>>>> Z is somehow second class.
>>>>>> [...]
>>>>>>
>>>>>>>> In any case, I don't see the point here. If equality-unique
>>>>>>>> canonical forms are only encouraged, then applications will still
>>>>>>>> have to do datatype-aware comparisons.
>>>>>>> Only if they need to handle all possible data serializations. If 90%
>>>>>>> of the available datasets use the canonical forms then many apps will
>>>>>>> not need to do datatype-aware comparisons, though the ones that need to
>>>>>>> cover 100% will.
>>>>>> If even 99.99% of available datasets use the canonical forms then all
>>>>>> apps should still be prepared for non-canonical forms. To do otherwise
>>>>>> is to be wrong.
>>>>> It would be wrong for *some* apps, but by no means all. You can't paint
>>>>> all apps with the same brush. For example, if there are 100 datasets
>>>>> available, and 100 apps, and 90 of the datasets use the canonical forms,
>>>>> and 40 of the apps only need the datasets that use the canonical forms,
>>>>> then that substantially lowers the implementation barrier for those 40
>>>>> apps.
>>>> As long as these apps only use the 90, and stay away from the 10. This
>>>> appears to break one of the prime motivations of RDF, that all data can
>>>> be used by anyone.
>>>>>> That is not to say that being wrong is not useful on occasion, but I
>>>>>> don't see that there is any good to be had here in the WG suggesting
>>>>>> canonical forms be used exclusively.
>>>>> I just described some substantial good. I'm not suggesting that
>>>>> canonicalization be used *exclusively*, but merely that it be
>>>>> *encouraged*, because it does significantly simplify processing when it
>>>>> can be used.
>>>> I don't see the "significantly" here at all.
>>>>>>> I think it is important to keep the RDF entry barrier as low as
>>>>>>> possible whenever possible, in order to support scruffy apps that
>>>>>>> are good enough for many purposes, even if they don't handle every
>>>>>>> case.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>> It is important that apps should do the right thing. For example,
>>>>>> should apps ignore character encoding? How hard is doing
>>>>>> datatype-aware processing of literals, compared with all the rest of
>>>>>> the stuff that is required to handle RDF?
>>>>> It depends entirely on the application. In the case of xsd:datetime,
>>>>> for example, it means the literal must be completely parsed into its
>>>>> year, month, day, hour, minute, seconds and timezone offset, and then
>>>>> datetime arithmetic -- which is *not* simple -- must be used to properly
>>>>> add the timezone offset in order to compare two values. All this
>>>>> instead of a simple, string comparison! But the worst part is that the
>>>>> application has to *understand* the different datatypes, and this means
>>>>> that the code either has to special case every datatype, or it has to
>>>>> implement some kind of general datatype-handling framework. Suddenly,
>>>>> an app that could have been a one-off, three-line perl script blows up
>>>>> into something that requires significantly more development effort.
>>>>>
>>>>> The RDF model is so simple. It would be nice if it could be processed
>>>>> very simply whenever possible. "Make the simple cases simple", etc.
>>>> The simplicity of the RDF model is, in my mind, tied up with its
>>>> uniformity. Your proposal severely breaks that uniformity, which is a
>>>> major lossage.
>>>>>> peter
>>>>>>
>>>>>> PS: Yes, I do use text processors to handle RDF, and quite often, even
>>>>>> analysing the 2011 Billion Triple Challenge triples using sed and grep.
>>>>>> However, I check to ensure that the right thing happens.
>>>>> Right, that's exactly the kind of simplified processing that I think we
>>>>> should facilitate as often as possible.
>>>>>
>>>> Sure, as long as it is only in one-off hacks, controlled by experts, who
>>>> can adjust the processing according to the peculiarities of the input.
>>>> As soon as direct expert control goes away, then the app needs to be
>>>> able to consume all RDF, which I see as counter to your proposal.
>>>>
>>>> peter
>>>>
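
For readers following the thread, here is a rough Python sketch of the trade-off under discussion, using Andy's two literals. It is only an illustration, not anything proposed by the WG: the helper names `parse_xsd_datetime` and `to_utc_lexical` are made up for this example, and the code assumes lexical forms that carry an explicit timezone and no fractional seconds (xsd:dateTime allows both a missing timezone and fractional seconds, which this sketch does not handle).

```python
from datetime import datetime, timezone


def parse_xsd_datetime(lexical: str) -> datetime:
    """Parse a timezoned xsd:dateTime lexical form (hypothetical helper)."""
    # Older datetime.fromisoformat versions reject a trailing "Z",
    # so rewrite it as an explicit +00:00 offset first.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        # Assumption of this sketch: untimezoned values are out of scope.
        raise ValueError("lexical form has no timezone: " + lexical)
    return parsed


def to_utc_lexical(lexical: str) -> str:
    """Rewrite a literal with a "Z" timezone (hypothetical helper).

    The original offset is lost, which is the display concern raised
    in the thread. Fractional seconds would also be dropped here.
    """
    utc_value = parse_xsd_datetime(lexical).astimezone(timezone.utc)
    return utc_value.strftime("%Y-%m-%dT%H:%M:%SZ")


local_form = "2012-12-31T22:00:00-05:00"
utc_form = "2013-01-01T03:00:00Z"

# Plain string comparison sees two different literals.
same_string = local_form == utc_form

# Datatype-aware comparison sees one point in time.
same_instant = parse_xsd_datetime(local_form) == parse_xsd_datetime(utc_form)

# Normalizing to "Z" makes string comparison work, but changes the year shown.
normalized = to_utc_lexical(local_form)
```

With these inputs `same_string` is false and `same_instant` is true, and `normalized` equals the "Z" form, so the 2012 date in the original literal is no longer visible after rewriting.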
Received on Wednesday, 1 August 2012 14:11:00 UTC