Re: ISSUE-3 (DTF): Date and Time Format

On 17 Jan 2012, at 14:16, Phil Archer wrote:
> All being well we can tie down a lot of DCAT next week - and resolve to publish a FPWD :-)

A strong +1 to that intention :-)

Richard


> 
> Phil.
> 
> On 17/01/2012 14:04, Richard Cyganiak wrote:
>> Hi Phil,
>> 
>> On 17 Jan 2012, at 12:43, Phil Archer wrote:
>>>> Q2: Should the date format allow placeholders such as “200?” for the previous decade or “2011-00-00” where month and day are unknown?
>>>> 
>>>> A: No. This is not allowed in W3CDTF or XML Schema Datatypes or ISO 8601 or SQL or any other date spec I'm aware of. Existing date code such as Java's java.util.Date or PHP's strtotime will in the best case just barf, and in the worst case produce nonsense such as turning 2012-02-00 into 2012-01-31. I'm also not aware of any existing government data catalog that codes dates in this notation, or in any other way that can be automatically transformed into this notation. We should not recommend a notation that requires manual re-coding and is incompatible with everything.
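
A minimal sketch of the point about existing date code, using Python rather than the Java or PHP functions named above. The helper name parse_full_date is hypothetical, and the check only covers the plain yyyy-mm-dd form (no time zones or negative years). Placeholders such as "2012-02-00" or "200?" are rejected rather than silently turned into some other date:

```python
import re
from datetime import date

# Lexical form of a full calendar date: four-digit year, month, day.
# (Time zones and negative years are deliberately ignored in this sketch.)
FULL_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_full_date(text):
    """Return a date for a valid yyyy-mm-dd string, or None otherwise."""
    match = FULL_DATE.fullmatch(text)
    if match is None:
        return None  # e.g. "200?" or a bare year
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None  # month or day out of range, e.g. 00


for candidate in ["2012-01-06", "2012-02-00", "2011-00-00", "200?"]:
    print(candidate, "->", parse_full_date(candidate))
```

Only the first candidate yields a date; the other three come back as None.
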
>>> 
>>> It wouldn't be typed, it's a plain literal so there is no effective difference between 198? and "Sometime in the 1980s".
>> 
>> Ok, I misunderstood the intent here; I thought you wanted to use "198?"^^dcterms:W3CDTF and the like. I agree that "198?" is fine as a plain literal, although I'm not sure it's worth pointing out this notation specifically for dealing with uncertain dates.
>> 
>>> It's this forcing of a square peg (xsd:date) into the round hole of reality that I don't like about using xsd:date. It's fine if you always know the full date, but it's bad when dealing with approximations, which are common in things like registers of people's dates and places of birth.
>> 
>> If DCAT were a format for birth registers then I'd totally agree. But it's not, so I don't find this too compelling. We're dealing with publication dates of digital artefacts. The majority of those are sufficiently well-recorded, so we can require one-day precision. There are some exceptions of course, where the date is not exactly known, but I think that allowing plain literals should be sufficient to handle these cases.
>> 
>>>> Q3: Should the date format be dcterms:W3CDTF instead of xsd:date in order to support less specific dates such as yyyy and yyyy-mm?
>>>> 
>>>> A: No. If at all, then it should allow the W3C-recommended datatypes xsd:gYear and xsd:gYearMonth in addition to xsd:date. But I would prefer not to go there as it makes the creation of clients significantly harder (e.g., correct ordering and filtering of dates). The current approach of filling in 01 for unknown months and days is a good compromise between simplicity and representational fidelity, IMO.
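
A rough sketch of the "fill in 01" approach described above, assuming values arrive as yyyy, yyyy-mm or yyyy-mm-dd strings. The function name pad_to_full_date and the sample values are made up for illustration. Once everything is padded to a full date, ordering and filtering need only one comparable type:

```python
import re
from datetime import date

# The three date-only granularities: yyyy, yyyy-mm, yyyy-mm-dd.
PARTIAL_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def pad_to_full_date(text):
    """Fill in 01 for a missing month and/or day and return a date."""
    match = PARTIAL_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a yyyy, yyyy-mm or yyyy-mm-dd value: {text!r}")
    year, month, day = match.groups()
    return date(int(year), int(month or "01"), int(day or "01"))


# Hypothetical issue dates of mixed precision.
issued = ["2011-12-24", "2012", "2011-06", "2012-01-06"]
padded = sorted(pad_to_full_date(value) for value in issued)
print([d.isoformat() for d in padded])
```

The trade-off is visible here too: after padding, "2012" and "2012-01-01" are the same value, so the original precision is lost.
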
>>> 
>>> I agree that DCAT should not use xsd:gYear etc.
>> 
>> Well, I think that xsd:gYear and xsd:gYearMonth would be preferable over dcterms:W3CDTF because the former are W3C-recommended, while the latter is not. But as I said I don't really like either option.
>> 
>> FWIW, in the BTC2011 dataset, xsd:dateTime is the 2nd most frequent datatype, xsd:date is at #7, xsd:gYear at #9. W3CDTF is not in the top 20. Given that the XSD types are widely used and the W3CDTF type is not, even though the XSD types can represent exactly the same values, we should encourage the use of XSD types.
>> 
>> Best,
>> Richard
>> 
>> 
>> 
>>> 
>>> Phil.
>>> 
>>>> 
>>>> 
>>>> On 6 Jan 2012, at 15:19, Government Linked Data Working Group Issue Tracker wrote:
>>>> 
>>>>> 
>>>>> ISSUE-3 (DTF): Date and Time Format
>>>>> 
>>>>> http://www.w3.org/2011/gld/track/issues/3
>>>>> 
>>>>> Raised by: Phil Archer
>>>>> On product:
>>>>> 
>>>>> The current version of DCAT seems a little confused wrt date and time formats. We use dcterms:issued and repeat the DC range declaration of rdfs:Literal and then say it should be datatyped as xsd:date. So far so good. But then the text refers to the W3CDTF document. And they're not the same.
>>>>> 
>>>>> xsd:date requires that values be present for yyyy-mm-dd
>>>>> 
>>>>> W3CDTF is more flexible and allows any of:
>>>>> yyyy
>>>>> yyyy-mm
>>>>> yyyy-mm-dd (and then times can be added)
>>>>> 
>>>>> The DCAT spec says that if a day and/or month are not known then one should use the value 01. This has two problems:
>>>>> 
>>>>> - it assumes that the year is always known;
>>>>> - it makes a date like 2012-01-06 ambiguous, since the '01' may stand for January or for an unknown month.
>>>>> 
>>>>> There may be cases in which the year is not known. For example, 'the 1980s' might be written as 198?. That breaks W3CDTF but it's an approximation. As it happens this came up just yesterday in the EU work that Christophe and I are doing so it's fresh in my mind. Taking all that on board, my proposal is therefore that:
>>>>> 
>>>>> 1. Rather than specify a datatype of xsd:date we specify W3CDTF (which is what DC recommends). We can use the URI http://purl.org/dc/terms/W3CDTF to give the data type.
>>>>> 
>>>>> 2. We recommend using '00' not '01' for unknown months and days.
>>>>> 
>>>>> 3. We explain that just giving the year or the year and month is valid.
>>>>> 
>>>>> 4. Where the year is uncertain, use the ? character to express this but recognise that this breaks the model and is not W3CDTF. Therefore the data should not be so typed.
>>>>> 
>>>>> 5. Where even strings like 198? cannot be provided, plain text such as "sometime in the 1970s or '80s" may be used but this should be avoided if at all possible.
>>>>> 
>>>>> Given DCAT's use cases the latter seems unlikely (it happens in public sector records for things like dates of birth) so maybe we could drop that bit, but 1 - 4 seem valid?
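
A minimal sketch of how points 1, 3, 4 and 5 of the proposal above could be applied when serialising a value, assuming only date-only forms are considered (times are left out). The helper names datatype_for and to_ntriples_literal are hypothetical. Point 2 is not modelled: the W3CDTF note defines months and days as starting at 01, so a value such as "2011-00" falls through to a plain literal here.

```python
import re
from datetime import date

W3CDTF = "http://purl.org/dc/terms/W3CDTF"  # datatype URI named in point 1

# Date-only W3CDTF forms: yyyy, yyyy-mm, yyyy-mm-dd.
DATE_ONLY = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def datatype_for(value):
    """Return the W3CDTF URI if value is a valid date-only form, else None."""
    match = DATE_ONLY.fullmatch(value)
    if match is None:
        return None  # "198?" or free text: leave untyped (points 4 and 5)
    year, month, day = match.groups()
    try:
        # Missing parts are filled in only to check the range of the others.
        date(int(year), int(month or "01"), int(day or "01"))
    except ValueError:
        return None
    return W3CDTF


def to_ntriples_literal(value):
    """Serialise value as a typed or plain literal in N-Triples syntax."""
    datatype = datatype_for(value)
    quoted = '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return quoted if datatype is None else f"{quoted}^^<{datatype}>"


for value in ["2012", "2012-01", "2012-01-06", "198?", "sometime in the 1970s or '80s"]:
    print(to_ntriples_literal(value))
```

The first three values come out typed as W3CDTF; "198?" and the free-text value stay plain literals.
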
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> 
>>> 
>>> Phil Archer
>>> W3C eGovernment
>>> http://www.w3.org/egov/
>>> 
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>> 
>> 
>> 
>> 
> 
> 

Received on Tuesday, 17 January 2012 19:20:32 UTC