
Re: ISSUE-12 (valuesForDataFormat): What values to use to describe formats of dcat:Distribution? [DCAT]

From: Sarven Capadisli <sarven.capadisli@deri.org>
Date: Fri, 10 Feb 2012 14:39:17 +0000
Message-ID: <4F352C15.5040003@deri.org>
To: Richard Cyganiak <richard@cyganiak.de>
CC: Government Linked Data Working Group WG <public-gld-wg@w3.org>, Government Linked Data Working Group Issue Tracker <sysbot+tracker@w3.org>
On 12-02-09 08:33 PM, Richard Cyganiak wrote:
> On 9 Feb 2012, at 17:53, Sarven Capadisli wrote:
>> I favour option 3, provided that we do some homework and gather a list of formats that are currently published in the wild, and mint up those URIs e.g., http://www.w3.org/ns/formats/Shapefile .
>
> This working group is scheduled to end in April 2013, but that won't stop people from inventing new file formats afterwards. So this approach either dooms DCAT to built-in obsolescence (bad!), or imposes a long-term maintenance burden on someone (and who will that be?)

You are right.

> The Right Thing to do would be to get IETF to mint URIs for all media types, and get ESRI to register a media type for their file format, etc. This may not be feasible.

I agree.

> A review of the file formats present in typical popular catalogs may help to settle the question. Many use a small controlled vocabulary of formats anyways, while others use free text.

So, back to option 0, i.e., dcterms:format "format of the day"?

At the end of the day, there is a trade-off here: either we make things
easier for the publishers or for the vocabulary maintainers. I think we
are more likely to make things difficult for the publishers than for the
vocabulary maintainers because, at any point in time, I'd wager that the
maintainers are likely to compile a list of the existing, common formats
out there.

Hence, I tend to favour options that make things easier for the
publishers: whether they can find the proper string or URI for their
format, and whether they have to update their dataset if and when
unknown formats become standardized.

I'd be content to see an agreement on the basic option 0. If a format is
not standardized, publishers simply don't have to specify that
information. That lowers the probability of unintended statements about
their data, and it has the added benefit of leaving room to add another
triple once the format type is widely recognized. That is the least we
can say as a recommended practice: "If it doesn't exist out there, don't
worry about it". :) The same recommendation applies to URIs for file
formats as well. IMHO, Unique URIs for File Formats is /safe enough/ in
that regard.
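
To make the "don't worry about it" practice concrete, here is a minimal
sketch of how a publisher's tooling might behave under option 0. The
names KNOWN_FORMATS and distribution_triples, the example.org URIs, and
the contents of the format list are all hypothetical; the DCAT namespace
shown is assumed to be the W3C one. Nothing here is agreed by the group.

```python
# Hypothetical sketch: emit N-Triples-style lines for a dcat:Distribution,
# stating dcterms:format only when the format is on a known list.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DISTRIBUTION = "http://www.w3.org/ns/dcat#Distribution"  # assumed namespace
DCT_FORMAT = "http://purl.org/dc/terms/format"

# Assumed list of widely recognized formats (illustrative only).
KNOWN_FORMATS = {"text/csv", "application/rdf+xml", "application/pdf"}


def escape_literal(value):
    """Escape backslashes and double quotes only (enough for this sketch)."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def distribution_triples(uri, fmt=None):
    """Return triples for one distribution; omit the format if it is unknown."""
    lines = [f"<{uri}> <{RDF_TYPE}> <{DCAT_DISTRIBUTION}> ."]
    if fmt in KNOWN_FORMATS:
        lines.append(f'<{uri}> <{DCT_FORMAT}> "{escape_literal(fmt)}" .')
    # Otherwise say nothing about the format; a triple can be added later.
    return lines


if __name__ == "__main__":
    for line in distribution_triples("http://example.org/dataset/1/csv", "text/csv"):
        print(line)
    for line in distribution_triples("http://example.org/dataset/1/shp", "shapefile"):
        print(line)
```

The second call emits only the type triple, so no unintended statement
is made about the unrecognized format, and the dataset description can
be extended with one more triple once that format is recognized.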

-Sarven
Received on Friday, 10 February 2012 14:39:46 UTC
