Re: DCAT - issues related to dcat:Distribution

Hi Fadi,

> 1) ISSUE 7 - drop dcat:accessUrl: https://www.w3.org/2011/gld/track/issues/7
> based on the discussion on the mailing list dcat:accessUrl is essential for distributions that are not direct downloads which is currently a very common case on catalogs. I suggest we close this issue with no actions required.

+1 to close this issue.

> 2) ISSUE 9 - dcat:Distribution and its subclasses are unnecessary: https://www.w3.org/2011/gld/track/issues/9
> dcat:Distribution is needed at least for the same reason that makes dcat:accessUrl needed. However, the subclasses (currently Download, Feed, WebService) need further discussion. The main reason for defining them was to enable some further processing of data based on the distribution class.
> I am slightly inclined towards dropping the subclasses if we define a way to distinguish between distributions that are directly accessible (i.e. clicking on their accessUrl gives back the data) and those that are indirectly accessible (e.g. have to click through a number of links on HTML pages to agree on a license or specify some search criteria). AFAIK, for both Feed and WebService, there exist some vocabularies specialised in describing them.

I don't have a clear opinion on this, but here is a pragmatic point
of view. In our experience with catalogs, these subclasses are hard
to instantiate automatically. In most of the projects we have seen,
the subclasses were left out in order to simplify the system. The
person who enters datasets into the catalog already has to select
the format of the distribution. Using the subclasses means adding
another field to fill in, which is sometimes redundant (e.g.,
"application/soap+xml" -> "WebService" and
"application/atom+xml" -> "Feed"). On the other hand, I understand the
benefits and reasons for including them.
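
To illustrate the redundancy, here is a minimal Python sketch of how a
catalog could derive the subclass from the format field instead of asking
for it separately. The function name, the mapping table and the fallback
to "Download" are assumptions made for this example; only the two media
type pairs come from the cases mentioned above.

```python
# Hypothetical mapping from media type to distribution subclass.
# Only these two pairs are taken from the examples in this message.
SUBCLASS_BY_MEDIA_TYPE = {
    "application/soap+xml": "WebService",
    "application/atom+xml": "Feed",
}


def infer_subclass(media_type: str) -> str:
    """Guess the dcat:Distribution subclass from a media type string.

    Parameters such as "; charset=utf-8" are ignored. Falling back to
    "Download" for unknown types is an assumption of this sketch.
    """
    base_type = media_type.split(";", 1)[0].strip().lower()
    return SUBCLASS_BY_MEDIA_TYPE.get(base_type, "Download")
```

With something like this, the person entering the dataset only selects the
format, and the subclass is filled in when the mapping knows the type.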

> 3) ISSUE 8 - add a property to distinguish direct and indirect access of dcat:Distribution https://www.w3.org/2011/gld/track/issues/8
> this distinction looks important but not sure if using dcterms:type along with two newly defined values as suggested in the issue description is the best approach.
> at a previous workshop (http://wiki.okfn.org/OpenDataCatalogues/2#Gold_Standard_Best_Practices_Proposal) I found a suggestion for the same issue better:
>
> "Perhaps: accessURL as the general property, and directURL with the same value if and only if we know that it's a direct download/endpoint URL."
>
> A third option would be to define two subclasses of dcat:Distribution (Direct and InDirect).
>
> I believe that having two properties (accessURL and downloadURL) is elegant, minimal and SPARQL-friendly :-)
>

I prefer to define both subclasses, Direct and InDirect (or similar).
I don't like having a new property (downloadURL). One of the use cases
motivating this issue is, for instance, a zipped CSV file. In this
case, the objects of accessURL and downloadURL would be the same, and
there would be no way to state that this is an indirect access to the
data in CSV format.
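
A rough Python sketch of the zipped CSV case, only to make the difference
between the two options concrete. The Distribution record, its field names,
the two helper functions and the example.org URL are all hypothetical; this
is not how DCAT itself is expressed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Distribution:
    access_url: str
    media_type: str                      # format of the data itself
    download_url: Optional[str] = None   # option A: second property
    access_class: Optional[str] = None   # option B: "Direct" or "InDirect"


def is_direct_by_property(dist: Distribution) -> bool:
    # Option A: direct if downloadURL is present and equals accessURL.
    return dist.download_url is not None and dist.download_url == dist.access_url


def is_direct_by_class(dist: Distribution) -> bool:
    # Option B: direct only if the distribution is typed as Direct.
    return dist.access_class == "Direct"


# A CSV dataset published as a zip archive (hypothetical URL).
zipped_csv = Distribution(
    access_url="http://example.org/data.zip",
    media_type="text/csv",
    download_url="http://example.org/data.zip",
    access_class="InDirect",
)
```

Under these assumptions, the property-based check treats the zipped file as
direct access because both URLs match, while the class-based check can still
record that the CSV data is only reachable indirectly.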

Best regards,

Martin


-- 
Martin Alvarez Espinar
W3C Spain Office Manager        tel.:+34 984390616
http://www.w3c.es/Personal/Martin   mlvarez@w3.org

Received on Thursday, 22 March 2012 12:25:24 UTC