Re: DCAT - issues related to dcat:Distribution from John Erickson on 2012-03-22 (public-gld-wg@w3.org from March 2012)

From: John Erickson <olyerickson@gmail.com>
Date: Thu, 22 Mar 2012 07:12:08 -0400
To: "Maali, Fadi" <fadi.maali@deri.org>
Cc: phila@w3.org, public-gld-wg@w3.org
Message-ID: <CAC1Gg8SQcSLzo_cRRqQ5BuO_-2h1vLW7tgsNwNYQzGwCvsU-=w@mail.gmail.com>

Thanks for providing this summary, Fadi!

ISSUE 9: As I stated on last week's call, I believe that the
hierarchical dcat:Distribution model is necessary; we have examples
today, and it provides a path for future extension.

ISSUE 8: This one might be harder for some observers to understand,
but for those of us who have been building federated catalogs by
"scraping," the need to distinguish direct download is clear.
Automated systems (and the SPARQL queries powering them) need as much
help as they can get...
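
To make the point concrete, here is a minimal sketch of what a harvester gains from a separate direct-download property. It assumes the two-property option Fadi describes below (accessURL plus downloadURL, under the http://www.w3.org/ns/dcat# namespace); the distribution names and example.org URLs are made up, and plain Python tuples stand in for a real triple store.

```python
# Sketch only: tuples (subject, predicate, object) stand in for RDF triples.
# The downloadURL property is the proposal under discussion, assumed here.
DCAT = "http://www.w3.org/ns/dcat#"
ACCESS_URL = DCAT + "accessURL"
DOWNLOAD_URL = DCAT + "downloadURL"

# Hypothetical catalog data; every example.org URL is invented.
TRIPLES = [
    ("ex:dist1", ACCESS_URL, "http://example.org/data/landing.html"),
    ("ex:dist2", ACCESS_URL, "http://example.org/data/file.csv"),
    ("ex:dist2", DOWNLOAD_URL, "http://example.org/data/file.csv"),
]


def direct_downloads(triples):
    """Map each distribution that has a downloadURL to that URL."""
    return {s: o for s, p, o in triples if p == DOWNLOAD_URL}


def indirect_only(triples):
    """List distributions that have an accessURL but no downloadURL."""
    direct = direct_downloads(triples)
    return sorted({s for s, p, o in triples
                   if p == ACCESS_URL and s not in direct})


# Roughly the same selection as the SPARQL pattern
#   SELECT ?dist ?url WHERE { ?dist dcat:downloadURL ?url }
print(direct_downloads(TRIPLES))
print(indirect_only(TRIPLES))
```

With only accessURL, a scraper would have to fetch each URL and guess whether it got data or a landing page; with the second property, the selection is a single triple pattern.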

John

On Thu, Mar 22, 2012 at 6:22 AM, Maali, Fadi <fadi.maali@deri.org> wrote:
> Hi Phil, John, All,
>
> *Please pardon the long email!*
>
> I am summarizing below the discussions related to DCAT issues 7, 8 and 9, which are all related to dcat:Distribution.
> 1) ISSUE 7 - drop dcat:accessUrl: https://www.w3.org/2011/gld/track/issues/7
> Based on the discussion on the mailing list, dcat:accessUrl is essential for distributions that are not direct downloads, which is currently a very common case in catalogs. I suggest we close this issue with no action required.
> 2) ISSUE 9 - dcat:Distribution and its subclasses are unnecessary: https://www.w3.org/2011/gld/track/issues/9
> dcat:Distribution is needed at least for the same reason that makes dcat:accessUrl needed. However, the subclasses (currently Download, Feed, WebService) need further discussion. The main reason for defining them was to enable some further processing of data based on the distribution class.
> I am slightly inclined towards dropping the subclasses if we define a way to distinguish between distributions that are directly accessible (i.e. clicking on their accessUrl gives back the data) and those that are indirectly accessible (e.g. one has to click through a number of links on HTML pages to agree to a license or specify some search criteria). AFAIK, for both Feed and WebService, there exist vocabularies specialised in describing them.
> 3) ISSUE 8 - add a property to distinguish direct and indirect access of dcat:Distribution https://www.w3.org/2011/gld/track/issues/8
> This distinction looks important, but I am not sure that using dcterms:type along with two newly defined values, as suggested in the issue description, is the best approach.
> At a previous workshop (http://wiki.okfn.org/OpenDataCatalogues/2#Gold_Standard_Best_Practices_Proposal) I found a better suggestion for the same issue:
>
> "Perhaps: accessURL as the general property, and directURL with the same value if and only if we know that it's a direct download/endpoint URL."
>
> A third option would be to define two subclasses of dcat:Distribution (Direct and Indirect).
>
> I believe that having two properties (accessURL and downloadURL) is elegant, minimal and SPARQL-friendly :-)
>
> To summarize, the three issues boil down to the distinction between directly and indirectly accessible distributions. There are three suggestions for making this distinction.
>
> Unfortunately, I'll be travelling today and won't be able to attend the call.
>
> Best regards,
> Fadi
>
>



-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
<http://tw.rpi.edu> <olyerickson@gmail.com>
Twitter & Skype: olyerickson

Received on Thursday, 22 March 2012 11:12:41 UTC