
Re: [dcat] Tomorrow's agenda

From: Ed Summers <ehs@pobox.com>
Date: Thu, 27 May 2010 05:29:42 -0400
Message-ID: <AANLkTilHwCNWEfRZhMASKMe0v_heojj95Uw3gGS0tDf5@mail.gmail.com>
To: Richard Cyganiak <richard@cyganiak.de>
Cc: public-egov-ig IG <public-egov-ig@w3.org>
Unfortunately I've got another meeting conflict this week, so regrets
again for today. I've included some notes below, which may or may not
be of some use.

On Wed, May 26, 2010 at 6:49 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> == Use Cases and Requirements ==
>
> * See
> http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Use_Cases_and_Requirements
> * Can we declare this finished?

I think it looks good, nice work!

> == Vocabulary Reference ==
>
> * See
> http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Vocabulary_Reference
> * Review current state

I wonder if it is worthwhile acknowledging (at least to ourselves)
that the ranges of dct:publisher, dct:accrualPeriodicity, dct:spatial,
dct:temporal, dcat:granularity, and dcat:theme could be at odds with the
Simple Transformation From Existing Catalog Data requirement [1]. For
example, a dataset publisher may know that the dataset is about
"Berlin, Germany" ... but they would have some work to do to figure
out what URI to use with dct:spatial. Similarly, they may know that a
dataset is published by the National Aeronautics and Space
Administration, but they will have to do some work to use a
linked-data-friendly URI like <http://dbpedia.org/resource/NASA>.
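
To make the gap concrete, here is a rough sketch of what a catalog
usually has on hand versus what the current ranges seem to ask for. The
ex: namespace, the dataset names, and the dcat: namespace URI are
assumptions made up for illustration, and the DBpedia URI for Berlin is
just one plausible choice:

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .      # assumed namespace URI
@prefix ex:   <http://example.gov/catalog/> .     # hypothetical catalog

# What an existing catalog typically stores: plain strings.
# Kept as comments, since literals fall outside the proposed ranges.
#   ex:dataset1 dct:spatial   "Berlin, Germany" .
#   ex:dataset2 dct:publisher "National Aeronautics and Space Administration" .

# What the proposed ranges call for: resources named by URIs,
# which someone has to look up or mint first.
ex:dataset1 a dcat:Dataset ;
    dct:spatial <http://dbpedia.org/resource/Berlin> .

ex:dataset2 a dcat:Dataset ;
    dct:publisher <http://dbpedia.org/resource/NASA> .
```

The transformation is only "simple" if that string-to-URI lookup
already exists somewhere.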

Should the range for dct:license be dct:LicenseDocument (as specified
in the dcterms vocabulary) instead of rdfs:Resource? Also, I was
wondering if it might be appropriate to use foaf:Document instead of
rdfs:Resource as the range of dcat:dataDictionary. We're talking
about referencing an actual web document (aka information resource),
right?
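
In case it helps, this is roughly what I have in mind, written as a
sketch rather than a proposal for exact wording. The ex: resources and
the dictionary URL are hypothetical, and the dcat: namespace URI is an
assumption:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .      # assumed namespace URI
@prefix ex:   <http://example.gov/catalog/> .     # hypothetical catalog

# Suggested range for the data dictionary property.
dcat:dataDictionary rdfs:range foaf:Document .

# Instance data that would fit the narrower ranges.
ex:dataset1 a dcat:Dataset ;
    dct:license ex:license1 ;
    dcat:dataDictionary <http://example.gov/docs/dataset1-dictionary> .

ex:license1 a dct:LicenseDocument .

<http://example.gov/docs/dataset1-dictionary> a foaf:Document .
```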

I must admit I am a little bit perplexed by the use of dcat:accessUrl
to describe a dcat:Download. The usage note indicates that:

"""
accessUrl of the Download distribution should be a direct download
link (a one-click access to the data file).
"""

It makes me wonder if we should instead be recommending that the URI
for the dcat:Download be the actual URI for the download. So for
example:

  ex:dataset1 a dcat:Dataset ;
      dcat:distribution ex:download1 .

  ex:download1 a dcat:Download ;
      dcat:accessURL  <http://example.gov/downloads/1> ;
      dct:format "text/csv" .

would become:

  ex:dataset1 a dcat:Dataset ;
      dcat:distribution <http://example.gov/downloads/1> .

  <http://example.gov/downloads/1> dct:format "text/csv" .

See how the intermediary resource (probably a blank node in practice)
goes away? I think the same could be said of dcat:Feed. To some extent
I'm not convinced that dcat:Distribution and its subclasses are really
necessary. An alternate approach would be to have two different
properties for linking a dcat:Dataset to a web resource:
dcat:download (for direct download) and dcat:downloadInfo (or
something else, for going to a page that describes how to
download) ... and let the rest be handled by media types and web
architecture. To some extent you can see this principle at work in the
notes currently in Catalog Record:

"""
@@@ in web-based catalogs, the URL of the catalog page should be used
as URI for the catalog record if it is a permalink.
"""
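
Going back to the two-property alternative, a minimal sketch might look
like the following. dcat:download and dcat:downloadInfo are only the
names floated above, not existing terms, and the URLs and the dcat:
namespace URI are made up for illustration:

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .      # assumed namespace URI
@prefix ex:   <http://example.gov/catalog/> .     # hypothetical catalog

# One-click access: the object is the data file itself.
ex:dataset1 a dcat:Dataset ;
    dcat:download <http://example.gov/downloads/1> .

# Indirect access: the object is a page explaining how to get the data.
ex:dataset2 a dcat:Dataset ;
    dcat:downloadInfo <http://example.gov/datasets/2/how-to-download> .

# Format can still be asserted directly on the file, if desired;
# otherwise the Content-Type returned by the server carries it.
<http://example.gov/downloads/1> dct:format "text/csv" .
```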

Speaking of the linking properties, it appears that the vocab wiki
page [2] is missing definitions for dcat:dataset, dcat:distribution
and dcat:record, which tie together instances of dcat:Catalog,
dcat:CatalogRecord and dcat:Distribution.
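
For what it's worth, here is how I have been assuming those three
properties connect things. The domains and ranges below are my guess,
since the wiki page does not define them yet, and the ex: names and the
dcat: namespace URI are hypothetical:

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .      # assumed namespace URI
@prefix ex:   <http://example.gov/catalog/> .     # hypothetical catalog

# Guess: a catalog points at its datasets and at its records.
ex:catalog a dcat:Catalog ;
    dcat:dataset ex:dataset1 ;
    dcat:record  ex:record1 .

ex:record1 a dcat:CatalogRecord .

# Guess: a dataset points at each of its distributions.
ex:dataset1 a dcat:Dataset ;
    dcat:distribution ex:download1 .

ex:download1 a dcat:Download .
```

How a dcat:CatalogRecord is tied back to the dataset it describes is
left out here, because I could not tell from the current draft.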

Thanks for reading this far :-)

//Ed

[1] http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Use_Cases_and_Requirements#Simple_transformation_from_existing_catalog_data
[2] http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Vocabulary_Reference
Received on Thursday, 27 May 2010 09:30:21 GMT
