
Re: dcat:accessURL issue

From: John Erickson <olyerickson@gmail.com>
Date: Thu, 31 Jan 2013 09:17:04 -0500
Message-ID: <CAC1Gg8QCujo-ZWuRCz24bgroCO_26L4DkoTGFr_X16XJKkooYw@mail.gmail.com>
To: Makx Dekkers <mail@makxdekkers.com>
Cc: fadi.maali@deri.org, Richard Cyganiak <richard@cyganiak.de>, Public GLD WG <public-gld-wg@w3.org>
+1 to dealing with dcat:accessURL on the call today; my team has
already expressed to Fadi specific concerns we have.

Our concerns are in the "other" direction, however. In our experience
automatically processing government data sites that describe datasets
with DCAT, we are having difficulty with the loose interpretation and
use of dcat:accessURL. Very simply, the description "...points to the
location of a distribution. This can be a direct download link, a link
to an HTML page containing a link to the actual data, Feed, Web
Service etc. the semantic is determined by its domain (Distribution,
Feed, WebService, Download)..." is too loose.

The problem is seen on Fadi's excellent examples page:
http://www.w3.org/2011/gld/wiki/Dcat_examples

Look at Examples 1 ("A dataset available as a downloadable file (a CSV
file for example)") and 2 ("A dataset that is available through some
web page"); the coding is exactly the same. This sort of coding is
breaking deployed systems that depend on both the primary
(dcat:accessURL) and secondary (dcat:format) terms.

I think we would be happy with a clear recommendation on how to
disambiguate Example 1 from Example 2. The problem is that Example 2
is trying to refer to the downloadable file via a link somewhere on
the page, but that's not how executable code sees it ;)
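
To illustrate how a harvester might see the two examples, here is a
minimal sketch in Python. The function name, the return labels, and the
assumption that the declared format is a plain media type such as
"text/csv" are hypothetical and not part of DCAT. The sketch only
compares the declared format with the Content-Type that dereferencing
the dcat:accessURL would return:

```python
from typing import Optional

# Media types that indicate an HTML page rather than the data itself.
HTML_MEDIA_TYPES = {"text/html", "application/xhtml+xml"}


def classify_access_url(declared_format: str, content_type: Optional[str]) -> str:
    """Classify what a dcat:accessURL returned (hypothetical helper).

    declared_format: the format stated in the dataset description,
        assumed here to be a media type such as "text/csv".
    content_type: the Content-Type header of the HTTP response,
        or None if the server sent none.
    """
    if not content_type:
        return "unknown"

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    declared = declared_format.strip().lower()

    if media_type == declared:
        # Example 1: the URL yields the file in the declared format.
        return "direct-download"
    if media_type in HTML_MEDIA_TYPES:
        # Example 2: the URL yields a web page that (perhaps) links to the data.
        return "landing-page"
    return "mismatch"


if __name__ == "__main__":
    print(classify_access_url("text/csv", "text/csv; charset=utf-8"))
    print(classify_access_url("text/csv", "text/html; charset=UTF-8"))
```

With identical descriptions for both examples, a consumer can only
discover the difference after fetching the URL, which is exactly the
ambiguity a clear recommendation would remove.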

John

On Thu, Jan 31, 2013 at 8:57 AM, Makx Dekkers <mail@makxdekkers.com> wrote:
>
> Dear Fadi, Richard,
>
>
>
> In preparation for the DCAT call this afternoon, I wanted to raise an issue concerning the property dcat:accessURL.
>
>
>
> First of all, it was pointed out to me that there is a difference in definition between the latest editor’s draft [1] (no header, no date) and the Turtle spec [2]. In [1] the range is rdfs:Literal and in [2] it is rdfs:Resource.
>
>
>
> Now, if I am allowed to express my opinion here, I think an accessURL can only be an rdfs:Literal and not an rdfs:Resource, even if it looks like a URI.
>
>
>
> The main problem that I see is that an accessURL is a string of characters that happens to be a URL. However, the URL is not the thing it points to. In a way, on the theoretical level, saying that the URL is the resource would be equivalent to saying that the range of a name is a resource if it happens to look like a URL.
>
>
>
> But on a practical level, while I can say:
>
>
>
> “The foaf:name ‘Makx Dekkers’ contains 12 characters”,
>
>
>
> if I were to say
>
>
>
> “the dcat:accessURL ‘http://t.co/xyz’ contains 15 characters”
>
>
>
> it would mean that this *string* has 15 characters if it is declared as an rdfs:Literal, but it would mean that the *document* at that URL is 15 characters long if it is declared as an rdfs:Resource.
>
>
>
> Even worse, if it is defined as a resource, I would not be able to make a statement like:
>
>
>
> “the dcat:accessURL ‘http://t.co/xyz’ is a valid URI” (unless of course the *document* contains text that is a valid URI).
>
>
>
> The second problem is that the definition of accessURL as resource seems to use URL and URI interchangeably. While I agree that it is true that every URL is a valid URI (by definition), the converse is not true. I read in RFC 3986:
>
>
>
> The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").
>
>
>
> In this sense, I think the value of dcat:accessURL is not always a URI as the issue listed just above the definition of accessURL in https://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html states.
>
>
>
> Makx.
>
> [1] https://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html
>
> [2] http://www.w3.org/ns/dcat.ttl
>
>
>
>
>
> Makx Dekkers
>
> makx@makxdekkers.com
>
> +34 639 26 11 46
>
>
>
>




--
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
<http://tw.rpi.edu> <olyerickson@gmail.com>
Twitter & Skype: olyerickson
Received on Thursday, 31 January 2013 14:17:35 UTC
