
ACTION-155 Review domains and ranges as implied/stated by the spec and encoded in the schema

From: Phil Archer <phila@w3.org>
Date: Fri, 22 Nov 2013 11:51:02 +0000
Message-ID: <528F4526.4060909@w3.org>
To: Public GLD WG <public-gld-wg@w3.org>

I took an action on yesterday's call to review domain and range 
statements for DCAT properties. This was sparked by a comment by Luke 
Blaney [1] who said "... I found the inclusion of rdfs:domain on 
Properties quite inconsistent.  In my view, all rdfs:Properties should 
have rdfs:domain and rdfs:range specified."

Let's see.

dcat:theme
==========
is defined as a sub property of dcterms:subject and has a range of 
skos:Concept. The comment is "The main category of the dataset. A 
dataset can have multiple themes." and the usage note says: "The set of 
skos:Concepts used to categorize the datasets are organized in a 
skos:ConceptScheme describing all the categories and their relations in 
the catalog."

So we have an unambiguous range of skos:Concept.

Domain? It has always seemed odd to me that SKOS doesn't include a 
property like this (no doubt there were reasons) - bottom line - this 
looks like a property that could be useful for linking a class other 
than a dcat:Dataset to a skos:Concept.

Recommendation: leave domain undefined.

dcat:keyword
============
Comment: "A keyword or tag describing the dataset."
(No usage note)
Range: rdfs:Literal

Domain is undefined. My immediate thought is that this is a very useful 
little property in any number of circumstances within and outwith a data 
catalogue and so we should leave the domain undefined.

*However* the definition clearly says "it's for describing the dataset" 
and that will make some people think it's not for them when describing 
something like a PDF. Now, we know that's not the case - we have defined 
dcat:Dataset as "A collection of data, published or curated by a single 
source, and available for access or download in one or more formats" and 
we agreed that this is about as close to the definition of "anything 
digital" as makes little difference - but the perception would be that 
dcat:keyword is not usable outside a catalogue when actually it is.

Recommendation: leave domain undefined.

dcat:contactPoint
=================
Comment: Links a dataset to relevant contact information which is 
provided using VCard.

The range is defined as v:VCard but domain is undefined.

VCard has been updated recently [2] with the VCard class being 
deprecated in favour of v:Kind - except that old and new are declared 
to be equivalent classes. That means that we could leave the definition 
the same and change the range to v:Kind - I have no strong feeling 
either way but tend towards changing it to v:Kind.
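
To make the equivalence argument concrete, here is a minimal sketch in plain Python (no RDF library) of what a reasoner would conclude if the range were changed to v:Kind. The triples are simple tuples of prefixed names, the ex: identifiers are hypothetical, and the function is only an illustration of rdfs:range entailment combined with owl:equivalentClass, not a full reasoner.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
# The ex: names are hypothetical; the rest come from this message.
SCHEMA = {
    ("dcat:contactPoint", "rdfs:range", "v:Kind"),
    ("v:VCard", "owl:equivalentClass", "v:Kind"),
}
DATA = {
    ("ex:dataset1", "dcat:contactPoint", "ex:helpdesk"),
}


def infer_object_types(schema, data):
    """Return (node, class) pairs implied by rdfs:range and owl:equivalentClass."""
    ranges = {(s, o) for s, p, o in schema if p == "rdfs:range"}
    # Equivalence is symmetric, so record each pair in both directions.
    equivalent = set()
    for s, p, o in schema:
        if p == "owl:equivalentClass":
            equivalent.update({(s, o), (o, s)})

    # Range entailment: the object of the property is an instance of the range.
    types = {(o, cls) for s, p, o in data for prop, cls in ranges if p == prop}

    # Propagate types across equivalent classes until nothing new appears.
    while True:
        extra = {(node, b) for node, cls in types for a, b in equivalent if cls == a}
        if extra <= types:
            return types
        types |= extra


for node, cls in sorted(infer_object_types(SCHEMA, DATA)):
    print(node, "a", cls)
```

Under these assumptions the contact resource comes out typed as both v:Kind and v:VCard, which is why the choice between the two range values makes no difference to what can be inferred.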

The domain is undefined and I would say that the same arguments apply 
here as for dcat:keyword. Defining the domain as dcat:Dataset actually 
wouldn't restrict the usage but might *appear* to do so in a way that is 
probably unhelpful.
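
A small sketch of why a domain statement does not restrict usage: under RDFS, rdfs:domain is an entailment rule, not a validation constraint. The code below is plain Python over tuples, the schema statement is hypothetical (it is not in the DCAT schema), and the ex: names are invented for illustration.

```python
def apply_domain(schema, data):
    """Return the rdf:type triples implied by rdfs:domain statements."""
    domains = {(s, o) for s, p, o in schema if p == "rdfs:domain"}
    return {
        (s, "rdf:type", cls)
        for s, p, o in data
        for prop, cls in domains
        if p == prop
    }


# Hypothetical schema statement: NOT in the current DCAT schema.
WITH_DOMAIN = {("dcat:contactPoint", "rdfs:domain", "dcat:Dataset")}
# Hypothetical data: a PDF report described with dcat:contactPoint.
PDF_DATA = {("ex:report.pdf", "dcat:contactPoint", "ex:helpdesk")}

inferred = apply_domain(WITH_DOMAIN, PDF_DATA)   # domain declared
no_domain = apply_domain(set(), PDF_DATA)        # domain left undefined

print(inferred)
print(no_domain)
```

With the domain declared, nothing is rejected; the PDF is simply inferred to be a dcat:Dataset. With the domain left undefined, nothing is inferred at all.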

Recommendation: Update range to v:Kind and leave domain undefined.

dcat:accessURL, dcat:downloadURL
================================

accessURL definition: Could be any kind of URL that gives access to a 
distribution of the dataset. E.g. landing page, download, feed URL, 
SPARQL endpoint. Use when your catalog does not have information on 
which it is or when it is definitely not a download.

downloadURL definition: This is a direct link to a downloadable file in 
a given format. E.g. CSV file or RDF file. The format is described by 
the distribution's dc:format and/or dcat:mediaType.

The range is currently defined as rdfs:Resource for both which seems 
sensible, but there's no domain defined and I'm struggling to think of a 
use case where either would not refer to a dcat:Distribution. In the 
absence of that it seems to me that the domain for both properties 
should be defined as dcat:Distribution.

Recommendation: define the domain of dcat:accessURL and dcat:downloadURL 
as dcat:Distribution.

N.B. This is orthogonal to the resolution to adopt Dave's second 
suggestion on how to handle Luke's comments on these two properties.

dcat:byteSize & dcat:mediaType
==============================

dcat:byteSize definition: The size of a distribution in bytes.
Usage Note: The size in bytes can be approximated when the precise size 
is not known. The literal value of dcat:byteSize should be typed as 
xsd:decimal.

dcat:mediaType definition: This property SHOULD be used when the media 
type of the distribution is defined in IANA, otherwise dct:format MAY be 
used with different values.

Both of these have defined ranges but no defined domain. Either could be 
used in contexts other than a catalogue, and the definition of a 
Distribution, i.e. "Represents a specific available form of a dataset. 
Each dataset might be available in different forms, these forms might 
represent different formats of the dataset or different endpoints. 
Examples of distributions include a downloadable CSV file, an API or an 
RSS feed" - seems more restrictive than we might wish for these two.

Recommendation: leave domain undefined for these two.

For the sake of completeness, I'll list the properties for which both 
domain and range are already specified:

dcat:themeTaxonomy
dcat:dataset
dcat:record
dcat:distribution
dcat:landingPage
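
As a summary, the review above can be tabulated in a few lines of Python. This is only a sketch: the table records whether a domain and a range are stated, as reported in this message, not the actual class values (several of which are not quoted here), and the helper names are made up for illustration.

```python
# (has_domain, has_range) for each property, as reported in this review.
CURRENT = {
    "dcat:theme": (False, True),
    "dcat:keyword": (False, True),
    "dcat:contactPoint": (False, True),
    "dcat:accessURL": (False, True),
    "dcat:downloadURL": (False, True),
    "dcat:byteSize": (False, True),
    "dcat:mediaType": (False, True),
    "dcat:themeTaxonomy": (True, True),
    "dcat:dataset": (True, True),
    "dcat:record": (True, True),
    "dcat:distribution": (True, True),
    "dcat:landingPage": (True, True),
}

# Domains this review recommends adding.
RECOMMENDED_DOMAINS = {
    "dcat:accessURL": "dcat:Distribution",
    "dcat:downloadURL": "dcat:Distribution",
}


def without_domain(schema):
    """List the properties that have no rdfs:domain, in sorted order."""
    return sorted(p for p, (has_domain, _) in schema.items() if not has_domain)


def apply_recommendations(schema, new_domains):
    """Return a copy of the table with the recommended domains marked as defined."""
    return {p: (d or p in new_domains, r) for p, (d, r) in schema.items()}


print("No domain today:", without_domain(CURRENT))
print("No domain after:", without_domain(apply_recommendations(CURRENT, RECOMMENDED_DOMAINS)))
```

If the recommendations are accepted, five properties would still have no domain, deliberately: dcat:theme, dcat:keyword, dcat:contactPoint, dcat:byteSize and dcat:mediaType.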


[1] http://lists.w3.org/Archives/Public/public-gld-comments/2013Nov/0017.html

[2] http://www.w3.org/TR/2013/WD-vcard-rdf-20130924/

-- 

Phil Archer
W3C eGovernment

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Friday, 22 November 2013 11:51:34 UTC
