
Re: ACTION-155 Review domains and ranges as implied/stated by the spec and encoded in the schema

From: Phil Archer <phila@w3.org>
Date: Thu, 28 Nov 2013 10:52:50 +0000
Message-ID: <52972082.7030006@w3.org>
To: public-gld-wg@w3.org
Following this discussion and the clear consensus of the WG, I have 
aligned the domain definitions to the TTL file as discussed. Specifically,

dcat:theme, dcat:keyword, dcat:contactPoint all now have a domain of 
dcat:Dataset.

dcat:accessURL, dcat:downloadURL, dcat:byteSize & dcat:mediaType have a 
declared domain of dcat:Distribution.

This means that all properties in the schema now have fully defined 
domains and ranges.

IMO, these changes clarify the existing semantics but do not change them 
substantively. No text has been changed.
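
For anyone unsure what a declared domain means in practice: under RDFS a 
domain is an inference rule, not a usage constraint. Using a property on 
a resource entails that the resource is a member of the domain class. 
Below is a minimal sketch in Python, assuming a hand-rolled list of 
triples rather than a real RDF library. The ex: resources and the 
infer_types helper are hypothetical and only there for illustration; the 
domain table covers just the three dcat:Dataset properties named above.

```python
# Minimal sketch of the RDFS domain entailment rule (rdfs2):
#   if  P rdfs:domain C  and  S P O  then  S rdf:type C
# The ex: resources below are hypothetical examples, not real data.

DOMAINS = {
    "dcat:theme": "dcat:Dataset",
    "dcat:keyword": "dcat:Dataset",
    "dcat:contactPoint": "dcat:Dataset",
}


def infer_types(triples, domains=DOMAINS):
    """Return the set of (subject, class) pairs entailed by rdfs:domain."""
    return {(s, domains[p]) for (s, p, o) in triples if p in domains}


triples = [
    ("ex:painting42", "dcat:keyword", "medieval"),   # entails a type
    ("ex:painting42", "dcterms:title", "Untitled"),  # no domain in table
    ("ex:dataset7", "dcat:theme", "ex:transport"),   # entails a type
]

inferred = infer_types(triples)
for subject, cls in sorted(inferred):
    print(f"{subject} rdf:type {cls}")
```

Nothing here rejects the use of dcat:keyword on ex:painting42; the 
declaration simply lets a reasoner conclude that ex:painting42 is a 
dcat:Dataset.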

HTH

Phil.



On 22/11/2013 16:38, Phil Archer wrote:
> Thanks everyone, that's
>
> a) helpful
> b) clear
>
> I will act on that in the schema in the coming days.
>
> Phil.
>
> On 22/11/2013 15:59, Dave Reynolds wrote:
>> Thanks to Phil for the helpful review.
>>
>> +1 to Richard's comments.
>>
>> While it's true that specifically :theme, :keyword and :contactPoint
>> could be made general purpose concepts, dcat doesn't seem like the right
>> place to do that.
>>
>> Dave
>>
>> On 22/11/13 15:39, Richard Cyganiak wrote:
>>> Phil,
>>>
>>> 1. I object to the current situation where we have properties with
>>> implicitly stated domains in prose definitions and undefined domains
>>> in the RDFS.
>>>
>>> If the definition says “The XXX of the dataset”, then that’s an
>>> implicit domain declaration. In that case, either the explicit RDFS
>>> definition has to agree with the prose text (in other words, formally
>>> add a domain of dcat:Dataset), or the definition has to be changed (in
>>> other words, change the definition to “The XXX of the
>>> resource/entity/whatever”).
>>>
>>> I will accept either fix, but I see no way to intellectually justify
>>> semantic doublespeak.
>>>
>>>
>>> 2. Even though I will accept either way of fixing the situation, I
>>> would like to put the following on the record, so that I can point to
>>> it when people will inevitably complain about DCAT’s lack of domain
>>> declarations in the future:
>>>
>>> You want to leave domains in DCAT undeclared because “the properties
>>> might be useful in other contexts”. This reasoning seems flawed to me.
>>> Vocabularies have scope and purpose. The scope of DCAT is data
>>> catalogs. Hence the name: “Data Catalog Vocabulary”. The problem we
>>> need to solve is representing data catalogs as RDF, not fixing
>>> perceived omissions in SKOS. That would be a different WG with a
>>> different charter and different skills and backgrounds present at the
>>> table.
>>>
>>> If I was, let’s say, managing metadata about medieval paintings and
>>> would like to add index terms to the metadata, the suggestion that
>>> there’s a handy little property in the “data catalog vocabulary” for
>>> that purpose does more harm than good. People who do metadata for
>>> medieval paintings should not be expected to look at vocabularies for
>>> data catalogs. That just makes their life harder.
>>>
>>> If you disagree with this, then explain to me why DCAT doesn’t re-use
>>> the mpv:theme property, which would be perfectly suitable here if we
>>> only consider its formal RDFS definition. MPV is the Medieval
>>> Paintings Vocabulary [1].
>>>
>>> Best,
>>> Richard
>>>
>>>
>>> [1] It’s fictional, but let’s assume for the sake of argument that it
>>> exists, is well-designed, well-documented, well-maintained,
>>> well-established in the relevant community, and you’ve never heard of
>>> it before.
>>>
>>>
>>> On 22 Nov 2013, at 11:51, Phil Archer <phila@w3.org> wrote:
>>>
>>>> I took an action on yesterday's call to review domain and range
>>>> statements for DCAT properties. This was sparked by a comment by Luke
>>>> Blaney [1] who said "... I found the inclusion of rdfs:domain on
>>>> Properties quite inconsistent.  In my view, all rdfs:Properties
>>>> should have rdfs:domain and rdfs:range specified."
>>>>
>>>> Let's see.
>>>>
>>>> dcat:theme
>>>> ==========
>>>> is defined as a sub property of dcterms:subject and has a range of
>>>> skos:Concept. The comment is "The main category of the dataset. A
>>>> dataset can have multiple themes." and the usage note says: "The set
>>>> of skos:Concepts used to categorize the datasets are organized in a
>>>> skos:ConceptScheme describing all the categories and their relations
>>>> in the catalog."
>>>>
>>>> So we have an unambiguous range of skos:Concept.
>>>>
>>>> Domain? It has always seemed odd to me that SKOS doesn't include a
>>>> property like this (no doubt there were reasons) - bottom line - this
>>>> looks like a property that could be useful for linking a class other
>>>> than a dcat:Dataset to a skos:Concept.
>>>>
>>>> Recommendation: - leave domain undefined.
>>>>
>>>> dcat:keyword
>>>> ============
>>>> Comment: "A keyword or tag describing the dataset."
>>>> (No usage note)
>>>> Range: rdfs:Literal
>>>>
>>>> Domain is undefined. My immediate thought is that this is a very
>>>> useful little property in any number of circumstances within and
>>>> outwith a data catalogue and so we should leave the domain undefined.
>>>>
>>>> *However* the definition clearly says "it's for describing the
>>>> dataset" and that will make some people think it's not for them when
>>>> describing something like a PDF. Now, we know that's not the case -
>>>> we have defined dcat:Dataset as "A collection of data, published or
>>>> curated by a single source, and available for access or download in
>>>> one or more formats" and we agreed that this is about as close to the
>>>> definition of "anything digital" as makes little difference - but the
>>>> perception would be that dcat:keyword is not usable outside a
>>>> catalogue when actually it is.
>>>>
>>>> Recommendation: - leave domain undefined.
>>>>
>>>> dcat:contactPoint
>>>> =================
>>>> Comment: Links a dataset to relevant contact information which is
>>>> provided using VCard.
>>>>
>>>> The range is defined as v:VCard but domain is undefined.
>>>>
>>>> VCard has been updated recently [2] with the VCard class being
>>>> deprecated in favour of v:Kind - except that old and new are declared
>>>> being equivalent classes. That means that we could leave the
>>>> definition the same and change the range to v:Kind - I have no strong
>>>> feeling either way but tend towards changing it to v:Kind.
>>>>
>>>> The domain is undefined and I would say that the same arguments apply
>>>> here as for dcat:keyword. Defining the domain as dcat:Dataset
>>>> actually wouldn't restrict the usage but might *appear* to do so in a
>>>> way that is probably unhelpful.
>>>>
>>>> Recommendation: Update range to v:Kind and leave domain undefined.
>>>>
>>>> dcat:accessURL, dcat:downloadURL
>>>> ================================
>>>>
>>>> accessURL definition: Could be any kind of URL that gives access to a
>>>> distribution of the dataset. E.g. landing page, download, feed URL,
>>>> SPARQL endpoint. Use when your catalog does not have information on
>>>> which it is or when it is definitely not a download.
>>>>
>>>> downloadURL definition: This is a direct link to a downloadable file
>>>> in a given format. E.g. CSV file or RDF file. The format is described
>>>> by the distribution's dc:format and/or dcat:mediaType
>>>>
>>>> The range is currently defined as rdfs:Resource for both which seems
>>>> sensible, but there's no domain defined and I'm struggling to think
>>>> of a use case where either would not refer to a dcat:Distribution. In
>>>> the absence of that it seems to me that the domain for both
>>>> properties should be defined as dcat:Distribution.
>>>>
>>>> Recommendation: define the domain of dcat:accessURL and
>>>> dcat:downloadURL as dcat:Distribution
>>>>
>>>> N.B. This is orthogonal to the resolution to adopt Dave's second
>>>> suggestion on how to handle Luke's comments on these two properties.
>>>>
>>>> dcat:byteSize & dcat:mediaType
>>>> ==============================
>>>>
>>>> dcat:byteSize definition: The size of a distribution in bytes.
>>>> Usage Note: The size in bytes can be approximated when the precise
>>>> size is not known. The literal value of dcat:byteSize should be typed
>>>> as xsd:decimal
>>>>
>>>> dcat:mediaType definition: This property SHOULD be used when the media
>>>> type of the distribution is defined in IANA, otherwise dct:format MAY
>>>> be used with different values.
>>>>
>>>> Both of these have defined ranges but no defined domain. Either could
>>>> be used in contexts other than a catalogue and the definition of a
>>>> Distribution, i.e. "Represents a specific available form of a
>>>> dataset. Each dataset might be available in different forms, these
>>>> forms might represent different formats of the dataset or different
>>>> endpoints. Examples of distributions include a downloadable CSV file,
>>>> an API or an RSS feed" - seems more restrictive than we might wish
>>>> for these two.
>>>>
>>>> Recommendation: leave domain undefined for these two.
>>>>
>>>> For the sake of completeness, I'll list the properties for which both
>>>> domain and range are already specified:
>>>>
>>>> dcat:themeTaxonomy
>>>> dcat:dataset
>>>> dcat:record
>>>> dcat:distribution
>>>> dcat:landingPage
>>>>
>>>>
>>>> [1]
>>>> http://lists.w3.org/Archives/Public/public-gld-comments/2013Nov/0017.html
>>>>
>>>>
>>>>
>>>> [2] http://www.w3.org/TR/2013/WD-vcard-rdf-20130924/
>>>>
>>>> --
>>>>
>>>> Phil Archer
>>>> W3C eGovernment
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>
>>>
>>
>>
>>
>

-- 

Phil Archer
W3C eGovernment

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Thursday, 28 November 2013 10:53:21 UTC
