Conflicts between Cube and DCAT in metadata properties (was: Re: AW: AW: [QB] Last Call document draft)

Benedikt, Dave,

DCAT and Data Cube unfortunately disagree somewhat on the metadata properties that are to be used. I think it is still worth fixing those disagreements now.

On 8 Mar 2013, at 11:14, Benedikt Kaempgen wrote:
>> Ah, good point. Of course that section was originally written before
>> DCAT. I'm not sufficiently familiar with DCAT to suggest what to put
>> here. DCAT seems like it would be more used to describe a catalogue
>> entry that would reference a Data Cube rather than a Data Cube
>> itself.

DCAT has both dcat:CatalogRecord and dcat:Dataset. The latter is meant to represent the dataset itself.

>> Richard - if you feel there is a useful reference to DCAT to be made
>> here I'm happy for you to make that change and will trust your
>> judgement on it.
> 
> Richard told me recently that in his opinion, a qb:DataSet is a dcat:Dataset. If so, we should either say so formally in the RDF and/or informally in prose.

Well, it is my view that qb:DataSet is a subclass of dcat:Dataset. But if someone thinks it better to have one more triple between the two classes that makes the relationship explicit, then I have no problem with that.
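
To illustrate what such an extra triple would buy, here is a minimal sketch in plain Python (no RDF library). It assumes the relationship were stated as rdfs:subClassOf, which neither spec currently does; the dataset ex:unemployment is made up for the example.

```python
# Triples as plain (subject, predicate, object) tuples of prefixed names.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("qb:DataSet", SUBCLASS_OF, "dcat:Dataset"),  # the one extra triple (assumed form)
    ("ex:unemployment", TYPE, "qb:DataSet"),      # hypothetical dataset
}

def infer_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triples appear."""
    result = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(result):
            if p != TYPE:
                continue
            for c, p2, d in list(result):
                if p2 == SUBCLASS_OF and c == o and (s, TYPE, d) not in result:
                    result.add((s, TYPE, d))
                    changed = True
    return result

inferred = infer_types(triples)
# True if the cube is also recognised as a DCAT dataset after inference.
print(("ex:unemployment", TYPE, "dcat:Dataset") in inferred)
```

With that triple in place, any RDFS-aware consumer of DCAT would pick up every qb:DataSet as a dcat:Dataset without publishers having to type it twice.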

The metadata properties recommended in both documents are somewhat different:

Data Cube                | DCAT
-------------------------+-------------------------
rdfs:label               | dc:title
rdfs:comment             | dc:description
dc:date                  | dc:issued, dc:modified
dc:subject->skos:Concept | dcat:theme->skos:Concept
dc:publisher->foaf:Agent | dc:publisher->foaf:Agent

Data Cube is a bit self-contradictory here: it says “We recommend use of Dublin Core Terms for representing the key metadata annotations commonly needed for DataSets.” but then uses rdfs:label and rdfs:comment in preference to DC, perhaps to be consistent with the many other places in Cube where these properties are used. Since a dataset can be thought of as an abstract "document", "work" or "publication", the use of DC seems more appropriate to me here.

Cube also implies that dc:date is the creation date, which is not quite correct. dc:date could be any date in the lifecycle of the resource.

Proposed steps for aligning the two specs:

- recommend dc:title/description on qb:DataSet in addition to rdfs:label/comment
- recommend dc:issued instead of dc:date for creation date in Data Cube
- add a note to the beginning of Data Cube Section 9 saying that other documents, such as DCAT, have additional recommendations for metadata properties
- make dcat:theme a subproperty of dc:subject in DCAT

The result would be that if you follow the DCAT recommendations, you end up with something that matches the Cube recommendations, except for the use of a subproperty in the case of dcat:theme vs dc:subject.
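
As a rough check of that claim, the sketch below describes a made-up dataset the DCAT way and tests it against the DC-based properties that Cube would recommend after the proposed changes. The list CUBE_RECOMMENDED is my reading of the steps above, not text from either spec, and the ex: names and literal values are hypothetical.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Proposed DCAT change: dcat:theme becomes a subproperty of dc:subject.
schema = {("dcat:theme", SUBPROPERTY_OF, "dc:subject")}

# A hypothetical dataset described with the DCAT-recommended properties.
data = {
    ("ex:ds", "dc:title", '"Unemployment by region"'),
    ("ex:ds", "dc:description", '"Quarterly figures"'),
    ("ex:ds", "dc:issued", '"2013-03-08"'),
    ("ex:ds", "dcat:theme", "ex:labour-market"),
    ("ex:ds", "dc:publisher", "ex:stats-office"),
}

# Assumed post-alignment Cube recommendations (DC-based properties only).
CUBE_RECOMMENDED = ["dc:title", "dc:description", "dc:issued",
                    "dc:subject", "dc:publisher"]

def with_superproperties(data, schema):
    """Add (s, super, o) for every (s, p, o) where p is a subproperty of super."""
    supers = {}
    for sub, pred, sup in schema:
        if pred == SUBPROPERTY_OF:
            supers.setdefault(sub, set()).add(sup)
    result = set(data)
    frontier = list(data)
    while frontier:
        s, p, o = frontier.pop()
        for sup in supers.get(p, ()):
            triple = (s, sup, o)
            if triple not in result:
                result.add(triple)
                frontier.append(triple)
    return result

def missing(subject, triples):
    """List the recommended properties that the subject does not use."""
    used = {p for s, p, o in triples if s == subject}
    return [p for p in CUBE_RECOMMENDED if p not in used]

print(missing("ex:ds", data))                                # ['dc:subject']
print(missing("ex:ds", with_superproperties(data, schema)))  # []
```

Without inference only dc:subject is missing; once the subproperty triple is applied, the DCAT description covers the whole list.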

Best,
Richard

Received on Friday, 8 March 2013 11:58:55 UTC