Re: Overlap between DCAT and ADMS from Phil Archer on 2012-10-25 (public-gld-wg@w3.org from October 2012)

From: Phil Archer <phila@w3.org>
Date: Thu, 25 Oct 2012 15:08:10 +0100
To: Richard Cyganiak <richard@cyganiak.de>
CC: "Gofran (GS)" <gofran.shukair@deri.org>, Public GLD WG <public-gld-wg@w3.org>
Message-ID: <508947CA.9010802@w3.org>
On 25/10/2012 10:34, Richard Cyganiak wrote:

[..]

>
> I will end by pointing out that you've pulled a bait-and-switch with us.

Not a term I'm familiar with but I'll go with it.

> You've submitted a DCAT extension for consideration to the WG. The WG
> agreed that it makes sense to take that on as a deliverable, as it
> seemed like a valuable addition that actually strengthens DCAT by
> extending its applicability. Much later in the WG's lifetime, you
> informed the WG that actually ADMS is no longer a DCAT extension but a
> completely separate -- and somewhat competing -- thing.

Sorry, that's simply not true. ADMS has not changed at all since 
coming under the WG's control. What has changed recently is that it's 
started to be discussed. Any issues coming to light now would have come 
to light whenever it was discussed. Happy to be shouted at for things 
for which I am responsible but not for this.

> I'm not sure if this is the result of clever political planning or of
> the fact that you're sitting between a number of chairs with W3C and the
> EC, but at any rate this has unfortunately made the WG's work
> considerably more difficult and is doing bad things to our schedule.

Being caught between a number of masters is the reality. That said, 
irrespective of the agreement made between W3C and the EC, the WG is 
sovereign and can decide whatever it likes. Yes, I'll argue for what is 
the best outcome from my point of view, but that doesn't make me unique 
around the table ;-)


>
> Is the commonality between DCAT, ADMS and ADMS.SW that they all define repositories of metadata records that describe things of some nature, in order to allow finding of things and aggregating/federating/harvesting of repositories?

Yes.

> And the difference is in the nature of the things described therein?
> Web-accessible datasets (DCAT), interoperability specifications (ADMS),
> and software thingies (ADMS.SW)? If that is so, then it may be possible
> to pull out the “repository” aspects into a separate technical piece.

I called it RADion.

> But that may exceed the WG scope by quite a bit and step on various
> toes including OAI-ORE, AtomOwl, SIOC, and ISO 11179.

Ack.

More to come no doubt...

Phil



>>
>>
>>
>> On 21/10/2012 17:29, Richard Cyganiak wrote:
>>> Hi Gofran, hi Phil,
>>>
>>> I think the fundamental problem here is that we have two specs that have a large overlap in scope, but neither is a subset of the other, and *probably* neither can be easily extended to cover the other without losing its focus.
>>>
>>> What is the overlap between both?
>>>
>>> Phil developed RADion in an attempt to “factor out” the overlap of both specs: repositories, assets, distributions. But I think that RADion fails to get to the essence of the overlap. It may be correct on the repositories and assets, but fails with the distributions, or at least has a conception of distribution that isn't sufficiently generic to properly cover DCAT.
>>>
>>> I think the overlap of DCAT and ADMS is that both are catalogs of metadata records designed for finding “assets” of some kind. They differ, however, in the kind of assets that are listed in the catalog, although there is overlap.
>>>
>>> Since the kinds of assets are different, there's a lot of difference in the metadata that is required to adequately describe them, and in the additional secondary concepts related to the asset, and in the relationships that need to be recorded between assets, and in the means of accessing the assets.
>>>
>>> I'm not suggesting any particular course of action as a result of this observation. We should closely study the overlap between DCAT's Catalog, Dataset, and CatalogRecord on the one hand, and ADMS' Repository and Asset on the other hand, and also study their relationships to things already out there.
>>>
>>> The more I think about it, the more I get worried that we're in the process of not just reinventing the wheel, but reinventing it twice, in parallel.
>>>
>>> Best,
>>> Richard
>>>
>>>
>>> On 20 Oct 2012, at 19:37, Gofran wrote:
>>>
>>>> Hi all,
>>>>
>>>> The sets of resources that ADMS and DCAT describe intersect, IMHO. I like to use the term "reusable" for the resources that ADMS describes. The set of reusable resources includes (but is not limited to) code lists, taxonomies, datasets, etc., as long as they can be reused in a different context, and the purpose of ADMS is to facilitate this by describing them using the right terms.
>>>>
>>>> A dataset in data.gov.uk for instance is basically a useful dataset for certain applications but it is not a reference dataset and DCAT should be used to describe it.
>>>> While a dataset about the languages in the EU is certainly more like a reusable asset that has a broader usage base, and ADMS and/or DCAT can be used to describe it (though it is not a "semantic" asset per se).
>>>>
>>>> The problem, as I see it, is how to extend ADMS (or DCAT, or both) to describe this difference.
>>>>
>>>> On 20 Oct 2012, at 14:17, Richard Cyganiak wrote:
>>>>
>>>>> Hi Phil,
>>>>>
>>>>> On 18 Oct 2012, at 18:33, Phil Archer wrote:
>>>>>> "ADMS, the Asset Description Metadata Schema, is a vocabulary for describing so-called Semantic Assets, that is, things like standards, code lists and taxonomies. Although it has a lot in common with the Data Catalog vocabulary [DCAT], notably the extensive use of Dublin Core [DC11], someone searching for a Semantic Asset is likely to have different needs, priorities and expectations than someone searching for a data set and these differences are reflected in ADMS. In particular, users seeking a Semantic Asset are likely to be searching for 'a document' — something they can open and read using familiar desktop software, as opposed to something that needs to be processed. Of course this is a very broad generalization. If a code list is published as a SKOS Concept scheme then it is both a Semantic Asset and a dataset and it can be argued that all Semantic Assets are datasets. Therefore the difference in /user expectation/ is at the heart of what distinguishes ADMS from DCAT."
>>>>>
>>>>> I have a number of issues with this.
>>>>>
>>>>> 1. You describe the purpose of ADMS as: “It's for describing things like standards, code lists and taxonomies.” This is too fuzzy. You can't have weasel words such as “like” in the sentence that states the purpose of a technology. Law texts are a bit like standards, right? So ADMS is for describing them too?
>>>>>
>>>>> 2. The text implies that the kinds of things described in DCAT cannot be “open and read using familiar desktop software”. This is not the case. In most data catalogs, the most common formats are CSV and Excel.
>>>>>
>>>>> 3. It is not particularly likely that code lists and taxonomies -- things that ADMS is intended to describe -- can be opened and read in familiar desktop software.
>>>>>
>>>>> 4. If the main difference is indeed one of user expectation and not one of vocabulary semantics, then a catalog-level flag in DCAT might be sufficient to eliminate the need for ADMS. Surely it is not that easy. So I don't feel that the text above gets to the heart of the difference between DCAT and ADMS.
>>>>>
>>>>> 5. It is somewhat open whether the “distributions” in DCAT are all machine-readable. There is an open DCAT issue about renaming “distribution” to “resource” and allowing pretty much arbitrary related online artefacts, including documentation and the like.
>>>>>
>>>>> Best,
>>>>> Richard
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>> --
>>
>>
>> Phil Archer
>> W3C eGovernment
>> http://www.w3.org/egov/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
>>
>
>
>

-- 


Phil Archer
W3C eGovernment
http://www.w3.org/egov/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Thursday, 25 October 2012 14:08:38 UTC