Re: Overlap between DCAT and ADMS

I have a lot of sympathy with this.

When I was first tasked with creating the RDF schema for ADMS, I used a 
load of DCAT properties and only introduced a few new ones. It was with 
the introduction of a third related vocab (ADMS for Software, ADMS.SW, 
which is not on the GLD work list) that I came up with RADion. 
That may or may not have been a sensible idea but it seems to be in line 
with the sentiment here in that what we have are two similar vocabs. 
They're slightly different because the people that have created them 
have slightly different perspectives.

DCAT is not designed, for example, to describe 3 separate PDFs wrapped 
up in a zip file - ADMS is (among other things).

But lest we get too hung up, perhaps we can take a little step back. 
I've just made another couple of tweaks to the ADMS spec, RDF schema and 
namespace HTML doc in readiness for Thursday (all linked from the wiki).

ADMS defines 5 classes and a bunch of properties that for the most part 
have no direct parallel in DCAT. But... like DCAT, most 'ADMS data' is 
Dublin Core.
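
As a rough illustration of that last point, here is a minimal Python 
sketch that tallies the properties used in a description by namespace. 
The property list is made up for the example and is not taken from any 
real ADMS record, and the namespace table and the namespace_share helper 
are likewise assumptions of the sketch rather than anything from the spec.

```python
from collections import Counter

# Assumed namespace table for the sketch (prefix labels are arbitrary).
NAMESPACES = {
    "http://purl.org/dc/terms/": "dcterms",
    "http://www.w3.org/ns/dcat#": "dcat",
    "http://www.w3.org/ns/adms#": "adms",
}

# Hypothetical property usage for one asset description (illustrative only).
EXAMPLE_PROPERTIES = [
    "http://purl.org/dc/terms/title",
    "http://purl.org/dc/terms/description",
    "http://purl.org/dc/terms/issued",
    "http://purl.org/dc/terms/modified",
    "http://purl.org/dc/terms/publisher",
    "http://www.w3.org/ns/adms#status",
    "http://www.w3.org/ns/adms#representationTechnique",
]


def namespace_share(property_uris):
    """Count how many property URIs fall under each known namespace."""
    counts = Counter()
    for uri in property_uris:
        # Pick the first namespace the URI starts with, else "other".
        prefix = next(
            (label for ns, label in NAMESPACES.items() if uri.startswith(ns)),
            "other",
        )
        counts[prefix] += 1
    return counts


if __name__ == "__main__":
    for prefix, count in namespace_share(EXAMPLE_PROPERTIES).most_common():
        print(prefix, count)
```

Run against real ADMS and DCAT descriptions, a tally like this would 
show how much of each is plain dcterms and how much is specialised.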

It's a difference of emphasis, a difference in approach and therefore a 
difference in what gets included and not included in the vocab.

I see a number of options:

1. Spend time working to align DCAT and ADMS more closely (in which case 
RADion is either a help or a hindrance - if the latter we don't have to 
be bound by it). That *might* then lead to the kind of integration we've 
done with RegORG and ORG. That's probably the ideal but do we have the 
time and the willingness? Also, very significant effort has already been 
expended in creating ADMS and ADMS.SW-compliant data.

2. Recognise that the overlap is significant but not a huge problem in 
itself since so many of the properties used are from dcterms. The ones 
that aren't are, of course, the more specialised ones. Cross reference 
ADMS and DCAT and say "take your pick - and here's why you might choose 
one over the other." Gofran's e-mail about re-usability is helpful here 
I think, as would be a short text highlighting the different approaches 
taken. I believe most potential users will feel more comfortable with 
one or the other and the choice is generally going to be made by 
repositories that harvest the data, not data publishers looking for an 
outlet for their data. As well as W3C we have national governments and 
fellow standards bodies publishing ADMS data (Denmark, OASIS, Open 
Metadata Registry, GS1...).

3. If the WG feels that neither route above is right for a Rec Track 
document then we can publish ADMS as a WG Note and namespace doc and 
more or less leave it at that. As you can imagine, I'd rather not take 
this route but the WG is sovereign.

Phil.




On 21/10/2012 17:29, Richard Cyganiak wrote:
> Hi Gofran, hi Phil,
>
> I think the fundamental problem here is that we have two specs that have a large overlap in scope, but neither is a subset of the other, and *probably* neither can be easily extended to cover the other without losing its focus.
>
> What is the overlap between both?
>
> Phil developed RADion in an attempt to “factor out” the overlap of both specs: repositories, assets, distributions. But I think that RADion fails to get to the essence of the overlap. It may be correct on the repositories and assets, but fails with the distributions, or at least has a conception of distribution that isn't sufficiently generic to properly cover DCAT.
>
> I think the overlap of DCAT and ADMS is that both are catalogs of metadata records designed for finding “assets” of some kind. They differ, however, in the kind of assets that are listed in the catalog, although there is overlap.
>
> Since the kinds of assets are different, there's a lot of difference in the metadata that is required to adequately describe them, and in the additional secondary concepts related to the asset, and in the relationships that need to be recorded between assets, and in the means of accessing the assets.
>
> I'm not suggesting any particular course of action as a result of this observation. We should closely study the overlap between DCAT's Catalog, Dataset, and CatalogRecord on the one hand, and ADMS' Repository and Asset on the other hand, and also study their relationships to things already out there.
>
> The more I think about it, the more I get worried that we're in the process of not just reinventing the wheel, but reinventing it twice, in parallel.
>
> Best,
> Richard
>
>
> On 20 Oct 2012, at 19:37, Gofran wrote:
>
>> Hi all,
>>
>> The sets of resources that ADMS and DCAT describe intersect, IMHO. I like to use the term "reusable" to point to the resources that ADMS describes. The set of reusable resources includes (but is not limited to) codelists, taxonomies, datasets, etc., as long as they can be reused in a different context, and the purpose of ADMS is to facilitate this by describing them using the right terms.
>>
>> A dataset in data.gov.uk for instance is basically a useful dataset for certain applications but it is not a reference dataset and DCAT should be used to describe it.
>> While a dataset about the languages in the EU is certainly more like a reusable asset that has broader usage base and ADMS and/or DCAT can be used to describe it (though it is not a "semantic" asset per se)
>>
>> The problem, as I see it, is how to extend ADMS (or DCAT or both) to describe this difference.
>>
>> On 20 Oct 2012, at 14:17, Richard Cyganiak wrote:
>>
>>> Hi Phil,
>>>
>>> On 18 Oct 2012, at 18:33, Phil Archer wrote:
>>>> "ADMS, the Asset Description Metadata Schema, is a vocabulary for describing so-called Semantic Assets, that is, things like standards, code lists and taxonomies. Although it has a lot in common with the Data Catalog vocabulary [DCAT], notably the extensive use of Dublin Core [DC11], someone searching for a Semantic Asset is likely to have different needs, priorities and expectations than someone searching for a data set and these differences are reflected in ADMS. In particular, users seeking a Semantic Asset are likely to be searching for 'a document' — something they can open and read using familiar desktop software, as opposed to something that needs to be processed. Of course this is a very broad generalization. If a code list is published as a SKOS Concept scheme then it is both a Semantic Asset and a dataset and it can be argued that all Semantic Assets are datasets. Therefore the difference in /user expectation/ is at the heart of what distinguishes ADMS from DCAT."
>>>
>>> I have a number of issues with this.
>>>
>>> 1. You describe the purpose of ADMS as: “It's for describing things like standards, code lists and taxonomies.” This is too fuzzy. You can't have weasel words such as “like” in the sentence that states the purpose of a technology. Law texts are a bit like standards, right? So ADMS is for describing them too?
>>>
>>> 2. The text implies that the kinds of things described in DCAT cannot be “open and read using familiar desktop software”. This is not the case. In most data catalogs, the most common formats are CSV and Excel.
>>>
>>> 3. It is not particularly likely that code lists and taxonomies -- things that ADMS is intended to describe -- can be opened and read in familiar desktop software.
>>>
>>> 4. If the main difference is indeed one of user expectation and not one of vocabulary semantics, then a catalog-level flag in DCAT might be sufficient to eliminate the need for ADMS. Surely it is not that easy. So I don't feel that the text above gets to the heart of the difference between DCAT and ADMS.
>>>
>>> 5. It is somewhat open whether the “distributions” in DCAT are all machine-readable. There is an open DCAT issue about renaming “distribution” to “resource” and allowing pretty much arbitrary related online artefacts, including documentation and the like.
>>>
>>> Best,
>>> Richard
>>>
>>>
>>
>
>
>

-- 


Phil Archer
W3C eGovernment
http://www.w3.org/egov/

http://philarcher.org
+44 (0)7887 767755
@philarcher1

Received on Tuesday, 23 October 2012 17:16:29 UTC