Re: Overlap between DCAT and ADMS

I'm not sure why but this redirection got held for Moderator Action.   
I'll just quote it here:

On 10/25/2012 09:47 AM, Gofran wrote:
> "redirecting to the mailing list"
> On 23 Oct 2012, at 18:16, Phil Archer wrote:
>> I have a lot of sympathy with this.
>> When I first was tasked with creating the RDF schema for ADMS, I used 
>> a load of DCAT properties and only introduced a few new ones. It was 
>> with the introduction of a third related vocab (called ADMS for Software, 
>> ADMS.SW, which is not on the GLD work list) that I came up with 
>> RADion. That may or may not have been a sensible idea but it seems to 
>> be in line with the sentiment here in that what we have are two 
>> similar vocabs. They're slightly different because the people that 
>> have created them have slightly different perspectives.
>> DCAT is not designed, for example, to describe 3 separate PDFs 
>> wrapped up in a zip file - ADMS is (among other things).
>> But lest we get too hung up, perhaps we can take a little step back. 
>> I've just made another couple of tweaks to the ADMS spec, RDF schema 
>> and namespace HTML doc in readiness for Thursday (all linked from the 
>> wiki).
>> ADMS defines 5 classes and a bunch of properties that for the most 
>> part have no direct parallel in DCAT. But... like DCAT, most 'ADMS 
>> data' is Dublin Core.
>> It's a difference of emphasis, a difference in approach and therefore 
>> a difference in what gets included and not included in the vocab.
>> I see a number of options:
>> 1. Spend time working to align DCAT and ADMS more closely (in which 
>> case RADion is either a help or a hindrance - if the latter we don't 
>> have to be bound by it). That *might* then lead to the kind of 
>> integration we've done with RegORG and ORG. That's probably the ideal 
>> but do we have the time and willingness? Also, very significant 
>> effort has already been expended in creating ADMS and 
>> ADMS.SW-compliant data.
>> 2. Recognise that the overlap is significant but not a huge problem 
>> in itself since so many of the properties used are from dcterms. The 
>> ones that aren't are, of course, the more specialised ones. Cross 
>> reference ADMS and DCAT and say "take your pick - and here's why you 
>> might choose one over the other." Gofran's e-mail about re-usability 
>> is helpful here I think, as would be a short text highlighting the 
>> different approaches taken. I believe most potential users will feel 
>> more comfortable with one or the other and the choice is generally 
>> going to be made by repositories that harvest the data, not data 
>> publishers looking for an outlet for their data. As well as W3C we 
>> have national governments and fellow standards bodies publishing ADMS 
>> data (Denmark, OASIS, Open Metadata Registry, GS1...).
> I agree with this option: the overlap is significant, but the 
> use case difference is obvious as well.
> RADion still seems to me to be a good idea to embrace this for now.
>> 3. If the WG feels either route above is not right for a Rec Track 
>> document then we can publish ADMS as a WG Note and namespace doc and 
>> more or less leave it at that. As you can imagine, I'd rather not 
>> take this route but the WG is sovereign.
>> Phil.
>> On 21/10/2012 17:29, Richard Cyganiak wrote:
>>> Hi Gofran, hi Phil,
>>> I think the fundamental problem here is that we have two specs that 
>>> have a large overlap in scope, but neither is a subset of the other, 
>>> and *probably* neither can be easily extended to cover the other 
>>> without losing its focus.
>>> What is the overlap between both?
>>> Phil developed RADion in an attempt to “factor out” the overlap of 
>>> both specs: repositories, assets, distributions. But I think that 
>>> RADion fails to get to the essence of the overlap. It may be correct 
>>> on the repositories and assets, but fails with the distributions, or 
>>> at least has a conception of distribution that isn't sufficiently 
>>> generic to properly cover DCAT.
>>> I think the overlap of DCAT and ADMS is that both are catalogs of 
>>> metadata records designed for finding “assets” of some kind. They 
>>> differ, however, in the kind of assets that are listed in the 
>>> catalog, although there is overlap.
>>> Since the kinds of assets are different, there's a lot of difference 
>>> in the metadata that is required to adequately describe them, and in 
>>> the additional secondary concepts related to the asset, and in the 
>>> relationships that need to be recorded between assets, and in the 
>>> means of accessing the assets.
>>> I'm not suggesting any particular course of action as a result of 
>>> this observation. We should closely study the overlap between DCAT's 
>>> Catalog, Dataset, and CatalogRecord on the one hand, and ADMS' 
>>> Repository and Asset on the other hand, and also study their 
>>> relationships to things already out there.
>>> The more I think about it, the more I get worried that we're in the 
>>> process of not just reinventing the wheel, but reinventing it twice, 
>>> in parallel.
>>> Best,
>>> Richard
>>> On 20 Oct 2012, at 19:37, Gofran wrote:
>>>> Hi all,
>>>> The sets of resources that ADMS and DCAT describe intersect, 
>>>> IMHO. I like to use the term "reusable" for the resources 
>>>> that ADMS describes. The set of reusable resources includes (but 
>>>> is not limited to) codelists, taxonomies, datasets, etc., as long as 
>>>> they can be reused in a different context, and the purpose of ADMS is 
>>>> to facilitate this by describing them using the right terms.
>>>> A dataset in for instance is basically a useful dataset 
>>>> for certain applications but it is not a reference dataset and DCAT 
>>>> should be used to describe it.
>>>> A dataset about the languages in the EU, by contrast, is certainly 
>>>> more like a reusable asset with a broader usage base, and ADMS and/or 
>>>> DCAT can be used to describe it (though it is not a "semantic" 
>>>> asset per se).
>>>> The problem, as I see it, is how to extend ADMS (or DCAT or both) to 
>>>> describe this difference.
>>>> On 20 Oct 2012, at 14:17, Richard Cyganiak wrote:
>>>>> Hi Phil,
>>>>> On 18 Oct 2012, at 18:33, Phil Archer wrote:
>>>>>> "ADMS, the Asset Description Metadata Schema, is a vocabulary for 
>>>>>> describing so-called Semantic Assets, that is, things like 
>>>>>> standards, code lists and taxonomies. Although it has a lot in 
>>>>>> common with the Data Catalog vocabulary [DCAT], notably the 
>>>>>> extensive use of Dublin Core [DC11], someone searching for a 
>>>>>> Semantic Asset is likely to have different needs, priorities and 
>>>>>> expectations than someone searching for a data set and these 
>>>>>> differences are reflected in ADMS. In particular, users seeking a 
>>>>>> Semantic Asset are likely to be searching for 'a document' — 
>>>>>> something they can open and read using familiar desktop software, 
>>>>>> as opposed to something that needs to be processed. Of course 
>>>>>> this is a very broad generalization. If a code list is published 
>>>>>> as a SKOS Concept scheme then it is both a Semantic Asset and a 
>>>>>> dataset and it can be argued that all Semantic Assets are 
>>>>>> datasets. Therefore the difference in /user expectation/ is at 
>>>>>> the heart of what distinguishes ADMS from DCAT."
>>>>> I have a number of issues with this.
>>>>> 1. You describe the purpose of ADMS as: “It's for describing 
>>>>> things like standards, code lists and taxonomies.” This is too 
>>>>> fuzzy. You can't have weasel words such as “like” in the sentence 
>>>>> that states the purpose of a technology. Law texts are a bit like 
>>>>> standards, right? So ADMS is for describing them too?
>>>>> 2. The text implies that the kinds of things described in DCAT 
>>>>> cannot be “open and read using familiar desktop software”. This is 
>>>>> not the case. In most data catalogs, the most common formats are 
>>>>> CSV and Excel.
>>>>> 3. It is not particularly likely that code lists and taxonomies -- 
>>>>> things that ADMS is intended to describe -- can be opened and read 
>>>>> in familiar desktop software.
>>>>> 4. If the main difference is indeed one of user expectation and 
>>>>> not one of vocabulary semantics, then a catalog-level flag in DCAT 
>>>>> might be sufficient to eliminate the need for ADMS. Surely it is 
>>>>> not that easy. So I don't feel that the text above gets to the 
>>>>> heart of the difference between DCAT and ADMS.
>>>>> 5. It is somewhat open whether the “distributions” in DCAT are all 
>>>>> machine-readable. There is an open DCAT issue about renaming 
>>>>> “distribution” to “resource” and allowing pretty much arbitrary 
>>>>> related online artefacts, including documentation and the like.
>>>>> Best,
>>>>> Richard
>> -- 
>> Phil Archer
>> W3C eGovernment
>> +44 (0)7887 767755
>> @philarcher1

Received on Thursday, 25 October 2012 13:51:31 UTC