
Re: DQV, ISO 19115/19157 and GeoDCAT-AP - metadata on metadata quality

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Mon, 7 Mar 2016 11:45:17 +0100
Message-ID: <56DD5BBD.9070705@few.vu.nl>
To: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
CC: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>

Hi Andrea,

Trying to wrap up this one:


>> 4. The example of conformance is triggering a discussion on how to
>> indicate that it's fine to use DQV to indicate the quality of metadata,
>> not only 'original' datasets. This is less relevant to your original
>> comment, but you may want to chime in.
>
> This is actually one of the issues my colleagues and I have been dealing with in our work. So, +1 from me.
>
> I can contribute an existing example from the INSPIRE Geoportal.
>
> The INSPIRE Geoportal is harvesting metadata records (100K+) from catalogue services operated by EU Member States. As a post-processing step, the geoportal infrastructure carries out a validation test against a set of criteria, and it generates a validation report.
>
> This procedure is not meant to decide whether a metadata record can be published or not. All records are published, irrespective of whether they have passed the validation. Rather, these reports are meant to provide (meta)data providers with precise information on the issues identified.
>
> Notably, this approach proved to be effective in dramatically increasing the quality of metadata (currently, valid metadata are, on average, more than 90% of those harvested). But this is a process that needs to be carried out not only when a new metadata provider joins, but also whenever metadata records are re-harvested. In other words, this needs to be an integral part of the metadata management workflow.
>
> This experience also provides an example of atomic/aggregated quality checks, in particular along a temporal dimension. E.g., this applies to the links included in metadata records, which can point to distributions and to services for data access and/or visualisation. In such a case, the validation results require aggregating link / service check results over a given time frame (e.g., 24h, one week).
>
> So, IMO, both the scenarios described earlier also apply to metadata.
>

These are useful points. I have added a scoping note in the intro for DQV, hoping it captures that it is possible to use DQV to express statements about the quality of metadata itself:
http://w3c.github.io/dwbp/vocab-dqg.html#intro
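
To make the atomic/aggregated distinction above a bit more concrete, here is a minimal Python sketch of the kind of aggregation Andrea describes: individual link checks for a metadata record are rolled up into one availability figure per time frame. Everything in it is an assumption made for illustration (the `LinkCheck` type, the `availability` function, the record identifier, the example.org URL and the sample data); it is not how the INSPIRE Geoportal actually implements its validation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LinkCheck:
    """One atomic check (hypothetical type): was a link in a metadata record reachable?"""
    record_id: str
    url: str
    checked_at: datetime
    ok: bool


def availability(checks, record_id, window_end, window=timedelta(hours=24)):
    """Aggregate the atomic checks of one record over a time frame.

    Returns the fraction of successful checks between window_end - window
    and window_end (inclusive), or None if no check falls in that frame.
    """
    window_start = window_end - window
    relevant = [
        c for c in checks
        if c.record_id == record_id and window_start <= c.checked_at <= window_end
    ]
    if not relevant:
        return None
    return sum(c.ok for c in relevant) / len(relevant)


# Made-up sample data: four checks in the last 24h, one older failure.
now = datetime(2016, 3, 7, 12, 0)
url = "http://example.org/wms"
checks = [
    LinkCheck("rec-1", url, now - timedelta(hours=1), True),
    LinkCheck("rec-1", url, now - timedelta(hours=7), False),
    LinkCheck("rec-1", url, now - timedelta(hours=13), True),
    LinkCheck("rec-1", url, now - timedelta(hours=19), True),
    LinkCheck("rec-1", url, now - timedelta(days=3), False),
]

daily = availability(checks, "rec-1", now)                       # 24h frame
weekly = availability(checks, "rec-1", now, timedelta(weeks=1))  # one-week frame
print(daily, weekly)
```

Each aggregated value (one per record and time frame) is the sort of figure that could then be published as a quality statement about the metadata record itself, while the individual checks remain available as the atomic results.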

I hope it will be enough. As INSPIRE Geoportal is not in our original use cases, it may be a bit tricky to add a reference to it, without adding a complete example, which could be too much in the current DQV document.

Note that we could also add a note about this in the Best Practices document, for example when we talk about quality of data:
http://w3c.github.io/dwbp/bp.html#quality

My issue is that if we do this, then perhaps all best practices that are stated to apply to the data should also be stated for the metadata.

Well, actually I have tried to do this for the two BPs I curate:
http://w3c.github.io/dwbp/bp.html#dataVocabularies ("Use standardized terms" and "Reuse vocabularies")
But fitting this aspect in the entire document properly is way beyond what I can offer myself here!

Cheers,

Antoine
Received on Monday, 7 March 2016 10:45:48 UTC
