Re: data quality vocabulary - scope

Hi Annette, everyone,

Trying to reflect the broader range of some of Annette's comments...


> (Credibility [could be a statement], as the decision of who is authoritative is subjective, and I wouldn't want to rate a good dataset poorly because it was put out by a small business.)
>


Some may disagree with this, which leads to the following point.


> I think we also need to think about how this vocabulary is expected to be used. If a data publisher provides quality information upon publication (which is what I've been thinking of as the main use of the vocabulary), then some items won't really make sense to include. Information like why the data was removed from the web will not be available when it is first published.


DQV is meant to be used by the data publisher, but also by anyone who has something to say about the quality after publication.
If no one has a strong objection to this scope, and the DQV draft is not clear about it, then we should make it clearer.

Note that this is the main reason why there's potential overlap with DUV (i.e., for feedback), and why we're keen to have provenance of quality metadata (so that someone could judge the credibility of a credibility statement).



> I worry, too, that we are defining some stuff that really isn't about data quality so much as the best practices that we have in the BP doc. I'm thinking here of the mentions of machine readability and metadata.
> We may need to do some scoping to be sure we are targeting quality information. I would suggest that we avoid repeating what is in the BP doc.


(I am not sure that I'm fully understanding your worry here, but I'll try my interpretation...)

I have repeatedly asked for such a scoping discussion, ever since we gathered all these preparatory wiki pages for the last F2F.
In particular, I have asked whether DQV should include (or even focus on) quality criteria that would be defined by the group as part of the BPs.

I think there was never a clear answer. Well, it seemed to me that at one point people liked the idea of using DQV to measure how well datasets comply with our BPs. But maybe this is a personal interpretation of discussions I was not physically present for.

Personally I'm a bit lukewarm on the idea, even though I must admit that defining the "star system" (which has been mentioned a couple of times before) using DQV looks like an interesting exercise in eating one's own dog food.

What do you think?

Antoine

Received on Tuesday, 2 June 2015 22:25:24 UTC