Re: DQV ISSUE-187 - cardinality of the link between Metric and Dimension from Debattista, Jeremy on 2015-09-11 (public-dwbp-wg@w3.org from September 2015)

From: Debattista, Jeremy <Jeremy.Debattista@iais.fraunhofer.de>
Date: Fri, 11 Sep 2015 19:16:27 +0000
To: Antoine Isaac <aisaac@few.vu.nl>
CC: Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <D714F039-DFC0-46E1-B8F3-CAA595CB81C2@iais.fraunhofer.de>

Hi Antoine,

Whilst I agree that a metric can be in one or more dimensions (similarly for dimensions), I’m inclined to favour keeping the restriction. In a paper we are submitting about daQ (which I can share later), I wrote the following with respect to the restrictions:

———

Whilst we acknowledge that a metric might be perceived in one or more categories by different parties, the main goal of the daQ model is to create a generic schema that allows the semantification of quality measures in the abstract three-level hierarchy. When defining these quality measures, a common understanding is required (the unified view presented by the survey in [1] is a good example of this), such that these measures can be used by any framework without prejudice. If such restrictions were not in place, doubt and ambiguity might arise between the consumers and the quality assessors (who could be data enthusiasts or the publishers themselves). One fundamental aspect of assessing a dataset for its quality is to start eliminating the doubts consumers may have about the data. Therefore, if the same metric has multiple definitions and categorisations, uncertainty is created for the consumer. The quality assessor, in turn, ends up in an ambiguous situation when trying to identify the right quality measure structure.

[1] http://www.semantic-web-journal.net/content/quality-assessment-linked-data-survey

———

As I said, this is just my opinion on the subject. If you think that this restriction is too much for data publishers, then it should be offered as a guideline instead. My concern is that data quality will affect data consumers more than the publishers themselves. For publishers, data quality acts as an indicator of the quality of their dataset, whilst for consumers it helps them choose one dataset over another.
Imagine a scenario in which a metric X could be categorised in dimension A or B. If one dataset uses X in dimension A whilst another dataset uses X in dimension B, it becomes hard to compare the two datasets on the same metric X.
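
To make the comparison problem concrete, here is a minimal sketch that flags any metric placed in more than one dimension across datasets. The triples, the dataset names and the function name `conflicting_metrics` are hypothetical, chosen only to mirror the X/A/B scenario; they are not part of daQ or DQV.

```python
from collections import defaultdict


def conflicting_metrics(assignments):
    """Return the metrics that appear under more than one dimension.

    `assignments` is an iterable of (dataset, metric, dimension) triples.
    This triple layout is an assumption made for illustration only.
    """
    dims_by_metric = defaultdict(set)
    for _dataset, metric, dimension in assignments:
        dims_by_metric[metric].add(dimension)
    # Keep only metrics that were categorised in two or more dimensions.
    return {m: sorted(d) for m, d in dims_by_metric.items() if len(d) > 1}


# Hypothetical quality metadata for two datasets.
assignments = [
    ("dataset1", "X", "A"),
    ("dataset2", "X", "B"),  # same metric, different dimension
    ("dataset1", "Y", "A"),
    ("dataset2", "Y", "A"),
]

print(conflicting_metrics(assignments))  # {'X': ['A', 'B']}
```

Metric Y can be compared directly across both datasets, whereas X cannot without first deciding which categorisation applies.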

In this paper we formalise these restrictions in description logic (DL), but they are not formally defined in the schema itself.
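
The paper's actual axioms are not reproduced here. Purely as an illustration, a restriction of this kind could be written in DL as qualified number restrictions along the following lines, where the concept and role names (Metric, Dimension, Category, inDimension, inCategory) are assumed for the example:

```latex
% Illustrative only: concept and role names are assumptions,
% not quoted from the paper or from the daQ schema.
\[
\mathit{Metric} \sqsubseteq\ (= 1\ \mathit{inDimension}.\mathit{Dimension})
\qquad
\mathit{Dimension} \sqsubseteq\ (= 1\ \mathit{inCategory}.\mathit{Category})
\]
```

Read this way, every metric belongs to exactly one dimension and every dimension to exactly one category; relaxing "= 1" to "at least 1" would give the looser reading discussed in the issue.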

I hope my opinion helps a bit :)

Cheers,
Jer

On 11 Sep 2015, at 18:01, Antoine Isaac <aisaac@few.vu.nl> wrote:

Hi Jeremy,

Today we had a call during which we discussed ISSUE-187:
https://www.w3.org/2013/dwbp/track/issues/187
I've tried to capture the gist of the discussion in the notes attached.
It would be great to have your opinion about it!

Cheers,
Antoine

Received on Friday, 11 September 2015 19:17:21 UTC