Re: DQV ISSUE-187 - cardinality of the link between Metric and Dimension

From: Debattista, Jeremy <Jeremy.Debattista@iais.fraunhofer.de>
Date: Fri, 11 Sep 2015 19:16:27 +0000
To: Antoine Isaac <aisaac@few.vu.nl>
CC: Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <D714F039-DFC0-46E1-B8F3-CAA595CB81C2@iais.fraunhofer.de>
Hi Antoine,

Whilst I agree that a metric can be in one or more dimensions (similarly for dimensions), I'm inclined to favour restricting each metric to a single dimension. In a paper we are submitting about daQ (which I can share later), I wrote the following regarding the restrictions:

Whilst we acknowledge that a metric might be perceived in one or more categories by different parties, the main goal of the daQ model is to create a generic schema that allows the semantification of quality measures in the abstract three level hierarchy. When defining these quality measures, a common understanding is required (the unified view presented by the survey in [1] is a good example for this), such that these measures can be used by any framework without any prejudice. If such restrictions were not in place, doubt and ambiguity might be created between the consumers and the quality assessors (who could be a data enthusiast or the publisher himself). One fundamental aspect of assessing a dataset for its quality is to start eliminating the doubts consumers may have about the data. Therefore, if the same metric has multiple definitions and categorisations, incertitude is created for the consumer. On the other hand, the quality assessor ends up in an ambiguous situation to try to identify the right quality measure structure.

[1] http://www.semantic-web-journal.net/content/quality-assessment-linked-data-survey

As I said, this is just my opinion on this subject. Therefore, if you think that this restriction is too much for data publishers, it should be offered as a guideline instead. The issue is that, in my opinion, data quality will impact data consumers more than the publishers themselves. For publishers, data quality will act as an indicator of the quality of their dataset, whilst for consumers, data quality will help them choose one dataset over another.
Imagine a scenario in which a metric X could be categorised in dimension A or B. If one dataset uses X under dimension A whilst another dataset uses X under dimension B, it becomes hard to compare the two datasets based on the same metric X.
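
To make the scenario concrete, here is a minimal sketch in Python. The dataset names, scores, and helper functions are made up purely for illustration; none of them come from daQ or DQV, and this is not how either vocabulary is implemented.

```python
from collections import defaultdict

# Hypothetical measurements: (dataset, metric, dimension, value).
# Metric "X" is filed under dimension "A" for one dataset and "B" for the other.
measurements = [
    ("dataset1", "X", "A", 0.8),
    ("dataset2", "X", "B", 0.6),
]


def dimensions_per_metric(rows):
    """Map each metric to the set of dimensions it has been filed under."""
    dims = defaultdict(set)
    for _dataset, metric, dimension, _value in rows:
        dims[metric].add(dimension)
    return dict(dims)


def ambiguous_metrics(rows):
    """Return the metrics that break an 'exactly one dimension' restriction."""
    return {m: d for m, d in dimensions_per_metric(rows).items() if len(d) > 1}


def scores_by_dimension(rows, dimension):
    """Collect the per-dataset values recorded under a single dimension."""
    return {ds: value for ds, _metric, dim, value in rows if dim == dimension}


print(ambiguous_metrics(measurements))
print(scores_by_dimension(measurements, "A"))
```

A consumer who browses dimension A only sees dataset1, even though both datasets were assessed with the same metric X, which is the comparison problem described above.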

In the paper we formalise these restrictions in DL, but they are not formally defined in the schema.

I hope my opinion helps a bit :)


On 11 Sep 2015, at 18:01, Antoine Isaac <aisaac@few.vu.nl> wrote:

Hi Jeremy,

Today we had a call during which we discussed ISSUE-187.
I've tried to capture the gist of the discussion in the notes attached.
It would be great to have your opinion about it!


Received on Friday, 11 September 2015 19:17:21 UTC