Re: DQV ISSUE-187 - cardinality of the link between Metric and Dimension from Antoine Isaac on 2015-09-13 (public-dwbp-wg@w3.org from September 2015)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Sun, 13 Sep 2015 19:12:38 +0200
To: "Debattista, Jeremy" <Jeremy.Debattista@iais.fraunhofer.de>
CC: Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <55F5AE86.2050105@few.vu.nl>

Hi Jeremy,

Thanks a lot for the quick reaction. This is really helpful!

I understand the will to rationalize the expression of quality metadata. So whatever happens, I'm really keen to take your suggestion on board, about giving strong guidelines to publishers of quality metadata. We could recommend that publishers of quality metadata (and the designers of quality metadata frameworks instantiating our DQV) SHOULD aim at creating the least ambiguous descriptions, and thus seek to assign only one dimension to a metric.

However, I am extremely reluctant to go more formal about it, because:

1. I am not certain we can rule out the possibility of frameworks that are made of non-disjoint dimensions but are nonetheless very useful. For example, if a SPARQL endpoint has terrible uptime (say, only 50%), I'd be tempted to say that this is both a performance and an availability issue.
Nandana said he had thought of examples; I hope he can provide some.

2. Our DQV doesn't define specific metrics and dimensions, so it's likely that several frameworks will emerge, with dimensions that could be 'overlapping'. Say one framework defines 'performance', 'availability' and 'licensing' as three dimensions, while another defines only a vaguer dimension of 'access'. One framework could still have metrics that are useful from the perspective of the other. If we allow a metric to be assigned several dimensions so that it can serve different frameworks, it would greatly enhance interoperability, especially the re-use of quality assessments across frameworks.
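
To make the interoperability point concrete, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration: the metric names (`sparql_uptime`, `licence_declared`), the two frameworks, and the helper `metrics_for`. None of it is defined by DQV. The only assumption is that a metric may be linked to more than one dimension.

```python
from collections import defaultdict

# Hypothetical metric-to-dimension links; a metric may list several dimensions.
# These names are illustrative only and are not defined by DQV.
METRIC_DIMENSIONS = {
    "sparql_uptime": {"performance", "availability", "access"},
    "licence_declared": {"licensing", "access"},
}

# Two hypothetical frameworks whose dimensions overlap in meaning.
FRAMEWORK_A = {"performance", "availability", "licensing"}
FRAMEWORK_B = {"access"}


def metrics_for(framework_dimensions):
    """Group metrics under each dimension of a framework they are linked to."""
    grouped = defaultdict(set)
    for metric, dimensions in METRIC_DIMENSIONS.items():
        # Keep only the dimensions that this framework actually defines.
        for dimension in dimensions & framework_dimensions:
            grouped[dimension].add(metric)
    return dict(grouped)


by_a = metrics_for(FRAMEWORK_A)
by_b = metrics_for(FRAMEWORK_B)
```

In this sketch the same uptime metric shows up under 'performance' and 'availability' for the first framework and under 'access' for the second, so one assessment could be re-used by both. With a strict one-dimension-per-metric rule, one of the two frameworks would lose it.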

I'm curious to hear what everyone's opinion is about it!

Cheers,

Antoine

On 9/11/15 9:16 PM, Debattista, Jeremy wrote:
> Hi Antoine,
>
> Whilst I agree that a metric can be in one or more dimensions (similarly for dimensions), I’m inclined to favour otherwise. In a paper we are submitting about daQ (which I can share later), I wrote the following wrt the restrictions:
>
> ———
>
> Whilst we acknowledge that a metric might be perceived in one or more categories by different parties, the main goal of the daQ model is to create a generic schema that allows the semantification of quality measures in the abstract three level hierarchy. When defining these quality measures, a common understanding is required (the unified view presented by the survey in [1] is a good example for this), such that these measures can be used by any framework without any prejudice. If such restrictions were not in place, /doubt/ and /ambiguity/ might be created between the consumers and the quality assessors (who could be a data enthusiast or the publisher himself). One fundamental aspect of assessing a dataset for its quality is to start eliminating the doubts consumers may have about the data. Therefore, if the same metric has multiple definitions and categorisations, incertitude is created for the consumer. On the other hand, the quality assessor ends up in an ambiguous situation to try to identify the right quality measure structure.
>
> [1] http://www.semantic-web-journal.net/content/quality-assessment-linked-data-survey
>
> ———
>
> As I said, this is just my opinion on this subject. Therefore, if you think that this restriction is too much for data publishers, it should be offered as a guideline. The issue is that in my opinion data quality will impact more data consumers than the publishers themselves. For publishers data quality will act as an indicator of the quality of their dataset, whilst for consumers data quality will help them choose one dataset over the other.
> Imagine the scenario that a Metric X could be categorised in dimension A or B. If one dataset is using X that is in dimension A whilst another dataset is using X in dimension B, it will make it hard to compare the two datasets based on the same metric X.
>
> In this paper we formalise these restrictions in DL, but they are not formally defined in the schema.
>
> I hope my opinion helps a bit :)
>
> Cheers,
> Jer
>
> On 11 Sep 2015, at 18:01, Antoine Isaac <aisaac@few.vu.nl> wrote:
>
>> Hi Jeremy,
>>
>> Today we had a call during which we discussed ISSUE-187:
>> https://www.w3.org/2013/dwbp/track/issues/187
>> I've tried to capture the gist of the discussion in the notes attached
>> It would be great to have your opinion about it!
>>
>> Cheers,
>> Antoine
>
Received on Sunday, 13 September 2015 17:13:09 UTC