
Re: Additions in DQV

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Wed, 24 Aug 2016 17:34:19 +0200
Message-ID: <57BDBE7B.10609@few.vu.nl>
To: Amrapali Jyotindra Zaveri <amrapali@stanford.edu>, public-dwbp-comments <public-dwbp-comments@w3.org>
CC: Anisa Rula <anisa.rula@gmail.com>

Dear Amrapali, Anisa,

Thanks a lot for your comments! It's really great to have more feedback on DQV.

Here are our reactions. Your feedback is welcome, especially on the suggestions to add some clarifying text into the document.
By the way, following a recent word from the W3C Director, our schedule has accelerated: the Working Group needs to vote on the document on Friday.
We believe we can still fit some additional sentences into the doc to address your concerns, as long as we don't change the DQV vocabulary itself but rather indicate ways to use DQV with other vocabularies, or flag issues for future work (which I think is the most appropriate answer to your comments, anyway!).


> - specifying in which part of the dataset the quality issue is present
> ex:observ dqv:position _:node1.
> _:node1  a rdf:Statement ;
> rdf:subject <http://.../address/xx> ;
> rdf:predicate <http://schema.org/addressStreet> ;
> rdf:object "Chams-Elyseee 48" .
>


This is a very valid concern.
At this point, however, I'm afraid we don't have enough experience to judge which option would work best for it.
There is a basic option that could be used: put the faulty statement in an RDF graph of its own, and say that this graph has the same measurement as the wider dataset it comes from. Assuming ex:originalDataset is the dataset that has been evaluated, we could have:
[
ex:originalDataset dqv:hasQualityMeasurement ex:observ .

:faultyStatementRDFGraph { <http://.../address/xx> <http://schema.org/addressStreet> "Chams-Elyseee 48" . }

:faultyStatementRDFGraph dqv:hasQualityMeasurement ex:observ .
]
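
To make the idea a bit more concrete, here is a minimal Python sketch of how a consumer might get from a measurement back to the faulty statement under this named-graph option. It is only an illustration, not something DQV defines: the tuple-based quad representation and the helper statements_for_measurement are hypothetical, and the names are simply those used in the example above.

```python
# Hypothetical sketch: quads are (subject, predicate, object, graph) tuples,
# with graph set to None for the default graph. Not part of DQV itself.
DQV_HAS_MEASUREMENT = "dqv:hasQualityMeasurement"

quads = [
    # The measurement is attached to the evaluated dataset...
    ("ex:originalDataset", DQV_HAS_MEASUREMENT, "ex:observ", None),
    # ...the faulty statement sits in its own named graph...
    ("<http://.../address/xx>", "<http://schema.org/addressStreet>",
     '"Chams-Elyseee 48"', ":faultyStatementRDFGraph"),
    # ...and that graph is linked to the same measurement.
    (":faultyStatementRDFGraph", DQV_HAS_MEASUREMENT, "ex:observ", None),
]


def statements_for_measurement(quads, measurement):
    """Return the triples held in named graphs that carry `measurement`."""
    # Names of all graphs that actually contain statements.
    graph_names = {g for (_, _, _, g) in quads if g is not None}
    # Keep only the measurement subjects that are such graph names.
    flagged = {
        s for (s, p, o, g) in quads
        if p == DQV_HAS_MEASUREMENT and o == measurement and s in graph_names
    }
    return [(s, p, o) for (s, p, o, g) in quads if g in flagged]


faulty = statements_for_measurement(quads, "ex:observ")
```

Here the dataset and the small graph share ex:observ, and only the statement inside the small graph is returned, so the finer-grained location of the issue can be recovered without adding any new property to DQV.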

We'd be happy to add a note about such a suggestion in the doc, if you think it could be a good point to make, at least to explain that there are several interesting options on the table and that this is why we are postponing the issue.


>
> - adding further information such as the type of quality issue by defining a new property for example dqv:severity (with values error,  warning, improvement, etc.)
> ex:observ dqv:severity <http://…/warning>



This is an interesting suggestion. I think DQV can't include this for now, though: it would assume too much about application needs.
In fact some more specific frameworks like SHACL are thinking of modeling severity levels for data checking:
https://www.w3.org/TR/shacl/#results-severity
The problem is that SHACL is not yet ready, so we can't make firm recommendations. We'd have to defer to future work, and note this in our 'wishlist' at https://www.w3.org/2013/dwbp/wiki/Main_Page#Wish_List


>
> - adding additional information such as describing the problem
> ex:observ dcterms:description "The triples in the dataset are outdated"
>


We're looking forward to hearing from you!

Kind regards,

Antoine and Riccardo
Received on Wednesday, 24 August 2016 15:37:54 UTC
