Re: Data Quality and Granularity vocabulary - preliminary report

Hmm, and now I see that there is no call next week.
We will send a report anyway - Friday is not a holiday for the four of us, I believe.
*But* it would be great if the group could pay attention to and discuss our emails. This will be crucial for the F2F discussion!

Note that I can't make it to the F2F, and I know that at least Christophe can't either, so the discussion there will need some other motivated people...

Antoine

On 3/29/15 4:53 PM, Antoine Isaac wrote:
> Dear all,
>
> Here's a report on the work for the Quality and Granularity (Q&G) vocabulary.
>
> We have started extracting requirements from the best practices:
> https://www.w3.org/2013/dwbp/wiki/Requirements_From_FPWD_BP
>
> There are three categories: "infrastructure" requirements, requirements on the process of designing and publishing the Q&G voc, and finally requirements that tell us which kinds of information needs the Q&G voc should answer.
>
> The last category is the most important for us now, as it will dictate which classes and properties we should have in the Q&G voc.
> However, we feel we have little material to work on. Riccardo has done great work identifying 'competency questions' (which match the idea of 'concrete requirements' in our schedule at [1]). But he had to be very 'creative' - most of the questions come from him, not from the best practices in the WD.
>
>
> A second stream of work is the extraction of anything relevant for Q&G from our document on Use Cases and Requirements:
> https://www.w3.org/2013/dwbp/wiki/Quality_Requirements_From_UCR
>
> Two main results here:
> - trying to assess which requirements should be in scope for the Q&G work
> https://www.w3.org/2013/dwbp/wiki/Requirements_In_Scope_For_Quality
>
> - extracting the relevant Q&G stuff from the descriptions of Use Cases
> https://www.w3.org/2013/dwbp/wiki/Quality_Aspects_In_Use_Cases
>
> This work first raises a scoping issue. The UCR WD lists a handful of requirements as relevant for Q&G. But the owners of use cases tend to relate Q&G to a much broader set of aspects. The biggest question here is whether the Q&G voc should serve to express how well a dataset implements the best practices in our BP WD. If yes, then the Q&G voc will have to cover a wide set of competency questions.
>
> The second issue is whether the Q&G voc should enable expressing specific quality metrics. It is clear that the Q&G voc will provide a framework to express metrics for data quality along specific quality dimensions (e.g. 'completeness'), and to exchange the results of these metrics for datasets. But this doesn't mean that the Q&G vocabulary should itself define the specific metrics (e.g. 'precision/recall of statements').
>
>
> In fact, the analysis of the Use Cases confirms the observation we made for the BPs: we don't have much material to define competency questions for the Q&G voc.
>
> So we have the idea of going back to the use case owners with a questionnaire, to try to extract more specific quality aspects than the ones we have in the Use Case WD.
> The first (partial) draft is at
> https://www.w3.org/2013/dwbp/wiki/QualityQuestionnaire
>
>
> We are still following the original schedule [1] and we will send another report before this week's telecon.
> However, I expect the report will be very similar to this mail, unless, in the meantime, the group gives some substantial feedback on how we should move forward!
>
> Best,
>
> Antoine, on behalf of Riccardo, Deirdre and Christophe.
>
> [1] https://www.w3.org/2013/dwbp/wiki/Data_quality_schedule
>
>

Received on Sunday, 29 March 2015 14:58:46 UTC