Note on caveats in statistical data

Hi. I thought https://www.youtube.com/watch?v=cLMbrzI5p6s might be of
interest to the WG. It's a 30-second video from a chat today at Full Fact
(the UK fact-checking charity), with Andy Dudfield from the UK's Office for
National Statistics. Andy, Will Moy, Mevan Babakar and I discussed the
importance of making sure that caveats of various kinds travel along with
the different data format representations of statistical data. Full Fact
have done some work in this direction and would be interested in
conversations on how it might plug into standards (e.g. CSVW, DCAT,
Schema.org).

I've also just transcribed the video, so here's the text version:

(Will Moy) "[re statistical data]... full of numbers, ... what I want to go
along with that is a list of things I need to know about those numbers in
order to be able to re-use them. And I want those to be organized so
instead of just getting a long list of footnotes, those footnotes are
classified into the type of caveat it is. So we did a piece of work which
is what kind of caveats exist. So - is it an anomalous data point or is it
that we changed the methodology or whatever, ... classify it that way, in a
machine readable way using a standardized code list so a computer has a
reasonable chance of being able to reason about what those numbers can do."
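
To make that a bit more concrete, here's a rough sketch of my own (not
Full Fact's actual classification or any existing standard) of what typed
caveats attached to data points might look like. The code values, class
names, the comparable() rule and the sample figures are all hypothetical;
only the two caveat kinds are taken from the examples Will mentions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CaveatType(Enum):
    # Hypothetical code list; the two kinds are the examples named in the video.
    ANOMALOUS_DATA_POINT = "anomalous-data-point"
    METHODOLOGY_CHANGE = "methodology-change"


@dataclass
class Caveat:
    type: CaveatType
    note: str  # the human-readable footnote text


@dataclass
class Observation:
    period: str
    value: float
    caveats: List[Caveat] = field(default_factory=list)


def comparable(series: List[Observation]) -> bool:
    """Naive illustrative rule: a run of figures is comparable only if no
    methodology change is flagged on any observation after the first."""
    return not any(
        c.type is CaveatType.METHODOLOGY_CHANGE
        for obs in series[1:]
        for c in obs.caveats
    )


# Made-up figures, purely for illustration.
obs_2015 = Observation("2015", 100.0)
obs_2016 = Observation(
    "2016", 140.0,
    [Caveat(CaveatType.METHODOLOGY_CHANGE, "Made-up note: survey redesigned")],
)
obs_2017 = Observation("2017", 142.0)

print(comparable([obs_2015, obs_2016]))  # a change is flagged on 2016
print(comparable([obs_2016, obs_2017]))  # both figures use the new method
```

The point is only that once footnotes carry a code from a shared list,
software can act on them rather than just display them.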

I'll share more details of this work as I find out more, but it seemed
worth making a quick note first.

cheers,

Dan

Received on Thursday, 13 July 2017 23:05:08 UTC