
Re: Note on caveats in statistical data

From: Dan Brickley <danbri@google.com>
Date: Fri, 14 Jul 2017 14:51:07 +0100
Message-ID: <CAK-qy=7UQpgB29f3xi1BJH69Ja7hCCdKWgr7vx8yHK_p69hQZQ@mail.gmail.com>
To: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
Cc: Will Moy <william.moy@fullfact.org>, Antoine Isaac <aisaac@few.vu.nl>, Makx Dekkers <mail@makxdekkers.com>, Dataset Exchange Working Group <public-dxwg-wg@w3.org>
On 14 Jul 2017 11:53 am, "Riccardo Albertoni" <albertoni@ge.imati.cnr.it>
wrote:

Dear Makx, Will, Dan and All

I agree with Makx: the DQV [1] could be handy for adding caveats in such a
context.
I think the granularity of the described resources is no real barrier to
DQV adoption, provided the DQV makes sense in Will Moy's scenario.
The textual definition of dqv:QualityAnnotation [2] refers to datasets and
distributions because they were the primary targets in the W3C DWBP group.
However, we have deliberately chosen to leave DQV open to reuse with
anything else (e.g., we haven't imposed any formal constraints saying
that annotated objects have to be instances of dcat:Dataset/Distribution).
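
As a rough illustration of that openness, here is a minimal sketch (not a
recommended pattern from the DQV spec) that writes out Turtle for a
dqv:QualityAnnotation whose target is a single qb:Observation. The dqv:, oa:,
qb: and skos: namespaces are the published ones; everything under
example.org, the helper name caveat_annotation, and the choice of putting a
caveat type in oa:hasBody are assumptions made for the example.

```python
# Namespace prefixes. dqv:, oa:, qb: and skos: are the published namespaces;
# ex: is a made-up example namespace.
PREFIXES = {
    "dqv": "http://www.w3.org/ns/dqv#",
    "oa": "http://www.w3.org/ns/oa#",
    "qb": "http://purl.org/linked-data/cube#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",
}


def caveat_annotation(annotation, observation, caveat_type, note):
    """Return Turtle for one quality annotation targeting one observation.

    All four arguments are hypothetical: three prefixed names and a
    free-text note. Nothing here is validated against DQV or Data Cube.
    """
    # Escape backslashes and double quotes so the note is a valid literal.
    escaped = note.replace("\\", "\\\\").replace('"', '\\"')
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines += [
        "",
        f"{observation} a qb:Observation .",
        "",
        f"{annotation} a dqv:QualityAnnotation ;",
        "    oa:motivatedBy dqv:qualityAssessment ;",
        f"    oa:hasTarget {observation} ;",
        # Assumption: the caveat type is a concept from some code list.
        f"    oa:hasBody {caveat_type} ;",
        f'    skos:note "{escaped}" .',
    ]
    return "\n".join(lines)


turtle = caveat_annotation(
    annotation="ex:caveat1",
    observation="ex:obs42",
    caveat_type="ex:AnomalousDataPoint",
    note="Hypothetical caveat text for this observation.",
)
print(turtle)
```

Whether the caveat type belongs in the annotation body or somewhere else is
a modelling question this sketch does not settle.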


Interesting! Here are some more details from Full Fact, which might be
enough to try out that idea.

https://fullfact.org/blog/2015/aug/typology-caveats/

Details are in a (pretty SKOS-like) spreadsheet.

https://fullfact.org/media/redactor/Typology.xlsx

...including partial SDMX mappings. I haven't figured out yet what that
might mean for a W3C Data Cube representation.

Dan



Cheers,
Riccardo

[1] https://www.w3.org/TR/vocab-dqv/
[2] https://www.w3.org/TR/vocab-dqv/#dqv:QualityAnnotation
[3] https://www.w3.org/TR/annotation-vocab/#annotation

On 14 July 2017 at 09:13, Makx Dekkers <mail@makxdekkers.com> wrote:

> It seems to me that the mention of “an anomalous data point” in the
> transcript implies that they are interested to annotate down to the level
> of individual observations, for example, qb:Observation.
>
>
>
> So, they may need to look at a vocabulary like Data Cube to see how such
> annotations could be included. Maybe dqv:QualityAnnotation
> https://www.w3.org/TR/vocab-dqv/#dqv:QualityAnnotation could help, but
> that is defined on the level of dataset, not for individual observations,
> if I read it right.
>
>
>
> The statistical people themselves are doing stuff around XKOS with
> Explanatory notes, see
> http://www.ddialliance.org/Specification/XKOS/1.0/OWL/xkos.html#note-ext.
>
>
>
> Makx.
>
> *From:* Dan Brickley [mailto:danbri@google.com]
> *Sent:* 14 July 2017 01:05
> *To:* public-dxwg-wg@w3.org
> *Subject:* Note on caveats in statistical data
>
>
>
> Hi. I thought https://www.youtube.com/watch?v=cLMbrzI5p6s might be of
> interest to the WG. It's a 30 second video from a chat today at Full Fact
> (UK fact checking charity), with Andy Dudfield from the UK's Office for
> National Statistics. Andy, Will Moy, Mevan Babakar and I discussed the
> importance of making sure that caveats of various kinds travel along with
> the different data format representations of statistical data. Full Fact
> have done some work in this direction and would be interested in
> conversations on how it might plug into standards (e.g. CSVW, DCAT,
> Schema.org etc).
>
>
>
> I've also just transcribed the video, so here's the text version:
>
>
>
> (Will Moy) "[re statistical data]... full of numbers, ... what I want to
> go along with that is a list of things I need to know about those numbers
> in order to be able to re-use them. And I want those to be organized so
> instead of just getting a long list of footnotes, those footnotes are
> classified into the type of caveat it is. So we did a piece of work which
> is what kind of caveats exist. So - is it an anomalous data point or is
> it that we changed the methodology or whatever, ... classify it that way,
> in a machine readable way using a standardized code list so a computer has
> a reasonable chance of being able to reason about what those numbers can
> do."
>
>
>
> I'll share more details of this work as I find out more but it seemed
> worth making a quick note first.
>
>
>
> cheers,
>
>
>
> Dan
>



-- 
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/
http://pers.ge.imati.cnr.it/albertoni/PersonalPage/albertoni.html
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
Received on Friday, 14 July 2017 13:51:37 UTC
