- From: Eric Stephan <ericphb@gmail.com>
- Date: Thu, 30 Oct 2014 07:10:19 -0700
- To: Steven Adler <adler1@us.ibm.com>
- Cc: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>, Antoine Isaac <aisaac@few.vu.nl>, Bart van Leeuwen <bart_van_leeuwen@netage.nl>, "Debattista, Jeremy" <Jeremy.Debattista@iais-extern.fraunhofer.de>, Makx Dekkers <mail@makxdekkers.com>, Public DWBP WG <public-dwbp-wg@w3.org>, riccardo.imati@gmail.com
- Message-ID: <CAMFz4jj1XNF+T9pB76_BqLP-HOwyC7x8hiA-66O8U8Wo_zyy3g@mail.gmail.com>
+1

>> I recommend focusing on the details of data quality vocabularies and let
>> vendors and community groups determine how they are tabulated into metrics.

On Thu, Oct 30, 2014 at 6:50 AM, Steven Adler <adler1@us.ibm.com> wrote:

> Metrics change human behavior, encouraging a superficial focus on attaining
> desired factors instead of a deeper understanding of underlying issues. We
> all saw how this played out in banks prior to the Credit Crisis as CEOs
> became obsessed with managing VAR (Value at Risk), even if most did not
> understand how VAR was calculated.
>
> I recommend focusing on the details of data quality vocabularies and let
> vendors and community groups determine how they are tabulated into metrics.
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"
>
> From: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>
> To: Makx Dekkers <mail@makxdekkers.com>
> Cc: "Debattista, Jeremy" <Jeremy.Debattista@iais-extern.fraunhofer.de>,
>     Bart van Leeuwen <bart_van_leeuwen@netage.nl>,
>     Public DWBP WG <public-dwbp-wg@w3.org>, Antoine Isaac <aisaac@few.vu.nl>
> Date: 10/30/2014 06:37 AM
> Subject: Re: Data quality and requirements - discussion for F2F?
> ------------------------------
>
> Hi All,
> I basically agree with Jeremy. I think we should define how quality
> metadata can be represented at an abstract level in a metadata model (e.g.
> an ontology). In my opinion both human-focused information and metrics-based
> quality should be represented in the model, provided that there are use
> cases grounding these needs.
>
> In order to make the quality of datasets comparable and objective, I think it
> would be great to have a set of recommended metrics and quality dimensions,
> even if I am not sure such a set can be easily identified.
>
> Anyway, if a set of metrics is going to be defined and "recommended", I
> think that set should be extensible, as I tried to stress by proposing the
> LuSTRE use case and the Q-MetricExtensibility requirement in my e-mail last
> week (see "Quality requirements and a new use case for UCR" [1]).
>
> Regards,
> Riccardo
>
> [1] http://lists.w3.org/Archives/Public/public-dwbp-comments/2014Oct/0002.html
>
> On 30 October 2014 12:58, Makx Dekkers <mail@makxdekkers.com> wrote:
>
> > As I am following this discussion, it occurred to me that maybe we
> > could also look at who will use any statements about quality, and what for.
> >
> > On one hand, there is quality-related information that is for human
> > consumption, e.g. things like the information provided at
> > http://www.legislation.gov.uk/help#aboutChangesToLeg and other FAQ
> > items on that page. Such information can be used by humans to take
> > decisions about whether they want to use the data.
> >
> > On the other hand, precise metrics may be used by programs to
> > pre-select collections of data, but in that case we need to understand
> > a little bit more about what kind of programs or applications would
> > consume the metrics and for what purpose.
> >
> > It seems to me that maybe the human-focused information is a little
> > easier to define (e.g. using legislation.gov.uk as a starting point).
> > We could start to define a small set of properties for those (either as
> > text or using some controlled vocabulary) and look at the metrics later
> > on the basis of existing applications that use quality metrics in practice.
> > I agree that metrics are not that easy to define, and probably also
> > complex to use.
> >
> > Makx
> >
> > From: Debattista, Jeremy [mailto:Jeremy.Debattista@iais-extern.fraunhofer.de]
> > Sent: Thursday, 30 October 2014 11:11
> > To: Bart van Leeuwen
> > Cc: Public DWBP WG; Antoine Isaac
> > Subject: Re: Data quality and requirements - discussion for F2F?
> >
> > Hi Bart, Antoine
> >
> > I agree with both of you that defining a vocabulary based on metrics
> > is hard. From my work on data quality, I realised that different domains,
> > use cases, etc. might require different metrics. Of course, there are
> > metrics that would be suitable for most of the use cases. What I found
> > useful was to define how quality metadata should be represented at an
> > abstract level [1]. Then, based on this abstract ontology, we defined a
> > number of quality metrics [2], some of which might be similar to those
> > extracted from the DWBP use cases. On the whole, my opinion is that we have
> > to provide a pragmatic solution that would be suitable for everyone within
> > the community, i.e. in the future other interested parties should be able
> > to define quality metrics that are easily interoperable with other
> > defined quality metrics.
> >
> > I would gladly join the F2F discussion remotely, if it won’t be after
> > 10pm (CET) :).
> >
> > Cheers,
> > Jer
> >
> > [1] https://raw.githubusercontent.com/EIS-Bonn/Luzzu/master/luzzu-semantics/src/main/resources/vocabularies/daq/daq.trig
> > [2] https://raw.githubusercontent.com/diachron/quality/luzzu-integration/src/main/resources/vocabularies/dqm/dqm.trig
> >
> > On 29 Oct 2014, at 17:17, Bart van Leeuwen <bart_van_leeuwen@netage.nl> wrote:
> >
> > > Hi Antoine,
> > >
> > > Last night I had a conversation with Bernadette on this topic which
> > > ended up in a nice discussion.
> > > I'm on the same page with you: I think the Quality vocabulary
> > > is rather hard to define if we focus on metrics.
> > >
> > > I hope we have a good amount of time during the F2F to discuss it.
> > >
> > > Met Vriendelijke Groet / With Kind Regards
> > > Bart van Leeuwen
> > >
> > > ##############################################################
> > > # twitter: @semanticfire
> > > # netage.nl
> > > # http://netage.nl
> > > # Enschedepad 76
> > > # 1324 GJ Almere
> > > # The Netherlands
> > > # tel. +31(0)36-5347479
> > > ##############################################################
> > >
> > > From: Antoine Isaac <aisaac@few.vu.nl>
> > > To: Public DWBP WG <public-dwbp-wg@w3.org>
> > > Date: 29-10-2014 17:07
> > > Subject: Data quality and requirements - discussion for F2F?
> > > ------------------------------
> > >
> > > Dear all,
> > >
> > > As a preparation to the F2F discussions on vocabularies, I have
> > > checked the latest version of the UCR document [1]. The progress that has
> > > been made on describing use cases and identifying requirements is
> > > impressive.
> > > In particular, the categorization of requirements is great for
> > > identifying the requirements most important for our vocabulary work,
> > > including the ones on quality and granularity [2].
> > >
> > > Yet, I am still not sure of the scoping of the quality vocabulary.
> > > I've looked at all requirements, and one could say that many could impact
> > > the scope of a vocabulary to be used to document quality. Some thoughts
> > > are on a new wiki page [3]. I admittedly played the devil's advocate
> > > there, i.e. I was very liberal when judging whether a requirement could
> > > impact quality and granularity. But in fact, when looking at what various
> > > UCs have to say about quality, I am wondering whether I am the only one
> > > confused! I have compiled a list of quotes from the UC descriptions [3],
> > > which shows that, considering all contributors, a very wide definition of
> > > quality is still in order.
> > >
> > > My wish for the F2F discussion would be that the group spend some
> > > time going through the requirements, and discuss whether they should be in
> > > scope of the vocabulary.
> > > Or, to put it in other words, decide whether the vocabulary should
> > > include elements for documenting whether a dataset meets the considered
> > > requirements, i.e., there is metadata for data re-users to understand the
> > > performance of the dataset against the requirements the group has
> > > identified.
> > >
> > > A reminder: all kinds of pointers for the quality work are gathered
> > > at [4], including a first vocabulary design by Phil.
> > >
> > > Best regards,
> > >
> > > Antoine
> > >
> > > [1] http://www.w3.org/TR/2014/WD-dwbp-ucr-20141014/
> > > [2] http://www.w3.org/TR/dwbp-ucr/#requirements-for-quality-and-granularity-description-vocabulary
> > > [3] https://www.w3.org/2013/dwbp/wiki/UCRs_and_Quality
> > > [4] https://www.w3.org/2013/dwbp/wiki/Data_quality_notes
>
> --
> ----------------------------------------------------------------------------
> Riccardo Albertoni
> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
> Magenes"
> Consiglio Nazionale delle Ricerche
> via de Marini 6 - 16149 GENOVA - ITALIA
> tel. +39-010-6475624 - fax +39-010-6475660
> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
> Skype: callto://riccardoalbertoni/
> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
> www: http://www.ge.imati.cnr.it/Albertoni
> http://purl.oclc.org/NET/riccardoAlbertoni
> FOAF: http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
> ----------------------------------------------------------------------------
Attachments
- image/gif attachment: ecblank.gif
- image/gif attachment: graycol.gif
Received on Thursday, 30 October 2014 14:10:48 UTC