Re: Data quality and requirements - discussion for F2F?

+1

>> I recommend focusing on the details of data quality vocabularies and let
>> vendors and community groups determine how they are tabulated into metrics.



On Thu, Oct 30, 2014 at 6:50 AM, Steven Adler <adler1@us.ibm.com> wrote:

> Metrics change human behavior: people focus superficially on attaining the
> desired numbers instead of understanding the underlying issues more deeply. We
> all saw how this played out in banks prior to the Credit Crisis, as CEOs
> became obsessed with managing VAR (Value at Risk) even though most did not
> understand how VAR was calculated.
>
> I recommend focusing on the details of data quality vocabularies and let
> vendors and community groups determine how they are tabulated into metrics.
>
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"
>
>
>
>
>    From: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>
>    To: Makx Dekkers <mail@makxdekkers.com>
>    Cc: "Debattista, Jeremy" <Jeremy.Debattista@iais-extern.fraunhofer.de>,
>        Bart van Leeuwen <bart_van_leeuwen@netage.nl>,
>        Public DWBP WG <public-dwbp-wg@w3.org>, Antoine Isaac <aisaac@few.vu.nl>
>    Date: 10/30/2014 06:37 AM
>    Subject: Re: Data quality and requirements - discussion for F2F?
> ------------------------------
>
>
>
> Hi All,
> I basically agree with Jeremy: I think we should define how quality
> metadata can be represented at an abstract level in a metadata model (e.g.
> an ontology). In my opinion, both human-focused information and metrics-based
> quality should be represented in the model, provided that there are use
> cases grounding these needs.
>
> In order to make the quality of datasets comparable and objective, I think it
> would be great to have a set of recommended metrics and quality dimensions,
> even if I am not sure such a set can be easily identified.
>
> Anyway, if a set of metrics is going to be defined and "recommended", I
> think that set should be extensible, as I tried to stress when proposing the
> LuSTRE use case and the Q-MetricExtensibility requirement in my e-mail last
> week (see "Quality requirements and a new use case for UCR" [1]).
>
>
> Regards,
> Riccardo
>
> [1]
> http://lists.w3.org/Archives/Public/public-dwbp-comments/2014Oct/0002.html
>
>
> On 30 October 2014 12:58, Makx Dekkers <mail@makxdekkers.com> wrote:
>
>    As I am following this discussion, it occurred to me that maybe we
>    could also look at who will use any statements about quality, and what for.
>
>    On one hand, there is quality-related information that is for human
>    consumption, e.g. things like the information provided at
>    http://www.legislation.gov.uk/help#aboutChangesToLeg and other FAQ
>    items on that page. Such information can be used by humans to make
>    decisions about whether they want to use the data.
>
>
>
>    On the other hand, precise metrics may be used by programs to
>    pre-select collections of data, but in that case we need to understand
>    a little better what kind of programs or applications would consume
>    the metrics and for what purpose.
>
>
>
>    It seems to me that the human-focused information is maybe a little
>    easier to define (e.g. using legislation.gov.uk as a starting point).
>    We could start by defining a small set of properties for it (either as text
>    or using some controlled vocabulary) and look at the metrics later, on the
>    basis of existing applications that use quality metrics in practice. I agree
>    that metrics are not that easy to define, and probably also complex to use.
>
>
>
>    Makx
>
>
>
>    From: Debattista, Jeremy [mailto:Jeremy.Debattista@iais-extern.fraunhofer.de]
>    Sent: Thursday, 30 October 2014 11:11
>    To: Bart van Leeuwen
>    Cc: Public DWBP WG; Antoine Isaac
>    Subject: Re: Data quality and requirements - discussion for F2F?
>
>
>
>    Hi Bart, Antoine
>
>
>
>    I agree with both of you that defining a vocabulary based on metrics
>    is hard. From my work on data quality, I realised that different domains,
>    use cases, etc. might require different metrics. Of course, some metrics
>    would be suitable for most use cases. What I found
>    useful was to define how quality metadata should be represented at an
>    abstract level [1]. Then, based on this abstract ontology, we defined a
>    number of quality metrics [2], some of which might be similar to those
>    extracted from the DWBP use cases. On the whole, my opinion is that we have
>    to provide a pragmatic solution that would be suitable for everyone within
>    the community, i.e. in the future other interested parties should be able
>    to define quality metrics that interoperate easily with other
>    defined quality metrics.
>
>
>
>    I would gladly join the F2F discussion remotely, if it won’t be after
>    10pm (CET) :).
>
>
>
>    Cheers,
>
>    Jer
>
>
>
>
>
>    [1]
>    https://raw.githubusercontent.com/EIS-Bonn/Luzzu/master/luzzu-semantics/src/main/resources/vocabularies/daq/daq.trig
>
>    [2]
>    https://raw.githubusercontent.com/diachron/quality/luzzu-integration/src/main/resources/vocabularies/dqm/dqm.trig
>
>
>
>    On 29 Oct 2014, at 17:17, Bart van Leeuwen <bart_van_leeuwen@netage.nl> wrote:
>
>
>     Hi Antoine,
>
>       Last night I had a conversation with Bernadette on this topic, which
>       turned into a nice discussion.
>       I agree with you that the quality vocabulary
>       is rather hard to define if we focus on metrics.
>
>       I hope we have a good amount of time during the F2F to discuss
>       it.
>
>       Met Vriendelijke Groet / With Kind Regards
>       Bart van Leeuwen
>
>       ##############################################################
>       # twitter: @semanticfire
>       # netage.nl
>       # http://netage.nl
>       # Enschedepad 76
>       # 1324 GJ Almere
>       # The Netherlands
>       # tel. +31(0)36-5347479
>       ##############################################################
>
>
>
>       From:    Antoine Isaac <aisaac@few.vu.nl>
>       To:      Public DWBP WG <public-dwbp-wg@w3.org>
>       Date:    29-10-2014 17:07
>       Subject: Data quality and requirements - discussion for F2F?
>
>       ------------------------------
>
>
>
>
>       Dear all,
>
>       In preparation for the F2F discussions on vocabularies, I have
>       checked the latest version of the UCR document [1]. The progress that has
>       been made on describing use cases and identifying requirements is
>       impressive.
>       In particular, it is great that the requirements have been categorized to
>       identify those most important for our vocabulary work, including the
>       category on quality and granularity [2].
>
>       Yet, I am still not sure of the scoping of the quality vocabulary.
>       I've looked at all the requirements, and one could say that many of them
>       could impact the scope of a vocabulary to be used to document quality.
>       Some thoughts are on a new wiki page [3]. I admittedly played the devil's
>       advocate there, i.e. I was very liberal when judging whether a requirement
>       could impact quality and granularity. But in fact, when looking at what the
>       various UCs have to say about quality, I wonder whether I am the only one
>       confused! I have compiled a list of quotes from the UC descriptions [3],
>       which shows that, considering all contributors, a very wide definition of
>       quality is still in play.
>
>       My wish for the F2F discussion would be that the group spend some
>       time going through the requirements and discuss whether they should be in
>       scope of the vocabulary.
>       Or, to put it in other words, decide whether the vocabulary should
>       include elements for documenting whether a dataset meets the considered
>       requirements, i.e. so that there is metadata for data re-users to understand
>       the performance of the dataset against the requirements the group has
>       identified.
>
>       As a reminder, all kinds of pointers for the quality work are gathered
>       at [4], including a first vocabulary design by Phil.
>
>       Best regards,
>
>       Antoine
>
>       [1] http://www.w3.org/TR/2014/WD-dwbp-ucr-20141014/
>       [2]
>       http://www.w3.org/TR/dwbp-ucr/#requirements-for-quality-and-granularity-description-vocabulary
>       [3] https://www.w3.org/2013/dwbp/wiki/UCRs_and_Quality
>       [4] https://www.w3.org/2013/dwbp/wiki/Data_quality_notes
>
>
>
>
>
>
>
>
>
>
> --
>
> ----------------------------------------------------------------------------
> Riccardo Albertoni
> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
> Magenes"
> Consiglio Nazionale delle Ricerche
> via de Marini 6 - 16149 GENOVA - ITALIA
> tel. +39-010-6475624 - fax +39-010-6475660
> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
> Skype: callto://riccardoalbertoni/
> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
> www: http://www.ge.imati.cnr.it/Albertoni
> http://purl.oclc.org/NET/riccardoAlbertoni
> FOAF: http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
>
> ----------------------------------------------------------------------------
>
>

Received on Thursday, 30 October 2014 14:10:48 UTC