Re: DQV - metrics related to the completeness dimension from Steven Adler on 2015-09-30 (public-dwbp-wg@w3.org from September 2015)

From: Steven Adler <adler1@us.ibm.com>
Date: Wed, 30 Sep 2015 20:53:52 +0000
To: "Annette Greiner" <amgreiner@lbl.gov>
Cc: "Nandana Mihindukulasooriya" <nmihindu@fi.upm.es>, "Debattista, Jeremy" <Jeremy.Debattista@iais.fraunhofer.de>, "Data on the Web Best Practices Working Group" <public-dwbp-wg@w3.org>, "Makx Dekkers" <mail@makxdekkers.com>
Message-Id: <201509302053.t8UKrxAP022692@d01av02.pok.ibm.com>

I'm willing to explore the idea if we can also provide metadata that allows the public to challenge these assertions of completion and asks publishers to detail collection methods and link records to other data sources for corroboration.  Data Quality is never a state.  It is an ongoing dialectic process of discovery, validation, and rediscovery.  Data decays at different rates depending on what it describes, and high quality one day may become low quality another just because of missed refreshes.  The public should have the right and ability to find data quality problems, alert the publishers and the other data consumers, and even to amend the data if the publisher wishes.

Best Regards,

Steve


   Annette Greiner --- Re: DQV - metrics related to the completeness dimension --- 
    From: "Annette Greiner" <amgreiner@lbl.gov>
    To: "Steven Adler" <adler1@us.ibm.com>
    Cc: "Nandana Mihindukulasooriya" <nmihindu@fi.upm.es>, "Debattista, Jeremy" <Jeremy.Debattista@iais.fraunhofer.de>, "Data on the Web Best Practices Working Group" <public-dwbp-wg@w3.org>, "Makx Dekkers" <mail@makxdekkers.com>
    Date: Wed, Sep 30, 2015 4:42 PM
    Subject: Re: DQV - metrics related to the completeness dimension
  I agree that we can’t create a standard for one interest group. My point is that the DQV can serve many purposes. You may have a use case for not using the vocabulary to assert objective measures. I still don’t see any reason that those with other use cases shouldn’t have the option to use the DQV in ways that are meaningful for them.
     (And I imagine your audience thought you were asking whether their government could be objective about its assessment of its own data, which is pretty different from whether it might state that a dataset doesn’t contain numbers for a particular province, for example.)
   
       -Annette
          
     
           --
      Annette Greiner
      NERSC Data and Analytics Services
      Lawrence Berkeley National Laboratory
      510-495-2935
      
     
    
               On Sep 30, 2015, at 1:27 PM, Steven Adler <adler1@us.ibm.com> wrote:
          
           Let me rephrase.  We can't create a standard for the best behaviour of one interest group.  We have to create a standard for many interests and behaviours.  

I'm at the World Bank today participating in a forum on Investing in Sierra Leone Diaspora, and I asked my audience if they would ever trust their government to assess its own data quality with objective assertions, and they laughed.  That government can't be trusted to keep a decision for more than a week.  

Best Regards,

Steve
      
      
      
                     Annette Greiner --- Re: DQV - metrics related to the completeness dimension ---
                      From: "Annette Greiner" <amgreiner@lbl.gov>
                      To: "Steven Adler" <adler1@us.ibm.com>
                      Cc: "Nandana Mihindukulasooriya" <nmihindu@fi.upm.es>, "Debattista, Jeremy" <Jeremy.Debattista@iais.fraunhofer.de>, "Data on the Web Best Practices Working Group" <public-dwbp-wg@w3.org>, "Makx Dekkers" <mail@makxdekkers.com>
                      Date: Wed, Sep 30, 2015 3:14 PM
                      Subject: Re: DQV - metrics related to the completeness dimension
        I mean to challenge your assumption that we are not creating the DQV for use by scientists to make statements about the completeness of data sources. I think that is a mistake.
        
                 --
         Annette Greiner
         NERSC Data and Analytics Services
         Lawrence Berkeley National Laboratory
         510-495-2935
                
                           On Sep 30, 2015, at 11:52 AM, Steven Adler <adler1@us.ibm.com> wrote:
                  
                   Because people don't know what they don't know.  Scientists, politicians, data experts - anyone who publishes data has limited resources to do so, and poor data quality is endemic to publishing.  Sources have to be corroborated, and we can make it easier to corroborate by building that into the vocabulary.

Best Regards,

Steve
          
          
          
                                 Annette Greiner --- Re: DQV - metrics related to the completeness dimension ---
                                  From: "Annette Greiner" <amgreiner@lbl.gov>
                                  To: "Steven Adler" <adler1@us.ibm.com>
                                  Cc: "Nandana Mihindukulasooriya" <nmihindu@fi.upm.es>, "Debattista, Jeremy" <Jeremy.Debattista@iais.fraunhofer.de>, "Data on the Web Best Practices Working Group" <public-dwbp-wg@w3.org>, "Makx Dekkers" <mail@makxdekkers.com>
                                  Date: Wed, Sep 30, 2015 1:29 PM
                                  Subject: Re: DQV - metrics related to the completeness dimension
            Why do you insist on this? My primary interest in this group is on behalf of scientists. I think they would welcome a way to express what they see as the completeness of a dataset to their colleagues. -Annette
                                       
                                          On Sep 30, 2015, at 6:05 AM, Steven Adler wrote:
                                          > I want to say emphatically that we are not dealing with scientists publishing papers and making scientific statements about completeness of data sources. We are talking about organizations with financial interests in asserting their point of view when they publish data. We must insist that one assertion of quality is never enough.

Received on Wednesday, 30 September 2015 20:54:33 UTC