Re: Help with Data Quality example

From: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>
Date: Thu, 12 May 2016 17:25:14 +0200
Message-ID: <CAOHhXmSaB+B2RrXR10vJVL7X-cTTVXDraJxVRyaAFjc2yOb-zQ@mail.gmail.com>
To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>, Newton Calegari <newton@nic.br>, Caroline Burle <cburle@nic.br>
Cc: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Dear BP Editors,
As mentioned in my previous mail, I have made some corrections to the DQV
example included in the BP document, and reinserted the RDFa annotations in the
human-readable example page.

I have also made some very minor changes to the DQV draft, but since I
committed them after the changes made to BP, I apparently cannot push the
DQV draft changes without also pushing the BP changes.

For this reason, I would appreciate it if you could accept my
changes (https://github.com/w3c/dwbp/pull/390) before tomorrow's vote.

Thanks a lot,
Riccardo

On 10 May 2016 at 15:11, Riccardo Albertoni <
riccardo.albertoni@ge.imati.cnr.it> wrote:

> Hi Bernadette,
> I have just submitted the following pull request on GitHub:
> https://github.com/w3c/dwbp/pull/390
> I did not merge it, as I prefer to leave the last word on it to the
> editors :)
>
> With this pull request I have made some corrections to the DQV example
> included in the BP document, and I have re-added the related RDFa for the
> human-readable HTML example.
>
> I have realized that the DQV example in the BP document uses different URIs
> than the Turtle generated from the RDFa (e.g., :stops-2015-05-05.csv is used in
> the BP document, whilst #timetable-001-CSV is used in the RDFa-generated
> Turtle). If it does not require too many changes, I suggest making the
> URIs used in the two more consistent.
>
> Cheers,
> Riccardo
>
>
>
> On 5 May 2016 at 15:21, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>
>> It's ok Riccardo!
>>
>> We're gonna try to fix this. If we can't then you can do it for next week.
>>
>> Thanks!
>> Berna
>>
>> 2016-05-05 8:40 GMT-03:00 Riccardo Albertoni <
>> riccardo.albertoni@ge.imati.cnr.it>:
>>
>>> Hi Berna,
>>> I don't mind including them again, but I am afraid I won't be able to do
>>> it before next week.
>>> I hope that is not a big problem.
>>> Cheers,
>>> Riccardo
>>>
>>>
>>>
>>> On 4 May 2016 at 23:43, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>> wrote:
>>>
>>>> Hi Riccardo,
>>>>
>>>> I'm sorry but I had problems using the RDFa annotations :(
>>>>
>>>> Would you mind including them again?
>>>>
>>>> Thanks a lot!
>>>> Berna
>>>>
>>>> 2016-05-04 17:59 GMT-03:00 Riccardo Albertoni <
>>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>>
>>>>> Hi Bernadette,
>>>>>
>>>>> It looks good to me.
>>>>> Just a minor remark: in the previous version of the human-readable
>>>>> example page, I had inserted some RDFa annotations for the DQV statements,
>>>>> and these have disappeared in the current version.
>>>>> The DQV-related triples are no longer distilled by the
>>>>> RDFa 1.1 validator (see [2]). Not a real problem, but I am mentioning it
>>>>> since I am not sure whether that was intended or is an
>>>>> unexpected side effect of the latest page changes.
>>>>>
>>>>> Cheers,
>>>>> Riccardo
>>>>>
>>>>>
>>>>> [2]
>>>>> http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fw3c.github.io%2Fdwbp%2Fdwbp-example.html&rdfa_lite=false&vocab_expansion=false&embedded_rdf=true&validate=yes&space_preserve=true&vocab_cache_report=false&vocab_cache_bypass=false
>>>>>
>>>>> On 4 May 2016 at 20:55, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>> wrote:
>>>>>
>>>>>> Hi Riccardo,
>>>>>>
>>>>>> Thanks a lot for your feedback! In this case, for one dimension, we
>>>>>> can use different measures for different distributions/datasets. Is that
>>>>>> right?
>>>>>>
>>>>>> Then, the only information that could be general is the definition of
>>>>>> the dimensions. As we won't have a separate document to describe the
>>>>>> dimensions, I think we can keep the definitions together with the measures
>>>>>> and values.
>>>>>>
>>>>>> Doing this[1], the only change is the inclusion of the data quality
>>>>>> information as part of the CSV distribution rather than as a separate
>>>>>> section. Could you please take a look and tell me if this is ok?
>>>>>>
>>>>>> Thanks!
>>>>>> Berna
>>>>>>
>>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2016-05-04 12:58 GMT-03:00 Riccardo Albertoni <
>>>>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>>>>
>>>>>>> Dear BP Editors,
>>>>>>>
>>>>>>> Unfortunately, the values should be associated with metrics, not
>>>>>>> dimensions, as you can have more than one metric for the same dimension.
>>>>>>>
>>>>>>> I would consider changing the human-readable example as follows:
>>>>>>>
>>>>>>> a) In the section "Data Quality values", you can add the metric
>>>>>>> description, for example:
>>>>>>>
>>>>>>> (Measured ??) Dimension | (Deployed ??) Metric | Value
>>>>>>> ------------------------+----------------------+----------------
>>>>>>> Availability | Whether dcat:downloadURL is available and its value is dereferenceable | True (boolean)
>>>>>>> Completeness | Ratio between the number of objects represented in the CSV and the number of objects expected to be represented according to the declared dataset scope | 0.5 (double)
>>>>>>>
>>>>>>> b) In the section "Quality Dimensions and Metrics", you can change the
>>>>>>> section title to "Quality Dimensions" and delete the metric
>>>>>>> column from the table. It should result in something like:
>>>>>>>
>>>>>>> Dimension    | Definition
>>>>>>> -------------+------------------------------------------------
>>>>>>> Completeness | Refers to the degree to which all required information is present in a particular dataset.
>>>>>>> Availability | Availability of a dataset is the extent to which data (or some portion of it) is present, obtainable and ready for use.
>>>>>>>
>>>>>>>
>>>>>>> The above solution is perhaps less  appealing, but it is closer to
>>>>>>> the data quality vocabulary model.
>>>>>>> Does it work for you?
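
To make the metric-versus-dimension point above concrete, here is a minimal Python sketch. It is not DQV itself and is not taken from the BP document: the class names only loosely mirror dqv:Dimension, dqv:Metric and dqv:QualityMeasurement, and the metric identifiers downloadURLAvailability and csvCompleteness are hypothetical names chosen for illustration. The definitions, the values and the :stops-2015-05-05.csv identifier are the ones quoted in this thread.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Dimension:
    """A quality dimension: only a name and a general definition."""
    name: str
    definition: str


@dataclass(frozen=True)
class Metric:
    """A metric belongs to a dimension; a dimension may have many metrics."""
    name: str  # hypothetical identifier, for illustration only
    description: str
    dimension: Dimension


@dataclass(frozen=True)
class Measurement:
    """A value is attached to a metric, never directly to a dimension."""
    metric: Metric
    computed_on: str
    value: Union[bool, float]


availability = Dimension(
    "Availability",
    "The extent to which data (or some portion of it) is present, "
    "obtainable and ready for use.",
)
completeness = Dimension(
    "Completeness",
    "The degree to which all required information is present in a "
    "particular dataset.",
)

download_url_availability = Metric(
    "downloadURLAvailability",
    "Whether dcat:downloadURL is available and its value is dereferenceable.",
    availability,
)
csv_completeness = Metric(
    "csvCompleteness",
    "Ratio between the objects represented in the CSV and the objects "
    "expected according to the declared dataset scope.",
    completeness,
)

measurements = [
    Measurement(download_url_availability, ":stops-2015-05-05.csv", True),
    Measurement(csv_completeness, ":stops-2015-05-05.csv", 0.5),
]


def value_rows(
    items: List[Measurement],
) -> List[Tuple[str, str, Union[bool, float]]]:
    """Build the (dimension, metric, value) rows of the values table."""
    return [(m.metric.dimension.name, m.metric.name, m.value) for m in items]


for row in value_rows(measurements):
    print(row)
```

With this shape, a second completeness metric would simply be another Metric pointing at the same Dimension, and the dimension table would stay unchanged.
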
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Riccardo
>>>>>>>
>>>>>>> On 4 May 2016 at 17:17, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Dear DQV editors,
>>>>>>>>
>>>>>>>> We are making the final updates to the DWBP document and we'd like
>>>>>>>> to ask for your help with the human-readable version of the data quality
>>>>>>>> metadata [1].
>>>>>>>>
>>>>>>>> We made some changes to the HTML page with the human-readable
>>>>>>>> metadata and changed the way the data quality metadata is
>>>>>>>> presented. Now the values of the data quality dimensions are part of the
>>>>>>>> description of the CSV distribution. Please take a look and tell us if
>>>>>>>> this makes sense.
>>>>>>>>
>>>>>>>> We also noticed that the description of dimensions and metrics is
>>>>>>>> general and could perhaps go on another page rather than on the dataset page.
>>>>>>>> Does that make sense to you, or do you think it's better to keep these
>>>>>>>> descriptions together with the dataset description?
>>>>>>>>
>>>>>>>> Looking forward to your feedback!
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> BP Editors
>>>>>>>>
>>>>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>>>>
>>>>>>>> --
>>>>>>>> Bernadette Farias Lóscio
>>>>>>>> Centro de Informática
>>>>>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------
>>>>>>> Riccardo Albertoni
>>>>>>> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
>>>>>>> Magenes"
>>>>>>> Consiglio Nazionale delle Ricerche
>>>>>>> via de Marini 6 - 16149 GENOVA - ITALIA
>>>>>>> tel. +39-010-6475624 - fax +39-010-6475660
>>>>>>> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
>>>>>>> Skype: callto://riccardoalbertoni/
>>>>>>> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
>>>>>>> www: http://www.imati.cnr.it/ <http://www.ge.imati.cnr.it/Albertoni>
>>>>>>> http://purl.oclc.org/NET/riccardoAlbertoni
>>>>>>> FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------



-- 
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/ <http://www.ge.imati.cnr.it/Albertoni>
http://purl.oclc.org/NET/riccardoAlbertoni
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
----------------------------------------------------------------------------
Received on Thursday, 12 May 2016 15:25:47 UTC