
Re: Help with Data Quality example

From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Date: Thu, 12 May 2016 14:34:06 -0300
Message-ID: <CANx1PzxamCEye-Whk_Cv_pR34ZysbOw6dZwTYPenSDL3WvFYbw@mail.gmail.com>
To: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>
Cc: Newton Calegari <newton@nic.br>, Caroline Burle <cburle@nic.br>, "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Hi Riccardo,

Thanks a lot for updating the example and for pointing out the error in the
ttl file. We're gonna fix this ;)

I just made the merge!

cheers,
Berna

2016-05-12 12:25 GMT-03:00 Riccardo Albertoni <
riccardo.albertoni@ge.imati.cnr.it>:

> Dear BP Editors,
> As mentioned in the previous mail, I have made some corrections to the DQV
> example included in the BP document and reinserted the RDFa annotations in
> the human-readable example page.
>
> I have also made some very minor changes to the DQV draft. However, because
> I committed them after the changes to BP, I apparently cannot push the DQV
> draft changes without also pushing the BP changes.
>
> For this reason, I would appreciate it if you could accept my changes
> (https://github.com/w3c/dwbp/pull/390) before tomorrow's vote.
>
> Thanks a lot,
> Riccardo
>
> On 10 May 2016 at 15:11, Riccardo Albertoni <
> riccardo.albertoni@ge.imati.cnr.it> wrote:
>
>> Hi Bernadette,
>> I have just submitted the following pull request on GitHub:
>> https://github.com/w3c/dwbp/pull/390
>> I did not merge it, as I prefer to leave the last word on it to the
>> editors :)
>>
>> With this pull request I have made some corrections to the DQV example
>> included in the BP doc, and I have re-added the related RDFa to the
>> human-readable HTML example.
>>
>> I have realized that the DQV example in the BP document uses different URIs
>> than the Turtle generated from the RDFa (e.g., :stops-2015-05-05.csv is used
>> in the BP document, whilst #timetable-001-CSV is used in the RDFa-generated
>> Turtle). If it does not require too many changes, I suggest making the URIs
>> used in the two more consistent.
>>
>> Cheers,
>> Riccardo
>>
>>
>>
>> On 5 May 2016 at 15:21, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>>
>>> It's ok Riccardo!
>>>
>>> We're gonna try to fix this. If we can't then you can do it for next
>>> week.
>>>
>>> Thanks!
>>> Berna
>>>
>>> 2016-05-05 8:40 GMT-03:00 Riccardo Albertoni <
>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>
>>>> Hi Berna,
>>>> I don't mind including them again, but I am afraid I won't be able to
>>>> do it before next week.
>>>> I hope that is not a big problem.
>>>> Cheers,
>>>> Riccardo
>>>>
>>>>
>>>>
>>>> On 4 May 2016 at 23:43, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>> wrote:
>>>>
>>>>> Hi Riccardo,
>>>>>
>>>>> I'm sorry but I had problems using the RDFa annotations :(
>>>>>
>>>>> Would you mind including them again?
>>>>>
>>>>> Thanks a lot!
>>>>> Berna
>>>>>
>>>>> 2016-05-04 17:59 GMT-03:00 Riccardo Albertoni <
>>>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>>>
>>>>>> Hi Bernadette,
>>>>>>
>>>>>> It looks good to me.
>>>>>> Just a minor remark: in the previous version of the human-readable
>>>>>> example page, I had inserted some RDFa annotations for the DQV
>>>>>> statements, and they have disappeared in the current version.
>>>>>> The DQV-related triples are no longer distilled by the RDFa 1.1
>>>>>> validator (see [2]). It is not a real problem, but I am mentioning it
>>>>>> since I am not sure whether that was intended or is an unexpected side
>>>>>> effect of the latest page changes.
>>>>>>
>>>>>> Cheers,
>>>>>> Riccardo
>>>>>>
>>>>>>
>>>>>> [2]
>>>>>> http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fw3c.github.io%2Fdwbp%2Fdwbp-example.html&rdfa_lite=false&vocab_expansion=false&embedded_rdf=true&validate=yes&space_preserve=true&vocab_cache_report=false&vocab_cache_bypass=false
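For readers who want to repeat the check in [2] on another page: the distiller link is a plain GET request whose query string carries the page URI and a few options. The following is a minimal Python sketch, not part of the original message, that rebuilds the link with the standard library. The endpoint, parameter names and values are copied from [2]; the helper name `distiller_url` is an assumption made for illustration.

```python
from urllib.parse import urlencode

# Endpoint copied from link [2] in the message above.
PYRDFA_EXTRACT = "http://www.w3.org/2012/pyRdfa/extract"


def distiller_url(page_uri: str) -> str:
    """Build the RDFa distiller link for a page (hypothetical helper).

    The options mirror the ones that appear in link [2]; they are not
    claimed to be the only or the recommended settings.
    """
    params = {
        "uri": page_uri,
        "rdfa_lite": "false",
        "vocab_expansion": "false",
        "embedded_rdf": "true",
        "validate": "yes",
        "space_preserve": "true",
        "vocab_cache_report": "false",
        "vocab_cache_bypass": "false",
    }
    # urlencode percent-encodes ':' and '/' in the page URI.
    return f"{PYRDFA_EXTRACT}?{urlencode(params)}"


url = distiller_url("http://w3c.github.io/dwbp/dwbp-example.html")
print(url)
```

Opening the printed link in a browser should show the triples extracted from the example page, which is how the missing DQV statements were noticed.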
>>>>>>
>>>>>> On 4 May 2016 at 20:55, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Riccardo,
>>>>>>>
>>>>>>> Thanks a lot for your feedback! In this case, for one dimension, we
>>>>>>> can use different measures for different distributions/datasets. Is that
>>>>>>> right?
>>>>>>>
>>>>>>> Then, the only information that could be general is the definition
>>>>>>> of the dimensions. As we won't have a separate document to describe the
>>>>>>> dimensions, I think we can keep the definitions together with the measures
>>>>>>> and values.
>>>>>>>
>>>>>>> Doing this[1], the only change is the inclusion of the data quality
>>>>>>> information as part of the CSV distribution rather than as a separate
>>>>>>> section. Could you please take a look and tell me if this is ok?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Berna
>>>>>>>
>>>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-05-04 12:58 GMT-03:00 Riccardo Albertoni <
>>>>>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>>>>>
>>>>>>>> Dear BP Editors,
>>>>>>>>
>>>>>>>> Unfortunately, the values should be associated with metrics, not
>>>>>>>> dimensions, as you can have more than one metric for the same dimension.
>>>>>>>>
>>>>>>>> I would consider changing the human-readable example as follows:
>>>>>>>>
>>>>>>>> a) In the section "Data Quality values", you can add the metric
>>>>>>>> description, for example:
>>>>>>>>
>>>>>>>> (Measured ??) Dimension | (Deployed ??) Metric | Value
>>>>>>>> ------------------------|----------------------|------
>>>>>>>> Availability | dcat:downloadURL is available and its value is dereferenceable | True (boolean)
>>>>>>>> Completeness | Ratio between the number of objects represented in the CSV and the number of objects expected to be represented according to the declared dataset scope | 0.5 (double)
>>>>>>>>
>>>>>>>> b) In the section "Quality Dimensions and Metrics", you can change the
>>>>>>>> section title to "Quality Dimensions" and delete the metric column in
>>>>>>>> the table. It should result in something like:
>>>>>>>>
>>>>>>>> Dimension | Definition
>>>>>>>> ----------|-----------
>>>>>>>> Completeness | Refers to the degree to which all required information is present in a particular dataset.
>>>>>>>> Availability | Availability of a dataset is the extent to which data (or some portion of it) is present, obtainable and ready for use.
>>>>>>>>
>>>>>>>>
>>>>>>>> The above solution is perhaps less  appealing, but it is closer to
>>>>>>>> the data quality vocabulary model.
>>>>>>>> Does it work for you?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Riccardo
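The point made in the message above, that values belong to metrics and each metric belongs to a dimension, can be sketched in a few lines of Python. This is an illustrative sketch and not the DQV specification or the editors' code: the class and field names are assumptions that loosely mirror the DQV terms dqv:Dimension, dqv:Metric, dqv:QualityMeasurement, dqv:inDimension, dqv:isMeasurementOf, dqv:value and dqv:computedOn. The metric descriptions, definitions and values are taken from the two tables proposed in the message; the label "CSV distribution" is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Dimension:
    name: str
    definition: str


@dataclass(frozen=True)
class Metric:
    description: str
    in_dimension: Dimension  # loosely mirrors dqv:inDimension


@dataclass(frozen=True)
class Measurement:
    metric: Metric               # loosely mirrors dqv:isMeasurementOf
    value: Union[bool, float]    # loosely mirrors dqv:value
    computed_on: str             # loosely mirrors dqv:computedOn


availability = Dimension(
    "Availability",
    "The extent to which data (or some portion of it) is present, "
    "obtainable and ready for use.",
)
completeness = Dimension(
    "Completeness",
    "The degree to which all required information is present in a "
    "particular dataset.",
)

download_url_metric = Metric(
    "dcat:downloadURL is available and its value is dereferenceable",
    availability,
)
coverage_ratio_metric = Metric(
    "Ratio between the number of objects represented in the CSV and the "
    "number of objects expected according to the declared dataset scope",
    completeness,
)

# Values hang off metrics; the dimension is reached through the metric.
measurements = [
    Measurement(download_url_metric, True, "CSV distribution"),
    Measurement(coverage_ratio_metric, 0.5, "CSV distribution"),
]


def value_rows(items: List[Measurement]) -> List[Tuple[str, str, Union[bool, float]]]:
    """Rows for a 'Data Quality values' table: dimension, metric, value."""
    return [(m.metric.in_dimension.name, m.metric.description, m.value)
            for m in items]


for dimension, metric, value in value_rows(measurements):
    print(f"{dimension} | {metric} | {value}")
```

Because a measurement points at a metric rather than a dimension, a second metric for the same dimension can be added later without changing the existing values.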
>>>>>>>>
>>>>>>>> On 4 May 2016 at 17:17, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Dear DQV editors,
>>>>>>>>>
>>>>>>>>> We are making the final updates to the DWBP document and we'd like
>>>>>>>>> to ask for your help with the human-readable version of the data quality
>>>>>>>>> metadata [1].
>>>>>>>>>
>>>>>>>>> We made some changes to the HTML page with the human-readable
>>>>>>>>> metadata and we changed the way the data quality metadata is
>>>>>>>>> presented. Now the values of the data quality dimensions are part of the
>>>>>>>>> description of the CSV distribution. Please take a look and tell us if
>>>>>>>>> this makes sense.
>>>>>>>>>
>>>>>>>>> We also noticed that the description of dimensions and metrics is
>>>>>>>>> general and maybe could be on another page rather than on the dataset page.
>>>>>>>>> Does that make sense to you, or do you think it's better to keep these
>>>>>>>>> descriptions together with the dataset description?
>>>>>>>>>
>>>>>>>>> Looking forward to your feedback!
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> BP Editors
>>>>>>>>>
>>>>>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Bernadette Farias Lóscio
>>>>>>>>> Centro de Informática
>>>>>>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ----------------------------------------------------------------------------
>>>>>>>> Riccardo Albertoni
>>>>>>>> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
>>>>>>>> Magenes"
>>>>>>>> Consiglio Nazionale delle Ricerche
>>>>>>>> via de Marini 6 - 16149 GENOVA - ITALIA
>>>>>>>> tel. +39-010-6475624 - fax +39-010-6475660
>>>>>>>> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
>>>>>>>> Skype: callto://riccardoalbertoni/
>>>>>>>> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
>>>>>>>> www: http://www.imati.cnr.it/ <http://www.ge.imati.cnr.it/Albertoni>
>>>>>>>> http://purl.oclc.org/NET/riccardoAlbertoni
>>>>>>>> FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
>>>>>>>>
>>>>>>>> ----------------------------------------------------------------------------
>>>>>>>>



-- 
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------
Received on Thursday, 12 May 2016 17:34:56 UTC
