
Re: Help with Data Quality example

From: Riccardo Albertoni <riccardo.albertoni@ge.imati.cnr.it>
Date: Tue, 10 May 2016 15:11:14 +0200
Message-ID: <CAOHhXmQF5VcNm=bLKdGU3aRtNwHmMj3cW6PAQB-aEaj=1gd=HQ@mail.gmail.com>
To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Cc: Antoine Isaac <aisaac@few.vu.nl>, "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Hi Bernadette,
I have just submitted the following pull request on GitHub:
https://github.com/w3c/dwbp/pull/390
I did not merge it, as I prefer to leave the last word on it to the
editors :)

With this pull request I have made some corrections to the DQV example
included in the BP document, and I have re-added the related RDFa to the
human-readable HTML example.
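
For illustration only, the DQV structure behind the example (each value
belongs to a measurement of a metric, and each metric sits in one dimension)
can be sketched in Python as below. The class and function names are
hypothetical and are not part of DQV or the BP document; the dimensions,
metrics and values are the ones proposed in the quoted message of 4 May
further down, and the comments name the DQV terms each class loosely
corresponds to.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Dimension:
    # Loosely corresponds to a dqv:Dimension.
    name: str
    definition: str


@dataclass(frozen=True)
class Metric:
    # Loosely corresponds to a dqv:Metric; `dimension` plays the role
    # of dqv:inDimension. Several metrics may share one dimension.
    description: str
    dimension: Dimension


@dataclass(frozen=True)
class Measurement:
    # Loosely corresponds to a dqv:QualityMeasurement:
    # `metric` ~ dqv:isMeasurementOf, `computed_on` ~ dqv:computedOn,
    # `value` ~ dqv:value.
    metric: Metric
    computed_on: str
    value: Union[bool, float]


def table_rows(
    measurements: List[Measurement],
) -> List[Tuple[str, str, Union[bool, float]]]:
    # The dimension is reached through the metric; the value is never
    # attached to the dimension directly.
    return [
        (m.metric.dimension.name, m.metric.description, m.value)
        for m in measurements
    ]


availability = Dimension(
    "Availability",
    "The extent to which data (or some portion of it) is present, "
    "obtainable and ready for use.",
)
completeness = Dimension(
    "Completeness",
    "The degree to which all required information is present in a "
    "particular dataset.",
)

availability_metric = Metric(
    "dcat:downloadURL is available and its value is dereferenceable",
    availability,
)
completeness_metric = Metric(
    "Ratio between the number of objects represented in the CSV and the "
    "number of objects expected according to the declared dataset scope",
    completeness,
)

# Identifier of the CSV distribution as used in the BP document example.
csv_distribution = ":stops-2015-05-05.csv"

measurements = [
    Measurement(availability_metric, csv_distribution, True),
    Measurement(completeness_metric, csv_distribution, 0.5),
]

for dimension_name, metric_description, value in table_rows(measurements):
    print(f"{dimension_name} | {metric_description} | {value}")
```

Keeping the dimension on the metric rather than on the measurement is what
allows a second metric to be added for the same dimension without touching
the existing values.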

I have realized that the DQV example in the BP document uses different URIs
than the Turtle generated from the RDFa (e.g., :stops-2015-05-05.csv is used
in the BP document, whilst #timetable-001-CSV is used in the RDFa-generated
Turtle). If it does not require too many changes, I suggest making the URIs
used in the two more consistent.

Cheers,
Riccardo



On 5 May 2016 at 15:21, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:

> It's ok Riccardo!
>
> We're gonna try to fix this. If we can't then you can do it for next week.
>
> Thanks!
> Berna
>
> 2016-05-05 8:40 GMT-03:00 Riccardo Albertoni <
> riccardo.albertoni@ge.imati.cnr.it>:
>
>> Hi Berna,
>> I don't mind including them again, but I am afraid I won't be able to do
>> it before next week.
>> I hope that is not a big problem.
>> Cheers,
>> Riccardo
>>
>>
>>
>> On 4 May 2016 at 23:43, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>>
>>> Hi Riccardo,
>>>
>>> I'm sorry but I had problems using the RDFa annotations :(
>>>
>>> Would you mind including them again?
>>>
>>> Thanks a lot!
>>> Berna
>>>
>>> 2016-05-04 17:59 GMT-03:00 Riccardo Albertoni <
>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>
>>>> Hi Bernadette,
>>>>
>>>> It looks good to me.
>>>> Just a minor remark: in the previous version of the human-readable
>>>> example page, I had inserted some RDFa annotations for the DQV statements,
>>>> which have disappeared in the current version.
>>>> The DQV-related triples are no longer distilled by the RDFa
>>>> 1.1 validator (see [2]). Not a real problem, but I am mentioning it since
>>>> I am not sure whether that was intended or is an unexpected side
>>>> effect of the latest page changes.
>>>>
>>>> Cheers,
>>>> Riccardo
>>>>
>>>>
>>>> [2]
>>>> http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fw3c.github.io%2Fdwbp%2Fdwbp-example.html&rdfa_lite=false&vocab_expansion=false&embedded_rdf=true&validate=yes&space_preserve=true&vocab_cache_report=false&vocab_cache_bypass=false
>>>>
>>>> On 4 May 2016 at 20:55, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>> wrote:
>>>>
>>>>> Hi Riccardo,
>>>>>
>>>>> Thanks a lot for your feedback! In this case, for one dimension, we
>>>>> can use different measures for different distributions/datasets. Is that
>>>>> right?
>>>>>
>>>>> Then, the only information that could be general is the definition of
>>>>> the dimensions. As we won't have a separate document to describe the
>>>>> dimensions, I think we can keep the definitions together with the measures
>>>>> and values.
>>>>>
>>>>> Doing this[1], the only change is the inclusion of the data quality
>>>>> information as part of the CSV distribution rather than as a separate
>>>>> section. Could you please take a look and tell me if this is ok?
>>>>>
>>>>> Thanks!
>>>>> Berna
>>>>>
>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>
>>>>>
>>>>>
>>>>> 2016-05-04 12:58 GMT-03:00 Riccardo Albertoni <
>>>>> riccardo.albertoni@ge.imati.cnr.it>:
>>>>>
>>>>>> Dear BP Editors,
>>>>>>
>>>>>> Unfortunately, the values should be associated with metrics, not
>>>>>> dimensions, as you can have more than one metric for the same dimension.
>>>>>>
>>>>>> I would consider changing the human-readable example as
>>>>>> follows:
>>>>>>
>>>>>> a) In the section "Data Quality values", you can add the metric
>>>>>> description, for example:
>>>>>>
>>>>>> (Measured ??) Dimension | (Deployed ??) Metric                         | Value
>>>>>> ------------------------|----------------------------------------------|---------------
>>>>>> Availability            | dcat:downloadURL is available and its value  | True (boolean)
>>>>>>                         | is dereferenceable                           |
>>>>>> Completeness            | Ratio between the number of objects          | 0.5 (double)
>>>>>>                         | represented in the CSV and the number of     |
>>>>>>                         | objects expected to be represented according |
>>>>>>                         | to the declared dataset scope                |
>>>>>>
>>>>>> b) In the section "Quality Dimensions and Metrics", you can change the
>>>>>> section title to "Quality Dimensions" and delete the metric
>>>>>> column from the table. The result should be something like:
>>>>>>
>>>>>> Dimension    | Definition
>>>>>> -------------|-------------------------------------------------------
>>>>>> Completeness | Refers to the degree to which all required information
>>>>>>              | is present in a particular dataset.
>>>>>> Availability | Availability of a dataset is the extent to which data
>>>>>>              | (or some portion of it) is present, obtainable and
>>>>>>              | ready for use.
>>>>>>
>>>>>>
>>>>>> The above solution is perhaps less appealing, but it is closer to
>>>>>> the Data Quality Vocabulary model.
>>>>>> Does it work for you?
>>>>>>
>>>>>> Cheers,
>>>>>> Riccardo
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 4 May 2016 at 17:17, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>>> wrote:
>>>>>>
>>>>>>> Dear DQV editors,
>>>>>>>
>>>>>>> We are making the final updates to the DWBP document and we'd like
>>>>>>> to ask for your help with the human-readable version of the data quality
>>>>>>> metadata [1].
>>>>>>>
>>>>>>> We made some changes to the HTML page with the human-readable
>>>>>>> metadata, and we changed the way the data quality metadata is
>>>>>>> presented. Now the values of the data quality dimensions are part of the
>>>>>>> description of the CSV distribution. Please take a look and tell us if
>>>>>>> this makes sense.
>>>>>>>
>>>>>>> We also noticed that the description of dimensions and metrics is
>>>>>>> general and could perhaps be on another page rather than on the dataset page.
>>>>>>> Does that make sense to you, or do you think it's better to keep these
>>>>>>> descriptions together with the dataset description?
>>>>>>>
>>>>>>> Looking forward to your feedback!
>>>>>>>
>>>>>>> Thanks,
>>>>>>> BP Editors
>>>>>>>
>>>>>>> [1] http://w3c.github.io/dwbp/dwbp-example.html
>>>>>>>
>>>>>>> --
>>>>>>> Bernadette Farias Lóscio
>>>>>>> Centro de Informática
>>>>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ----------------------------------------------------------------------------
>>>>>> Riccardo Albertoni
>>>>>> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
>>>>>> Magenes"
>>>>>> Consiglio Nazionale delle Ricerche
>>>>>> via de Marini 6 - 16149 GENOVA - ITALIA
>>>>>> tel. +39-010-6475624 - fax +39-010-6475660
>>>>>> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
>>>>>> Skype: callto://riccardoalbertoni/
>>>>>> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
>>>>>> www: http://www.imati.cnr.it/ <http://www.ge.imati.cnr.it/Albertoni>
>>>>>> http://purl.oclc.org/NET/riccardoAlbertoni
>>>>>> FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
>>>>>>
>>>>>> ----------------------------------------------------------------------------
>>>>>>



-- 
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/ <http://www.ge.imati.cnr.it/Albertoni>
http://purl.oclc.org/NET/riccardoAlbertoni
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
----------------------------------------------------------------------------
Received on Tuesday, 10 May 2016 13:11:46 UTC
