Re: More feedback on DQV from Riccardo Albertoni on 2015-05-04 (public-dwbp-wg@w3.org from May 2015)

From: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
Date: Mon, 4 May 2015 19:01:15 +0200
To: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Antoine Isaac <aisaac@few.vu.nl>
Cc: Phil Archer <phila@w3.org>, "Deirdre Lee (Derilinx)" <deirdre@derilinx.com>, "Debattista, Jeremy" <jeremy.debattista@iais-extern.fraunhofer.de>, DWBP Public List <public-dwbp-wg@w3.org>
Message-ID: <CAOHhXmSkUgO07WCTCQaj8n6X=weOQytW9u0JhdGbBDsHsDzWCg@mail.gmail.com>
Dear Data Quality Vocabulary editors,

you can find a new version of the data quality conceptual schema in the wiki [1].
It includes the changes I have made in response to Christophe's
feedback:

- renamed the property "dqv:refersToDcatDataset" to "dqv:refersToDataset";
- daq:qualityGraph is now a direct specialisation of our
dqv:CollectionOfMetricsResults
(I have opened an issue [2] as a reminder that we should discuss whether dataset
metrics are a kind of quality info and how to include them in the DQV);
- added dqv:Feedback as a possible kind of dqv:QualityInfo (I have opened
an issue to work out the relation between dqv:Feedback and duv:Feedback
[3]).
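
As a rough illustration of the changes listed above, the affected part of the schema can be written down as plain subject/predicate/object triples. This is only a sketch under stated assumptions: the use of rdfs:subClassOf for "specialisation", the helper names superclasses and rename_properties, and the ex: identifiers in the usage lines are hypothetical and are not taken from the wiki or the drawing. The two subclass links for dqv:ServiceLevelAgreement and dqv:Standard come from the quoted discussion further down in this message.

```python
# Hypothetical sketch: the schema changes as plain (subject, predicate, object) triples.
# Assumption: "specialisation" is modelled with rdfs:subClassOf.
SUBCLASS_OF = "rdfs:subClassOf"

SCHEMA = {
    ("daq:qualityGraph", SUBCLASS_OF, "dqv:CollectionOfMetricsResults"),
    ("dqv:Feedback", SUBCLASS_OF, "dqv:QualityInfo"),
    ("dqv:ServiceLevelAgreement", SUBCLASS_OF, "dqv:QualityInfo"),
    ("dqv:Standard", SUBCLASS_OF, "dqv:QualityInfo"),
}

# Property rename agreed in the discussion: old name -> new name.
RENAMED = {"dqv:refersToDcatDataset": "dqv:refersToDataset"}


def superclasses(cls, schema=SCHEMA):
    """Return every direct or indirect superclass of cls found in schema."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in schema:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def rename_properties(triples, renamed=RENAMED):
    """Rewrite predicates of existing data that still use an old property name."""
    return {(s, renamed.get(p, p), o) for s, p, o in triples}


# Usage with made-up ex: identifiers.
old_data = {("ex:report", "dqv:refersToDcatDataset", "ex:dataset")}
new_data = rename_properties(old_data)
feedback_parents = superclasses("dqv:Feedback")
```

Keeping the triples in a plain set makes it easy to compare this sketch with the drawing and to adjust it when the open issues are resolved.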

The conceptual schema is available as a Google Drawing [4]. I guess there is
still a lot to discuss.
In the wiki I have reported part of the discussion we had with
Christophe, hoping this might encourage other people to chime in :-)

Regards,
Riccardo

[1] https://www.w3.org/2013/dwbp/wiki/Data_Quality_Vocabulary_(DQV)
[2] https://www.w3.org/2013/dwbp/track/issues/164
[3] https://www.w3.org/2013/dwbp/track/issues/165
[4] http://goo.gl/277o2R

On 25 April 2015 at 13:22, Riccardo Albertoni <albertoni@ge.imati.cnr.it>
wrote:

> Hi Christophe,
> Thanks a lot for your feedback, I am happy that the discussion is
> progressing.
>
> I think Jeremy is going to provide some feedback soon, so it is probably
> better to modify the schema [2] "all at once" after we get Jeremy's
> feedback.
> The DUV vocabulary is using Google Drawings, which we might consider as a
> valid alternative to Creately in the next DQV schema release.
>
> Regarding the comments you made in [3]:
>
> - I agree on changing "dqv:refersToDcatDataset" to "dqv:refersToDataset";
>
> - regarding your <<I think the class inheritance arrows for
> dqv:CollectionOfMetricsResults and dqv:Standard should be inversed.>>
> I am not sure I understand the rationale behind this suggestion, could
> you explain more?
>
> - regarding your <<Besides, I don't see why the collection of metrics has
> to be a qb:Dataset>>
> OK, I think we can have daq:qualityGraph as a direct specialisation of our
> dqv:CollectionOfMetricsResults.
> However, I suggest opening an issue that we can probably address after
> the DQV FPWD is released, saying "assuming statistics about datasets are
> considered a kind of quality information, as indicated in the UC document
> (see use case Bio2RDF [4]), are the statistics mentioned there properly
> modelled in the DQV?"
> Do you agree?
>
> A couple of quick inline responses to the comments from your very last
> email:
>
>
>> # dqv:QualityInfo and its specialisation
>>
>> * is dqv:QualityInfo the most appropriate name? would
>> dqv:QualityMetadata be more appropriate?
>> Considering that in the BPs document we speak about metadata, maybe
>> "Metadata" is the best. But on the other hand I'm not keen on hardcoding the
>> notion of metadata in a vocabulary. Going for "QualityData" would be a
>> middle-way solution
>>
> Mmm... I have to admit that I am not very comfortable calling it
> QualityData either. I mean, QualityData is a possible middle-way solution,
> but dqv:QualityInfo/QualityData is currently a superclass of dqv:ServiceLevelAgreement
> and dqv:Standard, which are not proper data. For this reason, I would
> prefer dqv:QualityInfo rather than QualityData.
>
>
>> * are there other kinds of quality info that are worth considering? e.g.,
>> opinions, reports of known issues, ...
>> That would be a good opportunity to link to DUV if we assume that usage
>> correlates positively with dataset quality.
>>
>
> Sure, I have noticed that in DUV [5] there is a class duv:Feedback which
> seems to be a good candidate to refer to.
> In the current version of DUV, duv:Feedback is related
> to dqg:QualityCriteria. I am not sure what exactly dqg:QualityCriteria
> stands for, but I guess it is supposed to be a class in our DQV, and it
> might roughly correspond to our dqv:QualityInfo (or whatever
> dqv:QualityInfo will be called at the end of our discussion :) )
>
> So once we have a proper name for our dqv:QualityInfo, and we have
> completed our first internal revision of the DQV, we can open an
> issue "let's find an agreement on references between DQV and DUV", which
> should be discussed with the rest of the group, and we might suggest
> resolving the issue by:
> - putting duv:Feedback as one of the possible specialisations of
> dqv:QualityInfo in both the DQV and DUV schemas;
> - deleting dqv:QualityInfo and the relation "describes" in DUV.
>
> Does this sound reasonable?
>
>
>> * how can the service level agreement be represented? is it a document on
>> the web to refer to, or do we want to refer to something more structured? is
>> there any specific property we should add to dqv:ServiceLevelAgreement?
>> According to [1] "An SLA is best described as a collection of promises".
>> It is also a document which just lists a couple of things. We could either
>> focus on the document aspect as a whole or try to model the list of
>> promises and the list of related concepts. My gut feeling is that this
>> could lead to writing an elaborate vocabulary that would fall outside our
>> scope, so I'd say we should rather not do that. But we should nonetheless
>> anticipate that someone may some day want to work on that.
>>
>> What about then either not indicating any range or setting Resource as the
>> range? Then everyone is free to model an SLA as they want. And in the BPs we
>> could hint that structured data is better and a PDF is also OK.
>>
> Very good point! I agree on leaving it open to further modelling of
> promises by explicitly mentioning this possibility; it sounds like a very
> good idea. Concerning your proposal to either not indicate any range or set
> Resource as the range, I think we should at least suggest a single concrete
> way to include human-readable SLA descriptions, so that people don't
> have the chance to be too creative in attaching their HTML, PDF or
> whatever to the quality-related metadata.
>
>
>> * how are standards represented under dqv:Standard? is the class
>> dqv:Standard suitable to include ODI certificates, to represent that a
>> DCAT dataset has a certain compliance with the 5 LOD stars, and other kinds of
>> best practices? should we explicitly provide a list/taxonomy of standards/
>> certificates to consider? is the class dqv:Standard really necessary, or can we
>> rely directly on dcterms:Standard?
>> We should be flexible here. If we impose a specific class then ODI
>> certificates and 5-star models will have to subclass from it, so we should
>> make it generic enough conceptually so that this works. In that respect I
>> think dcterms:Standard is quite nice, so we may want to re-use it. Or
>> subclass from it our own Standard, which is a verbatim copy of it, in case
>> some day the meaning of dcterms:Standard changes in a way that breaks our
>> vocabulary.
>>
> I see your point. I agree we have to be flexible and generic enough, but I
> am not sure I understand what you are suggesting here.
>
>
>> * can we assume the following constraint?
>>   x a dcat:Dataset. x dcterms:conformsTo y implies x hasQualityInfo
>> y. y a dqv:Standard
>> Think so. We should have more of these BTW :-)
>>
>
> Sure, I think other constraints will come out quite naturally during the
> design process. I am not very concerned about it. By the way, I'd like
> anyone working on DQV to feel free to suggest some ;)
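
The constraint quoted above can be read as a simple forward-chaining rule over triples. The following is a minimal sketch of that reading, not an agreed part of the DQV: the function name apply_conformance_rule, the "dqv:" prefix on hasQualityInfo (the constraint gives the property without a prefix) and the ex: identifiers are assumptions made only for illustration.

```python
# Hypothetical sketch of the constraint as an inference rule:
#   x a dcat:Dataset . x dcterms:conformsTo y
#   implies  x hasQualityInfo y . y a dqv:Standard
RDF_TYPE = "rdf:type"
HAS_QUALITY_INFO = "dqv:hasQualityInfo"  # assumed prefix, see the note above


def apply_conformance_rule(triples):
    """Return the input triples plus everything the rule entails."""
    triples = set(triples)
    # Subjects explicitly typed as dcat:Dataset.
    datasets = {s for s, p, o in triples
                if p == RDF_TYPE and o == "dcat:Dataset"}
    inferred = set()
    for s, p, o in triples:
        if p == "dcterms:conformsTo" and s in datasets:
            inferred.add((s, HAS_QUALITY_INFO, o))
            inferred.add((o, RDF_TYPE, "dqv:Standard"))
    return triples | inferred


# Usage with made-up ex: identifiers.
data = {
    ("ex:dataset", RDF_TYPE, "dcat:Dataset"),
    ("ex:dataset", "dcterms:conformsTo", "ex:someStandard"),
    ("ex:notADataset", "dcterms:conformsTo", "ex:otherStandard"),
}
closed = apply_conformance_rule(data)
```

The rule fires only for resources typed as dcat:Dataset, so ex:notADataset gets no quality info attached in this sketch.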
>
> Have a nice weekend,
> Riccardo
>
>
>>
>> Cheers,
>> Christophe
>>
>>
>>
>>
>> [1]
>> http://www.knowledgetransfer.net/dictionary/ITIL/en/Service_Level_Agreement.htm
>>
>>
>> --
>> Christophe Guéret
>>
>
>
>
> [2] https://creately.com/diagram/i8lgl90p1/AXwUzXKQOHvEwUw9eJmxrndw%3D
> [3] https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Apr/0209.html
> [4] http://w3c.github.io/dwbp/usecasesv1.html#UC-Bio2RDF
> [5]
> https://docs.google.com/drawings/d/1aq3vPcoj0SPs5BispD6umQNejrBTwkhsSYu6Y1adUjw/edit
>
> --
>
> ----------------------------------------------------------------------------
> Riccardo Albertoni
> Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
> Magenes"
> Consiglio Nazionale delle Ricerche
> via de Marini 6 - 16149 GENOVA - ITALIA
> tel. +39-010-6475624 - fax +39-010-6475660
> e-mail: Riccardo.Albertoni@ge.imati.cnr.it
> Skype: callto://riccardoalbertoni/
> LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
> www: http://www.ge.imati.cnr.it/Albertoni
> http://purl.oclc.org/NET/riccardoAlbertoni
> FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
>



Received on Monday, 4 May 2015 17:01:41 UTC