- From: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
- Date: Mon, 4 May 2015 19:01:15 +0200
- To: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Antoine Isaac <aisaac@few.vu.nl>
- Cc: Phil Archer <phila@w3.org>, "Deirdre Lee (Derilinx)" <deirdre@derilinx.com>, "Debattista, Jeremy" <jeremy.debattista@iais-extern.fraunhofer.de>, DWBP Public List <public-dwbp-wg@w3.org>
- Message-ID: <CAOHhXmSkUgO07WCTCQaj8n6X=weOQytW9u0JhdGbBDsHsDzWCg@mail.gmail.com>
Dear Data Quality Vocabulary editors,

you can find a new version of the data quality concept schema in the
wiki [1]. It includes the changes I have made in response to
Christophe's feedback:

- renaming of the property "dqv:refersToDcatDataset" to
  "dqv:refersToDataset";
- daq:qualityGraph is now a direct specialisation of our
  dqv:CollectionOfMetricsResults (I have opened an issue [2] to remind
  us that we should discuss whether dataset metrics are a kind of
  quality info and how to include them in the DQV);
- added dqv:Feedback as a possible kind of dqv:QualityInfo (I have
  opened an issue to figure out the relation between dqv:Feedback and
  duv:Feedback [3]).

The conceptual schema is available as a Google Drawing [4].

I guess there is still a lot to discuss. I have reported in the wiki
part of the discussion we had with Christophe, hoping this might
encourage other people to chime in :-)

Regards,
Riccardo

[1] https://www.w3.org/2013/dwbp/wiki/Data_Quality_Vocabulary_(DQV)
[2] https://www.w3.org/2013/dwbp/track/issues/164
[3] https://www.w3.org/2013/dwbp/track/issues/165
[4] http://goo.gl/277o2R

On 25 April 2015 at 13:22, Riccardo Albertoni <albertoni@ge.imati.cnr.it> wrote:

> Hi Christophe,
> Thanks a lot for your feedback, I am happy that the discussion is
> progressing.
>
> I think Jeremy is going to provide some feedback soon, so it is
> probably better to modify the schema [2] "all at once" after we get
> Jeremy's feedback.
> The DUV vocabulary is using google graph, which we might consider as a
> valid alternative to creately in the next DQV schema release.
>
> Regarding the comments you made in [3]:
>
> - I agree on changing "dqv:refersToDcatDataset" to
> "dqv:refersToDataset";
>
> - regarding your <<I think the class inheritance arrows for
> dqv:CollectionOfMetricsResults and dqv:Standard should be inversed.>>
> I am not sure I understand the rationale behind this suggestion, could
> you explain more?
>
> - regarding your <<Besides, I don't see why the collection of metrics
> has to be a qb:Dataset>>
> Ok, I think we can have daq:qualityGraph as a direct specialisation of
> our dqv:CollectionOfMetricsResults.
> However, I suggest opening an issue, which we can probably address
> after the DQV FPWD is released, saying "assuming statistics about
> datasets are considered a kind of quality information, as indicated in
> the UC document (see use case bio2rdf [4]), are the statistics
> mentioned in that use case properly modelled in the DQV?"
> Do you agree?
>
> A couple of quick inline responses to the comments from your very last
> email:
>
>> # dqv:QualityInfo and its specialisation
>>
>> * is dqv:QualityInfo the most appropriate name? would
>> dqv:QualityMetadata be more appropriate?
>> Considering that in the BPs document we speak about meta-data, maybe
>> Metadata is the best. But on the other hand I'm not keen on
>> hardcoding the notion of metadata in a vocabulary. Going for
>> "QualityData" would be a middle way solution
>
> mmm... I have to admit that I am not very comfortable calling it
> QualityData either. I mean, QualityData is a possible middle way
> solution, but dqv:QualityInfo/QualityData is currently a superclass of
> dqv:ServiceLevelAgreement and dqv:Standard, which are not proper data.
> For this reason, I would prefer dqv:QualityInfo rather than
> QualityData.
>
>> * are there other kinds of quality info that are worth considering?
>> e.g., opinions, reports of known issues, ...
>> That would be a good opportunity to link to DUV if we assume that
>> usage correlates positively with dataset quality.
>
> Sure, I have noticed that in DUV [5] there is a class duv:Feedback
> which seems to be a good candidate to refer to.
> In the current version of DUV, duv:Feedback is related to
> dqg:QualityCriteria. I am not sure what exactly dqg:QualityCriteria
> stands for, but I guess it is supposed to be a class in our DQV, and
> it might roughly correspond to our dqv:QualityInfo (or whatever
> dqv:QualityInfo will be called at the end of our discussion :) )
>
> So once we have a proper name for our dqv:QualityInfo, and we have
> accomplished our first internal revision of the DQV, we can open an
> issue "let's find an agreement on references between DQV and DUV",
> which should be discussed with the rest of the group, and we might
> suggest solving the issue by:
> - putting duv:Feedback as one of the possible specialisations of
> dqv:QualityInfo in both the DQV and DUV schemas;
> - deleting dqv:QualityInfo and the relation "describes" in DUV.
>
> Does this sound reasonable?
>
>> * how can the service level agreement be represented? is it a
>> document on the web to refer to, or do we want to refer to something
>> more structured? is there any specific property we should add to
>> dqv:ServiceLevelAgreement?
>> According to [1] "An SLA is best described as a collection of
>> promises". It is also a document which just lists a couple of things.
>> We could either focus on the document aspect as a whole or try to
>> model the list of promises and the list of related concepts. My gut
>> feeling is that this could lead to writing an elaborate vocabulary
>> that would go beyond our scope, so I'd say we should rather not do
>> that. But we should nonetheless anticipate that someone may some day
>> want to work on that.
>> What about then either not indicating any range or setting Resource
>> as the range? Then everyone is free to model an SLA as they want. And
>> in the BPs we could hint that structured data is better and a PDF is
>> also ok.
>
> Very good point! I agree on leaving it open to further modelling of
> promises by explicitly mentioning this possibility, it sounds like a
> very good idea.
> Concerning your proposal of either not indicating any range or setting
> Resource as the range, I think we should at least suggest one concrete
> way to include human-readable SLA descriptions, so that people don't
> have the chance to be too creative when attaching their HTML, PDF or
> whatever to the quality-related metadata.
>
>> * how are standards represented under dqv:Standard? is the class
>> dqv:Standard suitable to include ODI certificates, to represent that
>> a DCAT dataset has a certain compliance with the 5 LOD stars, and
>> other kinds of best practices? should we explicitly provide a
>> list/taxonomy of standards/certificates to consider? is the class
>> dqv:Standard really necessary, or can we rely directly on
>> dcterms:Standard?
>> We should be flexible here. If we impose a specific class then ODI
>> certificates and 5-star models will have to subclass from it, so we
>> should make it generic enough conceptually so that this works. In
>> that respect I think dcterms:Standard is quite nice, so we may want
>> to re-use it. Or subclass from it our own Standard, which is a
>> verbatim copy of it, in case some day the meaning of dcterms:Standard
>> changes in a way that breaks our vocabulary.
>
> I see your point. I agree we have to be flexible and generic enough,
> but I am not sure I understand what you are suggesting here.
>
>> * can we assume the following constraint?
>> x a dcat:Dataset. x dcterms:conformsTo y '''imply''' x hasQualityInfo
>> y. y dqv:Standard
>> Think so. We should have more of these BTW :-)
>
> Sure, I think other constraints will come out quite naturally during
> the design process.
> I am not very concerned about it; by the way, I'd like anyone working
> on DQV to feel free to suggest some ;)
>
> Have a nice weekend,
> Riccardo
>
>> Cheers,
>> Christophe
>>
>> [1] http://www.knowledgetransfer.net/dictionary/ITIL/en/Service_Level_Agreement.htm
>>
>> --
>> Christophe Guéret
>
> [2] https://creately.com/diagram/i8lgl90p1/AXwUzXKQOHvEwUw9eJmxrndw%3D
> [3] https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Apr/0209.html
> [4] http://w3c.github.io/dwbp/usecasesv1.html#UC-Bio2RDF
> [5] https://docs.google.com/drawings/d/1aq3vPcoj0SPs5BispD6umQNejrBTwkhsSYu6Y1adUjw/edit

--
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.ge.imati.cnr.it/Albertoni
http://purl.oclc.org/NET/riccardoAlbertoni
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
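
The constraint discussed in the thread (a dcat:Dataset that dcterms:conformsTo y implies a quality-info link to y, with y being a dqv:Standard) can be sketched as a small forward-chaining rule over plain triples. This is only an illustration under assumptions: the thread writes "hasQualityInfo" without a prefix and "y dqv:Standard" without a predicate, so the names dqv:hasQualityInfo and the reading "y rdf:type dqv:Standard" are guesses, and the ex: resources are hypothetical. No RDF library is used.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"


def apply_conformance_rule(triples):
    """Return the input triples plus those entailed by the proposed constraint.

    Rule (as read from the thread, with assumed property names):
      x rdf:type dcat:Dataset . x dcterms:conformsTo y
      => x dqv:hasQualityInfo y . y rdf:type dqv:Standard
    """
    inferred = set(triples)
    # Collect every subject explicitly typed as a dcat:Dataset.
    datasets = {s for (s, p, o) in triples
                if p == RDF_TYPE and o == "dcat:Dataset"}
    for (s, p, o) in triples:
        if p == "dcterms:conformsTo" and s in datasets:
            inferred.add((s, "dqv:hasQualityInfo", o))   # assumed property name
            inferred.add((o, RDF_TYPE, "dqv:Standard"))  # assumed reading
    return inferred


# Hypothetical example data.
graph = {
    ("ex:myDataset", RDF_TYPE, "dcat:Dataset"),
    ("ex:myDataset", "dcterms:conformsTo", "ex:fiveStarLOD"),
}
result = apply_conformance_rule(graph)
```

Subjects that are not typed as dcat:Dataset are left untouched, which matches the constraint being stated only for DCAT datasets.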
Received on Monday, 4 May 2015 17:01:41 UTC