
Re: New DQV editor's draft

From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
Date: Wed, 3 Jun 2015 01:00:33 +0100
Message-ID: <CABP9CAG2XOG8Y6LPRgspRDJSknMFSdmxH2nmNz=9gp-3f_AE+Q@mail.gmail.com>
To: Antoine Isaac <aisaac@few.vu.nl>
CC: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Hi Makx,

Good point indeed, but I think subclassing would not be necessary. In fact
we could even go without dqv:SLA and just suggest dct:conformsTo to
link a dataset or its distribution to, among other things, SLAs. The case
of ODI certificates is handled by the dqv:Certificate predicate and
matching classes. So we could have dct:conformsTo dct:Standard for all the
agreement-level objects (standards, SLAs, legal, ...) and
Feedback/Annotation/Certificate for external appreciation of the data or
its distribution (certificates, stars, comments, ...).
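
To make the typing idea from this thread concrete, here is a minimal sketch in plain Python (no RDF library). It assumes triples are simple tuples; the ex: resources (ex:dataset1, ex:sla1, ex:crs84) and the helper conforms_to_by_type are hypothetical names made up for illustration, and dqv:ServiceLevelAgreement is the class name used in the draft under discussion. It only shows that a consumer could sort the objects of dct:conformsTo by their rdf:type without a dedicated sub-property:

```python
from collections import defaultdict

# Triples as (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical examples, not taken from the draft.
TRIPLES = {
    ("ex:dataset1", "rdf:type", "dcat:Dataset"),
    ("ex:dataset1", "dct:conformsTo", "ex:sla1"),
    ("ex:dataset1", "dct:conformsTo", "ex:crs84"),
    ("ex:sla1", "rdf:type", "dqv:ServiceLevelAgreement"),
    ("ex:crs84", "rdf:type", "dct:Standard"),
}


def conforms_to_by_type(triples, dataset):
    """Group the objects of dct:conformsTo for `dataset` by their rdf:type."""
    # Everything the dataset is said to conform to.
    targets = {o for s, p, o in triples if s == dataset and p == "dct:conformsTo"}
    grouped = defaultdict(set)
    # Look up the declared type(s) of each target.
    for s, p, o in triples:
        if p == "rdf:type" and s in targets:
            grouped[o].add(s)
    return dict(grouped)


groups = conforms_to_by_type(TRIPLES, "ex:dataset1")
```

In this sketch, groups maps dqv:ServiceLevelAgreement to {ex:sla1} and dct:Standard to {ex:crs84}. An object published without any rdf:type would not appear in any group, so the approach relies on publishers typing the objects of dct:conformsTo.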

Christophe


On 2 June 2015 at 22:11, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi Makx,
>
> Thanks for the comment!
> I understand the concern. But I thought that requiring the publisher of
> SLA metadata to type the object of dcterms:conformsTo statements as
> an instance of dqv:ServiceLevelAgreement would be enough to distinguish the
> SLAs from other types of resources potentially appearing as object of
> dcterms:conformsTo.
> To me, adding a new sub-property of dcterms:conformsTo would be slightly
> redundant, even.
>
> What do you think?
> We have not discussed it with Christophe and Riccardo yet.
> Of course if you (and Ghislain) remain unconvinced then it's probably a
> sign that the current proposal is less effective than what I thought.
>
> Best,
>
> Antoine
>
> On 5/29/15 1:57 PM, Makx Dekkers wrote:
> > One comment: I see in the latest draft that the proposal is to use
> > dct:conformsTo linking to dqv:ServiceLevelAgreement, subclass of
> > dct:Standard.
> >
> > I'd like to warn against overloading dct:conformsTo. I have heard
> > suggestions in various groups to use dct:conformsTo for linking to an
> > ODI-style open data certificate, to a legal basis for the publication of a
> > Dataset, to temporal and spatial reference systems and to other types of
> > specifications that have relevance for the understanding of the Dataset. In
> > addition, conformsTo has also been suggested to link CatalogRecord to
> > implementation guidelines and to an Application Profile that the metadata is
> > based on. I am afraid that the processing of this information will not be
> > possible if all kinds of 'standards' are lumped together.
> >
> > Would it be better in this case to create a 'local' property dqv:hasSLA as a
> > subproperty of dct:conformsTo with range dqv:SLA that is a subclass of
> > dct:Standard? It is then clear what the relationship is and what it is used
> > for.
> >
> > Makx.
> >
> >
> >
> >> -----Original Message-----
> >> From: Antoine Isaac [mailto:aisaac@few.vu.nl]
> >> Sent: 29 May 2015 00:02
> >> To: Public DWBP WG
> >> Subject: Re: New DQV editor's draft
> >>
> >> Hi Jeremy,
> >>
> >> Thanks a lot!
> >>
> >> I have produced a new diagram with the Dataset out of the MetadataQuality
> >> graph (even though I'm still convinced in terms of RDF statements it doesn't
> >> make any difference). And added dqv:hasMetadataQuality as you suggested.
> >> It's at http://w3c.github.io/dwbp/vocab-dqg.html
> >> I hope I've captured your suggestions right.
> >>
> >> About dqv:QualityMetadata and daq:QualityGraph, you say:
> >> [
> >> That is why I suggested that dqv:QualityMetadata to be a subclass of the
> >> daq:QualityGraph instead of rdfg:Graph, because QualityMetadata will
> >> contain what a daq:QualityGraph should have + more information such as
> >> having dcterms:Standard, dqv:Feedback etc..
> >> ]
> >> In fact here we might have a different interpretation of the semantics of
> >> asserting sub-class links between types of graphs.
> >> daq:QualityGraph is defined by "Defines a quality graph which will contain all
> >> metadata about quality metrics on the dataset."
> >> daq:QualityGraph is a subclass of qb:Dataset, itself defined by "Represents a
> >> collection of observations, possibly organized into various slices, conforming
> >> to some common dimensional structure"
> >>
> >> I'd understand from these definitions that it is *not* welcome that an
> >> instance of daq:QualityGraph (or a subclass of it) would contain data that is
> >> not about metrics. Hence I was lukewarm on declaring dqv:QualityMetadata
> >> a subclass of daq:QualityGraph!
> >>
> >> But if everyone thinks that a subclass of a class of graphs that contain
> >> statements of a certain type may contain statements of other types than the
> >> ones its super-class contains (in addition to these), then I'm alright with your
> >> suggestion!
> >>
> >> (this assumes that everyone can parse the above sentence of course  :-)
> >>
> >> cheers,
> >>
> >> Antoine
> >>
> >> On 5/28/15 1:03 PM, Debattista, Jeremy wrote:
> >>> Hi Antoine,
> >>>
> >>> Thanks for your replies. Things are more clear for me now. I will reply to
> >>> some of your comments.
> >>>
> >>>> 1. I agree that the dcat:Dataset instance is not quality metadata per se.
> >>>> We've got a representation problem... The idea was that it is the statements
> >>>> (e.g. one dcterms:conformsTo) that are in the quality graph.
> >>>> Actually I don't think an instance can be said to be in the graph. It's only
> >>>> statements that are contained in the graph.
> >>>> Now, I have to say that I'm not sure how best we could represent this,
> >>>> graphically.
> >>>> Has anyone got an idea?
> >>>
> >>> I gave it a go. See att1.jpg. Basically I've pushed out the orange box, but still
> >>> left the conformsTo, hasQualityMeasure, hasFeedback statements inside the
> >>> graph. Would this be more clear? I've added the dqv:hasQualityMetadata as
> >>> well (this does not have to be in the quality metadata itself).
> >>>
> >>>> 2. "then the dcat:Dataset points to the quality metadata graph" calls for
> >>>> introducing another property dqv:hasQualityMetadata or something like this.
> >>>> This is an interesting idea. If there's enough positive feedback, we could
> >>>> add it. I'm adding a note right now on it.
> >>>>
> >>>> But I wouldn't be in favour of using it as a replacement for the direct links
> >>>> between the dcat:Dataset and dcterms:Standard, dqv:QualityMeasure, etc.
> >>>> The idea is indeed to have a pattern that allows containment of all quality
> >>>> statements (to allow for provenance tracking) while not putting this
> >>>> containment as a hurdle for these who are less interested in it.
> >>>> Say, if a Dataset comes with a SLA, I prefer to have a direct statement
> >>>> between the instance of dcat:Dataset and the instance of dqv:SLA.
> >>>> Otherwise one would have to retrieve and combine two statements:
> >>>> - a link between a Dataset and a QualityGraph
> >>>> - a statement that relates the QualityGraph with the SLA.
> >>>> Not only this is a longer path, but one of the nodes is a graph, and
> >>>> this could raise issues for these who are less comfortable with
> >>>> graphs (including all of these who don't want to handle RDF syntaxes
> >>>> for graphs!)
> >>>
> >>> I agree with your concerns - I was not viewing it from the provenance
> >>> perspective. But on the other hand, in my opinion the extra property
> >>> (hasQualityMetadata) wouldn't hurt either - even though it might be
> >>> redundant at the end of the day.
> >>>
> >>>> C. There is a raised issue that says:
> >>>> [
> >>>> The label of daq:QualityGraph does not fit well with the current model.
> >>>> DAQ graphs are meant to contain measures. In our context a "quality graph"
> >>>> has a wider scope: actually the role of representing overall quality graphs is
> >>>> currently played by dqv:QualityMetadata.
> >>>> ]
> >>>> I think the same initial argument applies to the suggestion of making
> >>>> dqv:QualityMetadata a subclass of daq:QualityGraph. DAQ's quality graphs
> >>>> contain metadata about quality metrics on the dataset. I believe that there is
> >>>> quality metadata that is not metrics. At least that's how we have started to
> >>>> approach the problem. I'd be very eager to hear whether you think this is not
> >>>> right!
> >>>
> >>> To be honest, I prefer the dqv:QualityMetadata term and the idea behind it
> >>> much more. Its intended use is more suitable in this case than the
> >>> daq:QualityGraph. The daq:QualityGraph is just a "special" RDF graph which is
> >>> also a cube dataset and as you rightly pointed out, it contains metadata about
> >>> quality metrics. My understanding of subclasses is "inheriting from the
> >>> parent class and more". That is why I suggested that dqv:QualityMetadata to
> >>> be a subclass of the daq:QualityGraph instead of rdfg:Graph, because
> >>> QualityMetadata will contain what a daq:QualityGraph should have + more
> >>> information such as having dcterms:Standard, dqv:Feedback etc.. Am I right
> >>> about this?
> >>>
> >>> Cheers,
> >>> Jer
> >>>
> >>>
> >>>
> >>> On 25 May 2015, at 23:26, Antoine Isaac <aisaac@few.vu.nl> wrote:
> >>>
> >>>> Hi Jeremy,
> >>>>
> >>>>
> >>>>
> >>>> On 5/22/15 10:45 AM, Debattista, Jeremy wrote:
> >>>>>
> >>>>> This looks great already.
> >>>>
> >>>>
> >>>> Thanks!
> >>>>
> >>>>
> >>>>> I would like to point out two issues which are not clear to me as yet:
> >>>>>
> >>>>> 1) In the diagram, shouldn't the dcat:Dataset be "outside" of the quality
> >>>>> metadata (and especially outside of the QualityGraph containment), and
> >>>>> then the dcat:Dataset points to the quality metadata graph?
> >>>>>
> >>>>> I don't know if this was done on purpose there or should have been
> >>>>> placed outside. If a dcat:Dataset (or distribution) is inside the quality
> >>>>> metadata boundaries, then my understanding as a consumer (I might be a
> >>>>> machine) would be that a dcat:Dataset instance is some kind of quality
> >>>>> information.
> >>>>
> >>>>
> >>>> This is a tricky issue, which calls on two answers:
> >>>>
> >>>> 1. I agree that the dcat:Dataset instance is not quality metadata per se.
> >>>> We've got a representation problem... The idea was that it is the statements
> >>>> (e.g. one dcterms:conformsTo) that are in the quality graph.
> >>>> Actually I don't think an instance can be said to be in the graph. It's only
> >>>> statements that are contained in the graph.
> >>>> Now, I have to say that I'm not sure how best we could represent this,
> >>>> graphically.
> >>>> Has anyone got an idea?
> >>>>
> >>>>
> >>>>
> >>>> 2. "then the dcat:Dataset points to the quality metadata graph" calls for
> >>>> introducing another property dqv:hasQualityMetadata or something like this.
> >>>> This is an interesting idea. If there's enough positive feedback, we could
> >>>> add it. I'm adding a note right now on it.
> >>>>
> >>>> But I wouldn't be in favour of using it as a replacement for the direct links
> >>>> between the dcat:Dataset and dcterms:Standard, dqv:QualityMeasure, etc.
> >>>> The idea is indeed to have a pattern that allows containment of all quality
> >>>> statements (to allow for provenance tracking) while not putting this
> >>>> containment as a hurdle for these who are less interested in it.
> >>>> Say, if a Dataset comes with a SLA, I prefer to have a direct statement
> >>>> between the instance of dcat:Dataset and the instance of dqv:SLA.
> >>>> Otherwise one would have to retrieve and combine two statements:
> >>>> - a link between a Dataset and a QualityGraph
> >>>> - a statement that relates the QualityGraph with the SLA.
> >>>> Not only this is a longer path, but one of the nodes is a graph, and
> >>>> this could raise issues for these who are less comfortable with
> >>>> graphs (including all of these who don't want to handle RDF syntaxes
> >>>> for graphs!)
> >>>>
> >>>>
> >>>>> 2) How about doing dqv:QualityMetadata as a subclass of
> >>>>> daq:QualityGraph?
> >>>>>
> >>>>> There are a number of advantages of doing so. First of all we don't have
> >>>>> to rely on multiple graphs. Although nothing is wrong with that, this might
> >>>>> make querying a bit harder. The daq:QualityGraph is a specialisation of the
> >>>>> rdf:Graph which is also a qb:Dataset. In this case the qb:dataset property can
> >>>>> have dqv:QualityMeasure as domain and dqv:QualityMetadata as its range.
> >>>>> This way we can move dcat:Dataset from the graph containment, and
> >>>>> removing the property "dqv:hasQualityMeasure" (this becomes redundant
> >>>>> as it can be inferred, if there is some link between dcat:Dataset and
> >>>>> dqv:QualityMetadata).
> >>>>>
> >>>>
> >>>>
> >>>> This is quite related to the previous issues. Interesting discussion!
> >>>> I've tried to make a graph representing your proposal, in the attached file.
> >>>> It's quite hard to untangle though. I'll have a try, please tell me
> >>>> if I'm making any sense :-)
> >>>>
> >>>> A. As said above the instance of dcat:Dataset is not meant to be contained
> >>>> in the quality metadata graph. Maybe this alleviates some of your concerns...
> >>>>
> >>>> B. I agree that if there was a link between the instance of dcat:Dataset
> >>>> and the (merged) instance dqv:QualityMetadata/daq:QualityGraph, then
> >>>> with a qb:dataset statement between instances of dqv:QualityMeasure and
> >>>> the instance of dqv:QualityMetadata/daq:QualityGraph one could indeed
> >>>> find a connection between the dcat:Dataset and the instances of
> >>>> dqv:QualityMeasure.
> >>>> But as said above I don't like the idea of removing the direct link. If a
> >>>> dataset has some measure, say, a number of incorrect triples, why remove
> >>>> the direct link?
> >>>> This would put the provenance info in the way of the applications that are
> >>>> less concerned about provenance.
> >>>> As also said, I'm not against having the link between dcat:Dataset and the
> >>>> (merged) instance dqv:QualityMetadata/daq:QualityGraph. But I wouldn't
> >>>> want to use it as a motivation for removing dqv:hasQualityMeasure.
> >>>>
> >>>> C. There is a raised issue that says:
> >>>> [
> >>>> The label of daq:QualityGraph does not fit well with the current model.
> >>>> DAQ graphs are meant to contain measures. In our context a "quality graph"
> >>>> has a wider scope: actually the role of representing overall quality graphs is
> >>>> currently played by dqv:QualityMetadata.
> >>>> ]
> >>>> I think the same initial argument applies to the suggestion of making
> >>>> dqv:QualityMetadata a subclass of daq:QualityGraph. DAQ's quality graphs
> >>>> contain metadata about quality metrics on the dataset. I believe that there is
> >>>> quality metadata that is not metrics. At least that's how we have started to
> >>>> approach the problem. I'd be very eager to hear whether you think this is not
> >>>> right!
> >>>>
> >>>>
> >>>>>
> >>>>> Once we have a first draft of the RDF schema, I will be happy to support
> >>>>> it in our Quality Assessment Framework.
> >>>>
> >>>>
> >>>> This would be great!
> >>>>
> >>>> Thanks again for the comments - I hope I will not have discouraged
> >>>> you by the length of the answers :)
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Antoine
> >>>>
> >>>>>
> >>>>> On 20 May 2015, at 23:39, Antoine Isaac <aisaac@few.vu.nl> wrote:
> >>>>>
> >>>>>> Dear all,
> >>>>>>
> >>>>>> We've created a new editor's draft of the Data Quality Vocabulary on
> >>>>>> Github [1].
> >>>>>>
> >>>>>> Most of it is in the diagram in section 3. We have placeholders for
> >>>>>> material in other sections, but this is still work in progress.
> >>>>>>
> >>>>>> As you can see the diagram and the doc still have a lot of open
> >>>>>> issues and questions. But we believe it's a positive evolution from the
> >>>>>> previous version [2]. The patterns that we would like to use are stabilizing.
> >>>>>> Actually I'm curious to see how much of Jeremy's last comments [3] would
> >>>>>> still apply!
> >>>>>>
> >>>>>> Needless to say, everyone else's feedback is highly welcome!
> >>>>>>
> >>>>>> Please excuse the discussion notes in the diagram itself. We thought of
> >>>>>> creating a wiki page as we had done previously [2]. But I lacked the time to
> >>>>>> do it. Maybe in the coming days, depending on how the discussion evolves...
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Antoine, on behalf of co-editors Riccardo and Christophe
> >>>>>>
> >>>>>> [1] http://w3c.github.io/dwbp/vocab-dqg.html
> >>>>>> [2] https://www.w3.org/2013/dwbp/wiki/Data_Quality_Vocabulary_%28DQV%29
> >>>>>> [3] http://lists.w3.org/Archives/Public/public-dwbp-wg/2015May/0037.html
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> <JeremysProposal-150525.png>
> >>>
> >
> >
> >
>
>


-- 
Onderzoeker
DANS, Anna van Saksenlaan 51, 2593 HW Den Haag
+31(0)6 14576494
christophe.gueret@dans.knaw.nl


*Data Archiving and Networked Services (DANS/KNAW)*
http://dans.knaw.nl

*e-Humanities Group (KNAW)*
http://www.ehumanities.nl/

*World Wide Semantic Web community*
http://worldwidesemanticweb.org/
Received on Wednesday, 3 June 2015 00:01:26 UTC
