Re: Some thoughts on the Q&G vocab from Bernadette Farias Lóscio on 2014-05-06 (public-dwbp-vocabs@w3.org from May 2014)

From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Date: Tue, 6 May 2014 10:15:36 -0300
To: "Debattista, Jeremy" <Jeremy.Debattista@iais-extern.fraunhofer.de>
Cc: Phil Archer <phila@w3.org>, Steven Adler <adler1@us.ibm.com>, Eric Kauz <eric.kauz@gs1.org>, DWBP Vocabs <public-dwbp-vocabs@w3.org>
Message-ID: <CANx1PzwwN0cWbFM0Tc6LuK7dxEWdnW_Ht=e59pTa2VEKwgF6ZQ@mail.gmail.com>
Hi Jeremy,

Great work! I agree with you that we shouldn't duplicate efforts and that
we should consider a domain-independent approach. I guess that categories
and dimensions are domain-independent; however, metrics may be
domain-dependent. For example, some of the metrics described in [1] are
specific to linked data. In this case, we may have more than one metric for
the same dimension, each defined according to the domain (or data model, for
example). Does that make sense to you?

I saw that there are more metrics than dimensions. Could you please explain
the relation between metrics and dimensions? I'd like to understand whether
the different metric values are combined to produce a single value. For
example, are IntensionalConcisenessMetric and ExtensionalConcisenessMetric
combined to produce a value for Conciseness?
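
To make the question concrete, here is a minimal sketch of one way two
metric values could be rolled up into a single dimension value. The class
names, the [0, 1] value range, the sample values and the unweighted mean are
all assumptions for illustration only; they are not taken from daQ or from
the vocabulary in [1].

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Metric:
    name: str
    value: float  # assumed to be normalised to the range [0, 1]


@dataclass
class Dimension:
    name: str
    metrics: list[Metric] = field(default_factory=list)

    def combined_value(self) -> float:
        # Hypothetical aggregation: an unweighted mean of the metric values.
        # Weights, a minimum, or no aggregation at all are equally possible.
        return mean(m.value for m in self.metrics)


# Sample values are made up for the example.
conciseness = Dimension(
    "Conciseness",
    [
        Metric("IntensionalConcisenessMetric", 0.5),
        Metric("ExtensionalConcisenessMetric", 1.0),
    ],
)
print(conciseness.combined_value())
```

If the model keeps metric values separate instead, the dimension would act
only as a grouping and combined_value() would not be needed.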

I was also wondering why you didn't consider metrics like completeness and
correctness in your proposal. In [2], we discussed the use of those
dimensions to evaluate the relevance of web data sources.

I think it is also possible to include the quality metadata (using daQ) in
CKAN together with the information about data usage.

kind regards,
Bernadette

[1] https://raw.githubusercontent.com/diachron/quality/master/src/main/resources/vocabularies/dqm/dqm.trig
[2] https://drive.google.com/file/d/0BxTZf3B9yQ3oZDdWVDEwRUk5QzQ/edit?usp=sharing


2014-05-03 7:35 GMT-03:00 Debattista, Jeremy <
Jeremy.Debattista@iais-extern.fraunhofer.de>:

>  Hi Phil,
>
>  We already defined most of the concepts [1] (probably not in that
> detail) you are currently specifying, which will be used for the DIACHRON
> project (@Steve, we have 4 “customers (project partners)” there -
> Datapublica[2], Data Market[3], EBI[4] and Brox[5]). We will have a
> namespace as well for that (probably a purl namespace for now). Those are
> also based on the conceptual daQ model (linked by Ghislain in this thread)
> I presented to you in our f2f meeting.
>
>  I suggest that rather than duplicate the effort, we should first define
> a number of domain-independent quality dimensions (some customers would
> require different dimensions and metrics). Then we could identify the
> properties for each of these dimensions (we could use either Makx’s work
> or Amrapali’s [6] survey paper). If daQ is used, then it should also be
> easier to integrate quality metadata to CKAN.
>
>  @Phil, I’d also be happy to help you with the ontology modelling.
>
>  Cheers,
> Jer
>
>
>  [1]
> https://raw.githubusercontent.com/diachron/quality/master/src/main/resources/vocabularies/dqm/dqm.trig
> [2] http://www.data-publica.com
> [3] https://datamarket.com
> [4] http://www.ebi.ac.uk
> [5] http://brox.de
> [6] http://www.semantic-web-journal.net/system/files/swj556.pdf
>
>  On 02 May 2014, at 18:43, Steven Adler <adler1@us.ibm.com> wrote:
>
> Phil,
>
> Good work!
>
> When we have vocabulary specifications ready to be tested, I can find a
> "customer" to test it as an implementation use case that we can document to
> make "recommendations."
>
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"
>
>
> From: Phil Archer <phila@w3.org>
> To: DWBP Vocabs <public-dwbp-vocabs@w3.org>, Eric Kauz <eric.kauz@gs1.org>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
> Date: 05/02/2014 05:45 PM
> Subject: Some thoughts on the Q&G vocab
> ------------------------------
>
>
>
> Dear all,
>
> As mentioned on today's call, I've been looking at the data quality and
> granularity vocabulary. Taking the discussion at the f2f meeting [1],
> Makx's work under the EU ISA Programme [2] and the ODI Certificates [3]
> as my starting points, I worked through the issues and made notes in the
> wiki. Based on that I then created the diagram. All of which is
> available at [4].
>
> Eric - you kindly offered to help with the UML modelling, thank you.
> I've used Enterprise Architect for this - is that what you use by any
> chance?
>
> I think there are several high level talking points:
>
> 1. What are we trying to achieve - machine readability? Links to human
> readable documentation? Objectivity? Subjectivity?
>
> 2. How are we going to test this? Bernadette is building a CKAN
> extension for the data usage vocab - Bernadette - can it take on this
> vocab as well? (I hope so). The plan so far is for the two vocabs to be
> Notes, not Recommendations. That means we don't have to prove
> implementation. However... without implementation nothing is a standard
> and if we can take the vocabs through to Recommendation (i.e. prove
> multiple implementations) then they'll have a lot more weight.
>
> Any and all comments welcome. If focussing on a particular issue, please
> start a new thread.
>
> Cheers
>
> Phil.
>
>
> [1] http://www.w3.org/2013/meeting/dwbp/2014-04-01#Data_quality_task_force
>
> [2] http://www.slideshare.net/OpenDataSupport/open-data-quality-29248578
> (Slide 8)
> [3] https://certificates.theodi.org/overview
> [4] https://www.w3.org/2013/dwbp/wiki/Quality_and_Granularity_Description_Vocabulary
>
> --
>
>
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
>
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>


-- 
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------
Received on Tuesday, 6 May 2014 13:17:59 UTC