Re: Alignment between DWBP vocabularies and HCLS profile

Hi,

I've organized the HCLS recommendations within the context of the DQV
section "Dimensions and metrics hints":


7.1 statistics
http://www.w3.org/TR/hcls-dataset/#s6_6

core:
# of triples
# of unique, typed entities
# of unique subjects
# of unique properties
# of unique objects
# of unique classes (types)
# of unique literals
# of unique graphs

advanced:
classes + # of instances
properties + # of triples
properties + # of unique, typed subjects + # triples
properties + # of unique, typed objects + # triples
properties + # of unique, typed subjects + # of unique, typed objects
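
To make the counts above concrete, here is a minimal Python sketch over an
in-memory list of quads. It is only an illustration, not the queries from the
HCLS note: the tuple layout, the quoted-string encoding of literals, the
function names, and the ex: identifiers are all assumptions made for this
example. It also counts every distinct object term as an object; whether
literals should be excluded there is a choice I leave open.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

def core_statistics(quads):
    """Core counts for (subject, predicate, object, graph) tuples.

    Assumption of this sketch: literals are strings that start with a
    double quote, e.g. '"aspirin"'; everything else is an IRI.
    """
    triples = {(s, p, o) for s, p, o, _ in quads}
    return {
        "triples": len(triples),
        "typed_entities": len({s for s, p, _ in triples if p == RDF_TYPE}),
        "subjects": len({s for s, _, _ in triples}),
        "properties": len({p for _, p, _ in triples}),
        "objects": len({o for _, _, o in triples}),
        "classes": len({o for _, p, o in triples if p == RDF_TYPE}),
        "literals": len({o for _, _, o in triples if o.startswith('"')}),
        "graphs": len({g for _, _, _, g in quads}),
    }

def instances_per_class(quads):
    """Advanced statistic: each class with its number of distinct instances."""
    pairs = {(s, o) for s, p, o, _ in quads if p == RDF_TYPE}
    return Counter(cls for _, cls in pairs)

# Hypothetical example data.
quads = [
    ("ex:drug1", "rdf:type", "ex:Drug", "ex:g1"),
    ("ex:drug1", "rdfs:label", '"aspirin"', "ex:g1"),
    ("ex:drug2", "rdf:type", "ex:Drug", "ex:g1"),
    ("ex:drug2", "ex:interactsWith", "ex:drug1", "ex:g2"),
]

stats = core_statistics(quads)
per_class = instances_per_class(quads)
```

The remaining advanced statistics follow the same pattern, grouping by
property instead of by class.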

7.2 availability
* foaf:page
* dct:distribution
* dct:license
* dct:language
* dcat:downloadURL
* void:sparqlEndpoint
* dcat:landingPage
* idot:accessPattern
* idot:exampleIdentifier
* void:exampleResource

7.3 processability
* dct:format

7.4 accuracy

7.5 consistency

7.6 relevance
* dcat:theme
* dcat:keyword

7.7 completeness


7.8 conformance
* dct:conformsTo
* void:vocabulary

7.9 credibility
* dct:publisher
* dct:contributor
* dct:creator
* dct:references
* cito:citesAsAuthority

7.10 timeliness
* dct:accrualPeriodicity
* dct:created
* dct:issued


In reading the DQV document, I have the following comments:

1. The relationship between metric, dimension, and category is unclear.
The example in 6.1 does not help, since the values are assigned between 0
and 1 and the categories are unspecific. Where did these values come from,
and what are appropriate categories?

Can the value of a measure be a categorical description?

2. I find the example on quality assessment of a language subset dubious.
I'd recommend keeping the structure as simple as possible, and just
pointing to the object that is being analyzed:
i) type
ii) target dataset
iii) metric
iv) value

Therefore the linkset, with its provenance and features, is the target
dataset.
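
As a rough sketch of that four-field structure (not DQV's actual model), a
measurement could be as flat as the record below. The class name, field
names, and ex: identifiers are hypothetical, and allowing a string value is
just one possible answer to my question above about categorical values.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class QualityMeasurement:
    type: str                  # i) kind of record (hypothetical label)
    target_dataset: str        # ii) IRI of the object analyzed, e.g. a linkset
    metric: str                # iii) IRI or name of the metric
    value: Union[float, str]   # iv) numeric score or categorical description

    def is_categorical(self) -> bool:
        """True when the value is a category rather than a number."""
        return isinstance(self.value, str)

# Two measurements that point at the same (hypothetical) linkset.
numeric = QualityMeasurement(
    "QualityMeasurement", "ex:linkset-en", "ex:completeness", 0.8
)
categorical = QualityMeasurement(
    "QualityMeasurement", "ex:linkset-en", "ex:licenseClarity", "explicit"
)
```

Both records point directly at the linkset; nothing else is needed to say
what was measured.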



The DQV work is also relevant to another initiative that I am involved in -
we are promoting the idea of FAIR (Findable, Accessible, Interoperable, and
Reusable) Data [1]. Have a look and let me know what you think.
[1]  https://www.force11.org/group/fairgroup/fairprinciples


m.


Michel Dumontier, PhD
Associate Professor of Medicine (Biomedical Informatics)
Stanford University
http://dumontierlab.com

On Fri, Dec 4, 2015 at 11:13 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Dear Michel,
>
> Excellent!
> As of now you are very welcome to look at the current state of DQV and
> flag to us if there are specific areas we should focus on.
> If you don't have time then we will have a general look at the HCLS
> profile, but it will probably be in the new year as we are lacking the time
> before our next publication. Right now the only thing that the editors can
> do is to flag the issue to the community!
>
> Best,
>
> Antoine
>
> On 12/4/15 6:18 PM, Michel Dumontier wrote:
>
>> Hi Antoine,
>>    Great! I'm happy to discuss.
>> m.
>>
>> Michel Dumontier, PhD
>> Associate Professor of Medicine (Biomedical Informatics)
>> Stanford University
>> http://dumontierlab.com
>>
>> On Fri, Dec 4, 2015 at 7:42 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>
>>     Dear Michel,
>>
>>     In today's Data on the Web Best Practices WG call, we've raised the
>> issue to assess whether we need to align our vocabularies (Data Quality and
>> Data Usage) with the HCLS dataset description profile:
>>     https://www.w3.org/2013/dwbp/track/issues/221
>>
>>     And I have an action to discuss this with you :-)
>>     Is it something you could contribute to, as one of the authors of the
>> HCLS profile?
>>
>>     Best,
>>
>>     Antoine
>>
>>     [1] https://www.w3.org/2013/dwbp/track/actions/223
>>
>>
>>

Received on Saturday, 5 December 2015 00:21:40 UTC