AW: AW: [QB] Last Call document draft from Benedikt Kaempgen on 2013-03-08 (public-gld-wg@w3.org from March 2013)

From: Benedikt Kaempgen <kaempgen@fzi.de>
Date: Fri, 8 Mar 2013 11:14:13 +0000
To: Dave Reynolds <dave.e.reynolds@gmail.com>
CC: Government Linked Data Working Group <public-gld-wg@w3.org>
Message-ID: <0D7BFFD7C415144DA75C3D49C46AC21512AB3430@ex-ms-1a.fzi.de>

Hi,

Thanks for considering my feedback.

> time intervals from the UK reference time service in the running
> example. This is not a change to the spec, it has been like that all
> the time and we haven't changed that phrasing.
It just was not clear to me that an rdfs:range could replace a code list. I thought the range would just be an additional source of information for applications, whereas a coded property would by definition have a code list. I am not sure whether that could be made clearer in the spec.

> Do you mean we should point out explicitly that all measures are
> required, whichever approach you use? Or are you saying that measures
> should be optional in one approach or the other?
I think we should point out that all measures are required, whichever approach you use.

> I've rephrased that line instead. I want to avoid getting into the
> status of the SDMX-in-RDF extension!
OK.

> Ah, good point. Of course that section was originally written before
> DCAT. I'm not sufficiently familiar with DCAT to suggest what to put
> here. DCAT seems like it would be more used to describe a catalogue
> entry that would reference a Data Cube rather than a Data Cube
> itself.
> 
> Richard - if you feel there is a useful reference to DCAT to be made
> here I'm happy for you to make that change and will trust your
> judgement on it.
Richard told me recently that, in his opinion, a qb:DataSet is a dcat:Dataset. If so, we should say so formally in the RDF, informally in prose, or both.

> there is no declaration then it is only attributes that trigger that
> default behaviour.
Clear, now.

> issue with subslices, but also possibly for SDMX 2.1. It seems
> reasonable to leave it like that at this stage since the integrity
> constraints make the relevant usage reasonably clear.
OK.

Best,

Benedikt
________________________________________
From: Dave Reynolds [dave.e.reynolds@gmail.com]
Sent: Thursday, 7 March 2013 19:51
To: Benedikt Kaempgen
Cc: Government Linked Data Working Group
Subject: Re: AW: [QB] Last Call document draft

Thanks Benedikt,

I have addressed your editorial issues. Here are my responses on the others:

 > * 7.1: "To cater for this a component can also be optionally
 > annotated with a qb:codeList to indicate a set of skos:Concepts which
 > may be used as codes." (do we allow to have non-coded DimensionProperties?)

A dimension may specify its values using rdfs:range in order to be able
to use instances of some class as the codes. For example, see the use of
time intervals from the UK reference time service in the running
example. This is not a change to the spec, it has been like that all the
time and we haven't changed that phrasing.
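
A minimal sketch of the point above, assuming triples are modelled as plain Python tuples rather than loaded with an RDF library. The eg: property and code list names are illustrative only, and the helper `value_declarations` is hypothetical. It shows one dimension that declares its values only through rdfs:range and another that uses qb:codeList:

```python
# Triples as (subject, predicate, object) tuples; the eg: names are illustrative.
TRIPLES = {
    ("eg:refPeriod", "rdf:type", "qb:DimensionProperty"),
    ("eg:refPeriod", "rdfs:range", "interval:Interval"),
    ("eg:sex", "rdf:type", "qb:DimensionProperty"),
    ("eg:sex", "qb:codeList", "eg:sexCodes"),
}


def value_declarations(triples, prop):
    """Return how `prop` declares its permitted values, as {predicate: object}."""
    return {
        p: o
        for (s, p, o) in triples
        if s == prop and p in ("qb:codeList", "rdfs:range")
    }


for dim in ("eg:refPeriod", "eg:sex"):
    print(dim, value_declarations(TRIPLES, dim))
```

Here eg:refPeriod has no qb:codeList at all; its values are whatever instances of the range class exist, which is the non-coded case being asked about.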

 > * 7.5.1: "Note that one limitation of the multi-measure approach is
 > that it is not possible to attach an attribute to a single observed
 > value." (Is there not also the following limitation: Since measures
 > defined in data structure definitions are always "required", all
 > multiple measures need to be set in every observation. If IC-17 does say
 > otherwise, we should explain it here.)

Sorry, not sure I follow what you are saying here.

For both approaches to multiple measures, all measures are required.
IC-17 says that for Measure dimension cubes, just as IC-14 says it for
multi-measure observations. So that's not an advantage of one approach
or the other.

Whereas the section you quote is one reason to use the Measure dimension
approach instead of multi-measures.

Do you mean we should point out explicitly that all measures are
required, whichever approach you use? Or are you saying that measures
should be optional in one approach or the other?
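
To make the "all measures are required in both approaches" point concrete, here is a rough sketch. It is not the SPARQL from the spec; it assumes observations are plain dicts, and the measure names, area codes and numeric values are invented for illustration. The first function mirrors the IC-14 style check for multi-measure observations, the second the IC-17 style check for Measure dimension cubes:

```python
MEASURES = ["eg:lifeExpectancy", "eg:population"]  # hypothetical measures


def missing_multi_measure(observations, measures):
    """IC-14 style: each observation must carry a value for every measure."""
    return [
        (index, measure)
        for index, obs in enumerate(observations)
        for measure in measures
        if measure not in obs
    ]


def missing_measure_dimension(observations, measures, dims):
    """IC-17 style: each combination of non-measure dimension values needs
    one observation per declared measure (selected via qb:measureType)."""
    seen = {}
    for obs in observations:
        key = tuple(obs[d] for d in dims)
        seen.setdefault(key, set()).add(obs["qb:measureType"])
    return sorted(
        (key, measure)
        for key, found in seen.items()
        for measure in measures
        if measure not in found
    )


# Invented sample data: the second area lacks a population value in both styles.
multi = [
    {"eg:area": "eg:A", "eg:lifeExpectancy": 10, "eg:population": 20},
    {"eg:area": "eg:B", "eg:lifeExpectancy": 11},
]
by_type = [
    {"eg:area": "eg:A", "qb:measureType": "eg:lifeExpectancy", "eg:lifeExpectancy": 10},
    {"eg:area": "eg:A", "qb:measureType": "eg:population", "eg:population": 20},
    {"eg:area": "eg:B", "qb:measureType": "eg:lifeExpectancy", "eg:lifeExpectancy": 11},
]

print(missing_multi_measure(multi, MEASURES))
print(missing_measure_dimension(by_type, MEASURES, ["eg:area"]))
```

Both calls report the same gap (population for eg:B), which is why neither approach has an advantage on this point.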

 > * 7.5.2: "SMDX-in-RDF extension vocabulary" (reference is missing)

I've rephrased that line instead. I want to avoid getting into the
status of the SDMX-in-RDF extension!

 > * 11: No mentioning of DCAT?

Ah, good point. Of course that section was originally written before
DCAT. I'm not sufficiently familiar with DCAT to suggest what to put
here. DCAT seems like it would be more used to describe a catalogue
entry that would reference a Data Cube rather than a Data Cube itself.

Richard - if you feel there is a useful reference to DCAT to be made
here I'm happy for you to make that change and will trust your judgement
on it.

 > * IC-6: "The only components of a qb:DataStructureDefinition that may
 > be marked as optional, using qb:componentRequired are attributes. "
 > (Since in 7.4 we say "In the absence of such a declaration an attribute
 > is assumed to be optional.", will this test work?)

Yes. If there is explicit markup the query will find it and check that
it is being applied to an attribute. If there is no explicit markup then
there is no error to find and the query will not find one. After all if
there is no declaration then it is only attributes that trigger that
default behaviour.
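
The logic described above can be sketched as follows. This is only an approximation of the IC-6 query, assuming component specifications are plain dicts; the eg: property names and the helper `ic6_violations` are hypothetical. A missing "required" key stands for the absence of any qb:componentRequired markup:

```python
def ic6_violations(component_specs):
    """Return properties explicitly marked optional that are not attributes.

    Specs with no explicit qb:componentRequired markup are never flagged,
    because there is nothing for the check to find.
    """
    return [
        spec["property"]
        for spec in component_specs
        if spec.get("required") is False
        and spec["kind"] != "qb:AttributeProperty"
    ]


specs = [
    # No markup on a dimension: nothing to check.
    {"property": "eg:refArea", "kind": "qb:DimensionProperty"},
    # Explicitly optional attribute: allowed.
    {"property": "eg:unit", "kind": "qb:AttributeProperty", "required": False},
    # No markup on an attribute: optional by default, nothing to check.
    {"property": "eg:status", "kind": "qb:AttributeProperty"},
    # Explicitly optional dimension: this is the error case.
    {"property": "eg:refPeriod", "kind": "qb:DimensionProperty", "required": False},
]

print(ic6_violations(specs))
```

Only the explicitly optional dimension is reported; the unmarked components pass regardless of their kind.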

 > * 14.7: "might an qb:DataSet, qb:Slice or qb:Observation, or a
 > qb:MeasureProperty" (1) verb "be" is missing 2) Why is this union not
 > formalised in the range?)

Originally we wanted to leave the range rather open to allow for other
attachment levels in the future, especially when/if it came to resolving
the issue with subslices, but also possibly for SDMX 2.1. It seems
reasonable to leave it like that at this stage since the integrity
constraints make the relevant usage reasonably clear.

Changes checked in and updated pubrules-compliant static.html generated.

Cheers,
Dave


On 07/03/13 17:19, Benedikt Kaempgen wrote:
> Hi,
>
> Impressive work. I have one larger comment and a list of smaller comments to the new QB spec version.
>
> Larger feedback:
>
> ==10.2 Hierarchical code lists==
> * For OLAP on QB [1], I use hierarchy levels of increasing detail as first class citizens in the data, e.g., to give them a name such as "Product Brand". Therefore, I am using XKOS for representing hierarchy levels (xkos:ClassificationLevel, subclass of skos:Collection) [2].
> * Here, I would still use skos:narrower to define relationships between skos:Concepts.
> * However, I would connect those concepts via skos:member to separate xkos:ClassificationLevels, each having an xkos:depth, and would then connect those xkos:ClassificationLevels via skos:inScheme to the code list.
> * Also, I would not declare skos:topConcept but rather define the top concepts via an uppermost level (depth 0 or 1).
> * I want to make sure that I still comply with the vocabulary and do "not use terms from other vocabularies instead of ones defined in this vocabulary that could reasonably be used".
> * Thus, my question: Can publishers still reuse XKOS together with QB? I think we should allow and clarify this. For instance, this could mean allowing skos:Collections to be connected to a code list instead of using skos:hasTopConcept.
>
> [1] <http://code.google.com/p/olap4ld/>
> [2] <https://github.com/linked-statistics/xkos>
>
> Minor feedback:
>
> * 2.1: "using the the W3C" (double the)
> * 5.2: Maybe call this section "Introducing slices" or similar to distinguish it from Section 9
> * 7.1: "In particular, is sometimes" (should be "it is sometimes")
> * 7.1: "To cater for this a component can also be optionally annotated with a qb:codeList to indicate a set of skos:Concepts which may be used as codes." (do we allow to have non-coded DimensionProperties?)
> * 7.2: "has developed, and maintains, RDF encodings of these guidelines" (rephrase: "has developed that maintains RDF encodings of these guidelines")
> * 7.3: Explicitly mention prefix "interval" and "admingeo" in prose before having it used in the turtle (interval:Interval and admingeo:UnitaryAuthority). Have a reference to the data.gov.uk reference time service.
> * 7.4: "every observation then the specification should set ." (incomplete sentence, should finish with "qb:componentRequired", probably)
> * 7.4: "The qb:componentRequired> declaration" (the ">" is not needed)
> * 7.4: "It can also be useful in the publication chain to enable synthesis of appropriate URIs for observations." (unclear, you mean qb:order as a kind of code for a dimension?)
> * 7.5.1: "Note that one limitation of the multi-measure approach is that it is not possible to attach an attribute to a single observed value." (Is there not also the following limitation: Since measures defined in data structure definitions are always "required", all multiple measures need to be set in every observation. If IC-17 does say otherwise, we should explain it here.)
> * 7.5.2: "SMDX-in-RDF extension vocabulary" (reference is missing)
> * 7.5.2: "which subsumes each of the individual measures, those individual measures are used directly." (difficult to understand, please rephrase)
> * 9: Section structure could be better balanced, here. How about having "8.2 Slices and general groups of observations" instead of "9. Slices"?
> * 10.1: "Explicitly declaring the code list using qb:codeList is not mandatory" (What exactly does that mean? Using qb:codeList is not mandatory in general or in case an rdfs:range is set? Put differently, can we give a ranking of how to best model coded values: 1) Use qb:codeList and rdfs:range 2) Use qb:codeList 3) use rdfs:range?)
> * 10.2: "code lists lists should" (double lists)
> * 10.3: "or then they form a complete cover of the parent concept" (should be "when they")
> * 10.3: "The Data Cube vocabulary supports this situation through the qb:HierarchicalCodeList class. An instance of qb:HierarchicalCodeList defines a set of root concepts in the hierarchy (qb:hierarchyRoot) and a parent-to-child relationship (qb:parentChildProperty)." (Do I understand correctly that qb:HierarchicalCodeList is the non-SKOS counterpart to skos:ConceptScheme, qb:hierarchyRoot to skos:hasTopConcept, and the property defined by qb:parentChildProperty to skos:narrower? If so, I think we should clarify that. Also, I think we should back up "parent-to-child relationship" with a reference to what it means.)
> * 10.3 - example 16: "qb:hierarchy Root" (should be qb:hierarchyRoot)
> * 11: No mentioning of DCAT?
> * 11.1: "where eg:Wales is a skos:Concept drawn" (eg:Wales not found in example 18)
> * 12: "For illustration see example 4 in which" (link to example 4 does not work)
> * IC-6: "The only components of a qb:DataStructureDefinition that may be marked as optional, using qb:componentRequired are attributes. " (Since in 7.4 we say "In the absence of such a declaration an attribute is assumed to be optional.", will this test work?)
> * 14.7: "might an qb:DataSet, qb:Slice or qb:Observation, or a qb:MeasureProperty" (1) verb "be" is missing 2) Why is this union not formalised in the range?)
> * 14.7: "component is a attribute." ("an attribute")
> * 14.6: "Defines/indicates" (editorial comment: We sometimes start with an upper case letter, sometimes with a lower case letter. Should be consistent)
>
> Best,
>
> Benedikt
>
> ________________________________________
> From: Dave Reynolds [dave.e.reynolds@gmail.com]
> Sent: Tuesday, 5 March 2013 15:28
> To: Government Linked Data Working Group
> Subject: [QB] Last Call document draft
>
> I've released what I hope is a reasonable initial candidate for the Last
> Call WD for the Data Cube vocabulary. This is preparation for the vote
> on Thursday.
>
> https://dvcs.w3.org/hg/gld/raw-file/default/data-cube/index.html
>
> I'll send a separate note around on where we are with the various
> issues. However, I think this draft resolves each of the issues in the
> way discussed in the separate email threads.
>
> Richard please check this. Feel free to fix minor problems or ask me to.
> Major problems should probably be raised on the list.
>
> Benedikt - thanks for volunteering to do a review check. Please let us
> know of any problems that you spot.
>
> The most substantial change from the last WD is the section on criteria
> for well-formed Data Cubes (ISSUE-29 [1]). The criteria discussions have
> been on the list. The SPARQL queries which are provided (to back up the
> narrative descriptions of the criteria) have all been checked on at
> least some positive and negative examples. The code for this is in the
> same repository where the vocabulary source sits [2].
>
> I have one more task before putting down the edit token, which is to
> find a way to have a "Contributors" section to list Jeni, rather than
> leave her in the Acknowledgements.
>
> Dave
>
> [1] http://www.w3.org/2011/gld/track/issues/29
>
> [2] https://code.google.com/p/publishing-statistical-data/
>
Received on Friday, 8 March 2013 11:14:39 UTC