Re: Data Quality Vocab for SDW

Hi, Antoine.

I can contribute a use case concerning geospatial metadata.

One piece of information that is typically included is the spatial 
resolution of a dataset. This is expressed either as a distance - e.g., 
data have a 1 km resolution - or as an equivalent scale (i.e., a 
fraction) - e.g., 1:1,000,000.

I include below two XML code snippets to show how this is expressed in 
ISO 19115:


Spatial resolution as distance (1,000 m):

<gmd:spatialResolution>
   <gmd:MD_Resolution>
     <gmd:distance>
       <gco:Distance 
uom="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/uom/ML_gmxUom.xml#m">1000</gco:Distance>
     </gmd:distance>
   </gmd:MD_Resolution>
</gmd:spatialResolution>



Spatial resolution as equivalent scale (1:1,000,000):

<gmd:spatialResolution>
   <gmd:MD_Resolution>
     <gmd:equivalentScale>
       <gmd:MD_RepresentativeFraction>
         <gmd:denominator>
           <gco:Integer>1000000</gco:Integer>
         </gmd:denominator>
       </gmd:MD_RepresentativeFraction>
     </gmd:equivalentScale>
   </gmd:MD_Resolution>
</gmd:spatialResolution>
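
A minimal sketch of how the two forms above could be read 
programmatically, using Python's standard xml.etree.ElementTree module. 
The snippets above omit the namespace declarations, so the usual 
ISO 19139 namespace URIs for the gmd and gco prefixes are assumed here; 
the function name read_spatial_resolution is hypothetical:

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for the gmd/gco prefixes (ISO 19139 encoding).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

UOM_M = ("http://standards.iso.org/ittf/PubliclyAvailableStandards/"
         "ISO_19139_Schemas/resources/uom/ML_gmxUom.xml#m")

# The two snippets from this message, with namespace declarations added.
DISTANCE_XML = f"""
<gmd:spatialResolution xmlns:gmd="{NS['gmd']}" xmlns:gco="{NS['gco']}">
  <gmd:MD_Resolution>
    <gmd:distance>
      <gco:Distance uom="{UOM_M}">1000</gco:Distance>
    </gmd:distance>
  </gmd:MD_Resolution>
</gmd:spatialResolution>
"""

SCALE_XML = f"""
<gmd:spatialResolution xmlns:gmd="{NS['gmd']}" xmlns:gco="{NS['gco']}">
  <gmd:MD_Resolution>
    <gmd:equivalentScale>
      <gmd:MD_RepresentativeFraction>
        <gmd:denominator>
          <gco:Integer>1000000</gco:Integer>
        </gmd:denominator>
      </gmd:MD_RepresentativeFraction>
    </gmd:equivalentScale>
  </gmd:MD_Resolution>
</gmd:spatialResolution>
"""


def read_spatial_resolution(xml_text):
    """Return (kind, value, unit) for a gmd:spatialResolution element.

    kind is "distance" (value is a float, unit is the uom URI) or
    "equivalentScale" (value is the integer denominator, unit is None).
    Returns None if neither form is present.
    """
    root = ET.fromstring(xml_text)
    distance = root.find(".//gmd:distance/gco:Distance", NS)
    if distance is not None:
        return ("distance", float(distance.text), distance.get("uom"))
    denominator = root.find(
        ".//gmd:equivalentScale/gmd:MD_RepresentativeFraction"
        "/gmd:denominator/gco:Integer",
        NS,
    )
    if denominator is not None:
        return ("equivalentScale", int(denominator.text), None)
    return None
```

Note that the result already has three separate parts (kind, value, 
unit), which is the separation the questions below are about.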


Property dct:conformsTo (or a specific subproperty to be defined) can be 
used to specify the spatial resolution of a dataset / distribution, but 
three things are missing:

1. How to model the notion itself of (spatial) resolution.

2. How to express in RDF quantity values (e.g., 1m, 2km, 3s, 4h, 5l) and 
fractions.

3. How to glue #1 and #2 together.

Actually, solutions exist to address point #2 - such as the QUDT 
vocabulary [1], mentioned during our first joint call [2]. But, to the 
best of my knowledge, there's currently no best practice on how to use 
them.

This situation is also the reason why in GeoDCAT-AP the decision taken 
was to dump spatial resolution into a free-text field - a provisional 
"mapping" meant to be replaced in the future with a more appropriate 
approach.


So, looking at DQV, I wonder whether dqv:QualityMeasure (and the related 
properties and classes) is generic enough to also model this 
information. E.g. (just trying):


a:Dataset dqv:hasQualityMeasure [ a dqv:QualityMeasure ;
   dqv:hasMetric :spatialResolutionAsEquivalentScale ;
   dqv:value "0.000001"^^xsd:decimal ] .


another:Dataset dqv:hasQualityMeasure [ a dqv:QualityMeasure ;
   dqv:hasMetric :spatialResolutionAsDistanceInMetres ;
   dqv:value "1000"^^xsd:decimal ] .


I'm not sure this is correct. In particular, it is unclear to me whether 
this is the correct way (in DQV) of modelling the notions of resolution, 
distance / equivalent scale, and units of measurement. In the examples 
above, they are all merged together in one instance of dqv:Metric - 
which, besides resulting in a strange N-headed beast (formally 
speaking), is not scalable: every combination of resolution type and 
unit would need its own metric.
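
To make the arithmetic behind the two dqv:value literals explicit, here 
is a small Python sketch (just trying, as above). The helper names 
scale_as_decimal and quality_measure are hypothetical, and the output 
simply mirrors the tentative Turtle above, not an agreed DQV pattern:

```python
from decimal import Decimal


def scale_as_decimal(denominator):
    """Return 1/denominator as a plain decimal string (no exponent).

    Fixed-point formatting is used because xsd:decimal does not allow
    exponent notation such as "1E-7". Fractions that do not terminate
    (e.g. 1/3) are rounded to the default Decimal precision.
    """
    return format(Decimal(1) / Decimal(denominator), "f")


def quality_measure(subject, metric, value):
    """Render one tentative quality measure in the Turtle shape used above."""
    return (
        f"{subject} dqv:hasQualityMeasure [ a dqv:QualityMeasure ;\n"
        f"   dqv:hasMetric {metric} ;\n"
        f'   dqv:value "{value}"^^xsd:decimal ] .'
    )


# Equivalent scale 1:1,000,000 expressed as a decimal fraction.
print(quality_measure("a:Dataset",
                      ":spatialResolutionAsEquivalentScale",
                      scale_as_decimal(1000000)))

# Distance of 1,000 m; the unit is only implied by the metric name.
print(quality_measure("another:Dataset",
                      ":spatialResolutionAsDistanceInMetres",
                      "1000"))
```

The second call shows the problem: the unit (metres) appears nowhere in 
the data, only in the name of the metric.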


A final (general) note:

In my understanding, spatial (as well as temporal) resolution can be 
considered a specific type of data granularity. From this perspective, 
and in order to ensure consistency and interoperability, it would be 
desirable to have a DQV-based approach to modelling the general notion 
of granularity, which could then be used as a basis for specific types 
(such as spatial / temporal resolution).


Cheers,

Andrea

----
[1]http://www.qudt.org/
[2]https://www.w3.org/2016/02/17-sdw-minutes


On 07/03/2016 08:08, Antoine Isaac wrote:
> Dear Phil, Linda,
>
> Thanks a lot for this. This is in fact quite an important requirement;
> I've flagged it as an issue at
> https://www.w3.org/2013/dwbp/track/issues/243
>
> It may however take some time to come back to you, as we still have many
> issues. Actually we had granularity in scope, when we started with DQV.
> But this was downplayed as the DWBP requirements were very vague then.
> Do you have some precise examples from SDW, i.e. showing what data would
> look like, and its problems?
>
> Best,
>
> Antoine
>
> On 3/3/16 10:46 AM, Phil Archer wrote:
>> Antoine, Riccardo,
>>
>> As Antoine will recall, the Spatial Data WG, here represented by
>> Linda, has a particular interest in the DQV. An issue that comes up a
>> lot in spatial datasets is that of precision and accuracy (the fact
>> that Magna Carta was signed in 1215 is accurate, just not very
>> precise, saying it was signed at 1215-06-15T00:00:00 is precise but
>> inaccurate). It occurs in general datasets too but it's particularly
>> acute for spatial.
>>
>> On last night's SDW call, I was asked to put you in touch with Linda
>> specifically to talk about this, in particular, how you might express
>> these ideas in the DQV?
>>
>>
>> Process note: I'm archiving this in the SDW's public comment list to
>> avoid having to sign you all up to yet another mailing list.
>>
>> For tracker this is ACTION-149
>>
>

-- 
Andrea Perego, Ph.D.
Scientific / Technical Project Officer
European Commission DG JRC
Institute for Environment & Sustainability
Unit H06 - Digital Earth & Reference Data
Via E. Fermi, 2749 - TP 262
21027 Ispra VA, Italy

https://ec.europa.eu/jrc/

Received on Wednesday, 9 March 2016 10:23:37 UTC