Re: [sdw] modeling units on properties instead of results [SOSA/SSN] (#1267)

I fear more than one issue is being conflated here. I'd like to focus on the one that started the thread. 

My understanding of the motivation for this thread is the observation that, in many or most datasets stored in traditional databases and spreadsheets, the unit-of-measure is included in the 'column heading' for a list of values, thus apparently binding the unit-of-measure to the observed-property. OTOH, in O&M and SOSA the unit-of-measure has more often been bound to the quantity-value - i.e. the 'cell' rather than the 'column'. While this is conceptually OK (1" is the same value as 25.4mm), it looks inefficient and does not seem to match normal practice. 

In a concrete example: a 'direct serialization' using SOSA seems to require 

```turtle
# Example 1
my:Observation99 a sosa:Observation ;
    sosa:observedProperty <PbConcentration> ;
    sosa:hasSimpleResult "3.2 [ppm]"^^cdt:ucum .
```

while a more direct representation of a "cell from a typical spreadsheet" would be more like 

```turtle
# Example 2
my:Observation99 a sosa:Observation ;
    sosa:observedProperty <PbConcentrationInPPM> ;
    sosa:hasSimpleResult "3.2"^^xsd:decimal .
```

where *PbConcentrationInPPM* essentially implements the heading from the 'lead' column (details of the ObservableProperty formulation not shown). 

So the proposal is to 'allow' the unit-of-measure to be found in other places, _at least in the serialization_: for example, in the description of an observed-property, as shown in Example 2, or maybe just associated directly with an observation, like this:

```turtle
# Example 3
my:Observation99 a sosa:Observation ;
    sosa:observedProperty <PbConcentration> ;
    sosa:hasSimpleResult "3.2"^^xsd:decimal ;
    qudt:unit <http://qudt.org/vocab/unit/PPM> .
```

Both Example 2 and 3 seem to offend some rather extreme/purist interpretation of the SOSA model. 

However, it is pretty easy to write rules or axioms to move the key information from wherever you find it into your preferred slot. And while a conceptual purist might insist that the unit-of-measure is only associated with the value of the result, the pragmatist would recognise that there may be temporary efficiencies gained by putting it in another place, and also it would be less surprising to typical scientists and data managers. 

My suggestion is to allow all of these forms, but to push back onto the domains and communities the responsibility to document their preferred pattern or practice (and maybe provide code to transform between them). 
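To illustrate how small such a transformation can be, here is a rough Python sketch that reduces the three patterns from Examples 1-3 to a single (property, value, unit) form. It works on plain dicts rather than a real RDF library, and the function name `normalise` and the two lookup tables are made up for this illustration; a community would supply its own mappings.

```python
from decimal import Decimal

# Hypothetical lookup (assumption): the base property and UCUM unit implied
# by a "unit in the column heading" property, as in Example 2.
PROPERTY_UNITS = {
    "PbConcentrationInPPM": ("PbConcentration", "[ppm]"),
}

# Hypothetical lookup (assumption): QUDT unit IRI to UCUM code, for Example 3.
QUDT_TO_UCUM = {
    "http://qudt.org/vocab/unit/PPM": "[ppm]",
}


def normalise(obs):
    """Return (property, value, ucum_unit) whichever pattern obs uses."""
    prop = obs["observedProperty"]
    result = obs["hasSimpleResult"]
    if "unit" in obs:
        # Example 3: unit attached directly to the observation.
        return prop, Decimal(result), QUDT_TO_UCUM[obs["unit"]]
    if prop in PROPERTY_UNITS:
        # Example 2: unit implied by the observed property.
        base, unit = PROPERTY_UNITS[prop]
        return base, Decimal(result), unit
    # Example 1: unit embedded in the result literal, e.g. "3.2 [ppm]".
    value, unit = result.split(" ", 1)
    return prop, Decimal(value), unit


example1 = {"observedProperty": "PbConcentration",
            "hasSimpleResult": "3.2 [ppm]"}
example2 = {"observedProperty": "PbConcentrationInPPM",
            "hasSimpleResult": "3.2"}
example3 = {"observedProperty": "PbConcentration",
            "hasSimpleResult": "3.2",
            "unit": "http://qudt.org/vocab/unit/PPM"}

# All three patterns reduce to the same normalised tuple.
assert normalise(example1) == normalise(example2) == normalise(example3)
```

The same logic could equally be written as SPARQL CONSTRUCT queries or rules over the actual graph; the point is only that each pattern carries enough information to recover the others.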

Of course the issue only arises in the first place if you think of the RDF representation as some kind of persistent artefact, rather than just something that is built on-the-fly from whatever datastore you are accessing. But we've all been trapped there sometimes ;-) 

-- 
GitHub Notification of comment by dr-shorthair
Please view or discuss this issue at https://github.com/w3c/sdw/issues/1267#issuecomment-869427245 using your GitHub account



Received on Monday, 28 June 2021 07:22:13 UTC