Re: Units of Measure (BP, Coverages, SSN,Time?)

From: Rob Atkinson <rob@metalinkage.com.au>
Date: Sat, 02 Jul 2016 03:31:47 +0000
Message-ID: <CACfF9Ly-dHOFc50tDwpeB1rQsrJTiG1i-M1aDyfvmrxi3Cf1Hw@mail.gmail.com>
To: Jon Blower <j.d.blower@reading.ac.uk>, Rob Atkinson <rob@metalinkage.com.au>, "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, "m.riechert@reading.ac.uk" <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
Hi Jon

The encoding scheme issue raises a duality between class and instance - any
UoM could be expressed as either an instance (with SKOS encoding as a
natural default) or a Class - RDFS or OWL being the default options. In
addition a meta-model of UoM could be defined in RDFS or OWL and used to
drive encodings of instances.

Personally, I think that in the Web we should specify that a URI is used if
one is available - and that an encoding of its details may be used as
annotation. In the case of an "anonymous" UoM, then the encoding will still
probably need to reference base units using URIs.

The wrinkles are whether URIs are explicit, or encoded as items in a
namespace - and whether any encoding scheme (model) may be used or one is
recommended, and whether the model itself needs to be explicitly referenced
(presumably this applies to JSON-LD, RDFa etc., as RDF will always use URIs
to specify the model elements anyway).

A worked example set could cover:
1) just a URI from a well-known vocabulary (UCUM)
2) an encoded UoM with one URI, and a simple label
3) ditto, with a more complex set of details
4) ditto, with more than one URI (e.g. UCUM and QUDT)
5) a blank/anonymous encoded UoM with base measures.
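
To make the five cases above concrete, here is a rough sketch that builds one JSON-LD measurement per case. It is only an illustration under stated assumptions: the QUDT namespace, the `qudt:DegreeCelsius` identifier, the UCUM scheme URI and the purl.oclc.org Celsius URI are copied from earlier messages in this thread and have not been checked against the vocabularies themselves, and everything in the `ex:` namespace (including `ex:numerator` and `ex:denominator`) is hypothetical.

```python
import json

# Namespaces and unit URIs are copied from this thread where possible.
# Everything under example.org is hypothetical and only illustrative.
CONTEXT = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "qudt": "http://qudt.org/schema/qudt#",
    "ex": "http://example.org/uom#",
}
UCUM_SCHEME = "http://www.opengis.net/def/uom/UCUM/"
UCUM_CELSIUS = "http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius"
QUDT_CELSIUS = "qudt:DegreeCelsius"


def ucum_symbol(code):
    """Typed literal: a UCUM code tagged with the UCUM scheme URI as datatype."""
    return {"@type": UCUM_SCHEME, "@value": code}


def measurement(value, unit):
    """Wrap a numeric value and a unit description in one JSON-LD document."""
    return {"@context": CONTEXT, "rdf:value": value, "qudt:unit": unit}


def build_examples():
    """Return one measurement per case, in the order of the list above."""
    # 1) just a URI
    uri_only = {"@id": UCUM_CELSIUS}
    # 2) one URI plus a simple label
    uri_and_label = {"@id": UCUM_CELSIUS, "rdfs:label": "degree Celsius"}
    # 3) one URI plus a richer description, including a typed UCUM symbol
    uri_and_details = {
        "@id": UCUM_CELSIUS,
        "rdfs:label": "degree Celsius",
        "skos:prefLabel": {"@value": "Degree Celsius", "@language": "en"},
        "qudt:symbol": ucum_symbol("Cel"),
    }
    # 4) more than one URI: a QUDT identifier mapped to the UCUM one
    two_uris = {
        "@id": QUDT_CELSIUS,
        "rdfs:label": "degree Celsius",
        "skos:exactMatch": {"@id": UCUM_CELSIUS},
    }
    # 5) anonymous unit (no @id) that references base units by URI;
    #    ex:numerator / ex:denominator are made-up properties
    anonymous = {
        "rdfs:label": "metre per second",
        "qudt:symbol": ucum_symbol("m/s"),
        "ex:numerator": {"@id": "ex:metre"},
        "ex:denominator": {"@id": "ex:second"},
    }
    units = [uri_only, uri_and_label, uri_and_details, two_uris, anonymous]
    values = [27.5, 27.5, 27.5, 27.5, 3.2]
    return [measurement(v, u) for v, u in zip(values, units)]


if __name__ == "__main__":
    # Show the anonymous case, since it is the least obvious one.
    print(json.dumps(build_examples()[4], indent=2))
```

A simple client can display `rdfs:label` or the symbol without resolving anything, while a client that understands UCUM can parse the typed symbol. Only case 5 needs a model for composing units, which is where the choice of meta-model matters.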

Would we go so far as to recommend QUDT as the meta-model (as per the example
provided) - or simply list a few in use and provide a couple of examples?

This will cover the "follow-your-nose" cases - however there is the case of
a data encoding where the UoM is specified in metadata. The question here
then is defining a BP for this metadata.
One option - we can use RDF-QB to define data structures and relevant UoM.
I'm not sure there is an obvious alternative to ad-hoc metadata models and
UoM specified in any non-interoperable way that emerges.
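
As a rough sketch of the RDF-QB option, the snippet below builds a minimal Data Cube description in which the unit is declared once, as a dataset-level attribute, rather than repeated on every value. The `qb:` and `sdmx-attribute:` terms are from the RDF Data Cube vocabulary; the `ex:` identifiers and the choice of unit URI (the UCUM Celsius URI quoted elsewhere in this thread) are assumptions for illustration only, and a real structure definition would also declare dimensions.

```python
import json

QB = "http://purl.org/linked-data/cube#"
SDMX_ATTRIBUTE = "http://purl.org/linked-data/sdmx/2009/attribute#"
EX = "http://example.org/coverage#"  # hypothetical namespace for this sketch
CONTEXT = {"qb": QB, "sdmx-attribute": SDMX_ATTRIBUTE, "ex": EX}


def structure_definition(dsd_id, measure_id):
    """Declare one measure plus a unit attribute attached at dataset level."""
    return {
        "@id": dsd_id,
        "@type": "qb:DataStructureDefinition",
        "qb:component": [
            {"qb:measure": {"@id": measure_id}},
            {
                "qb:attribute": {"@id": "sdmx-attribute:unitMeasure"},
                "qb:componentAttachment": {"@id": "qb:DataSet"},
            },
        ],
    }


def dataset(ds_id, dsd_id, unit_uri):
    """A dataset that follows the structure and states its unit once."""
    return {
        "@id": ds_id,
        "@type": "qb:DataSet",
        "qb:structure": {"@id": dsd_id},
        "sdmx-attribute:unitMeasure": {"@id": unit_uri},
    }


def describe(unit_uri):
    """Bundle structure and dataset into a single JSON-LD graph."""
    dsd = structure_definition("ex:dsd", "ex:airTemperature")
    ds = dataset("ex:dataset", "ex:dsd", unit_uri)
    return {"@context": CONTEXT, "@graph": [dsd, ds]}


if __name__ == "__main__":
    unit = "http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius"
    print(json.dumps(describe(unit), indent=2))
```

The same metadata could describe data held in a compact, non-RDF encoding: the structure definition says what the values are and in which unit, while the values themselves stay in their existing format.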

This option then speaks directly to the coverages metadata perspective
(encoding of data using RDF-QB becomes a trivial case - we simply state that
if the encoding is RDF, then BP would be to use an RDF-QB encoding consistent
with the RDF-QB metadata for the set - and the interesting and more generally
useful case is describing an existing or compact encoding usefully).

Rob

On Sat, 2 Jul 2016 at 02:20 Jon Blower <j.d.blower@reading.ac.uk> wrote:

> Hi Rob – yes, I think those are the missing bits, but, just to reiterate,
> it may not be (just) a “vocabulary” that we need (in the sense of a set of
> URIs), but a serialisation scheme for any unit.
>
>
>
> For concrete examples, we should look at where we need to use units. I
> think we have:
>
>
>
> 1.       As part of coordinate systems and coordinate reference systems
>
> 2.       As part of measured quantities (e.g. the range of a coverage),
> linked to observed properties etc
>
> 3.       …
>
>
>
> My last paragraph wasn’t very clear, sorry. I was trying to say that the
> different uses (coordinate systems, observed properties) might actually
> have different best practices in terms of the encoding of their units. We
> could feasibly decide that coordinate system units are best expressed as
> URIs, but the units of observed properties are better expressed as strings
> in a named serialisation scheme (like UCUM). Maybe, I don’t know – just
> raising the possibility.
>
>
>
> Cheers,
> Jon
>
>
>
>
>
> *From: *Rob Atkinson <rob@metalinkage.com.au>
> *Date: *Friday, 1 July 2016 14:39
> *To: *Jon Blower <sgs02jdb@reading.ac.uk>, Rob Atkinson <
> rob@metalinkage.com.au>, "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, Maik
> Riechert <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <
> public-sdw-wg@w3.org>
>
>
> *Subject: *Re: Units of Measure (BP, Coverages, SSN,Time?)
>
>
>
> This is the type of recommendation I think we need. Let's refine... the
> missing bits are:
> 1 guidance on what vocabulary.. even noting that different communities use
> different ones and naming them is a help.
> 2 provision of mappings if you want to interoperate across community
> choice here.. do you embed multiple URIs, or provide some sort of sameAs
> service?
> 3 concrete examples
>
> I don't quite follow the final paragraph and the implications for what the
> encoding would look like?
>
> Rob
>
>
>
> On Fri, 1 Jul 2016 11:12 am Jon Blower <j.d.blower@reading.ac.uk> wrote:
>
> Just to add a little to this – units of measure are very tricky in
> general. The overall requirement, I think, is to have an unambiguous
> serialisation scheme for units, including both base units (the easy cases)
> and the infinite number of derived units (the hard cases) – that is to say,
> a spec for serialising units to ASCII strings. This allows clients to
> convert between units, which is a primary use case for having “strongly
> typed” units.
>
>
>
> In terms of serialisations, I’m aware of UCUM and UDUNITS (the latter is
> used extensively in climate/met/ocean and is connected with CF). I don’t
> think either are perfect in terms of governance, and I’m not even sure that
> UDUNITS has a formal spec.
>
>
>
> Then there are URIs. QUDT has URIs for a lot of base and derived units,
> but it can’t possibly have them all, hence the need for a scheme that
> allows any unit to be serialised. So there will always be gaps, but I note
> that QUDT covers a lot of the common cases I can think of – so it’s not
> clear to me how important the gaps are.
>
>
>
> Typical clients will just want to display the symbol for the unit, so we
> should make sure that, if we use URIs, we also transmit the symbol, as I
> doubt that a typical web client will want to resolve the URI and look up
> the symbol. This is effectively what Maik is doing, by transmitting the
> symbol plus a URI for the unit **scheme** rather than a URI for the unit
> itself.
>
>
>
> (Question – does QUDT use UCUM as a means of generating the unit symbol?)
>
>
>
> There are a few tricky cases in science – e.g. salinity, which strictly
> has no units and is a very weird kind of quantity – and sometimes these
> tricky cases lead to poor practice in real data files – i.e. expressing
> units incorrectly or inconsistently. (and of course, poor practice can
> happen in real-world data files anywhere).
>
>
>
> I think an overall BP recommendation would be:
>
>
>
> 1.       Express units unambiguously if possible, using a named unit
> serialisation scheme or URI.
>
> 2.       Give the unit symbol, and perhaps a longer explanatory text
> string (e.g. a rdfs:label), to help simple clients understand the unit,
> even if they don’t want to resolve the full unit description.
>
> 3.       Also allow users to record “ad hoc” unit strings for fallback
> cases that don’t fit well with existing serialisation or URI schemes,
> making it clear that these are not really machine-understandable
>
>
>
> There may be cases where we can refine this further depending on the use
> case. For example, in CRS definitions, which tend to use simple units, it’s
> probably desirable to use well-known URIs to represent units. For recording
> the units of a measured quantity (e.g. the range of the coverage), I like
> methods like the one Maik suggested, as this maps more neatly to common
> practice in my community.
>
>
>
> Cheers,
>
> Jon
>
>
>
>
>
> *From: *Rob Atkinson <rob@metalinkage.com.au>
> *Date: *Friday, 1 July 2016 08:46
> *To: *"Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, "rob@metalinkage.com.au"
> <rob@metalinkage.com.au>, Maik Riechert <m.riechert@reading.ac.uk>, "
> public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
>
>
> *Subject: *Re: Units of Measure (BP, Coverages, SSN,Time?)
>
> *Resent-From: *<public-sdw-wg@w3.org>
> *Resent-Date: *Friday, 1 July 2016 08:47
>
>
>
> Perfect Simon - thanks.
>
> It's not that obvious trawling the docs what the pragmatic aspects are.
>
>
>
> So I would suggest then that a BP endorsed by OGC would have a minimum
> requirement that a mapping to UCUM is provided for any vocabulary used for
> UoM, to provide for compatibility with existing recommendations (can we
> call these BP?)
>
>
>
> If it helps I could set up an OGC resource for UCUM - with redirects for
> specific terms - instead of to the containing spec (that's the way UCUM
> works) - or to a SKOS resource with skos:exactMatch relationships to the
> UCUM terms.  I can also deploy a crosswalk to UCUM from another UoM vocab
> if we decide to recommend it.
>
>
>
> The ongoing governance of such a resource in the context of the BP can be
> taken up as an action from the SDW to the OGC (what is the appropriate point
> of contact here? NA, OAB, TC, PC?)
>
>
>
> Rob
>
>
>
> On Fri, 1 Jul 2016 at 16:10 <Simon.Cox@csiro.au> wrote:
>
> > If OGC has adopted UCUM as a BP (can someone make a definitive
> > statement on this …
>
>
>
> OGC’s endorsement of UCUM comes from
>
> 1.      It is recommended in WMS [1]
>
> 2.      Ditto GML [2]
>
> 3.      There is a branch of the www.opengis.net/def/ URI set for UCUM -
> http://www.opengis.net/def/uom/UCUM/ - but it just redirects to the UCUM spec
> [3]
>
>
>
> But that is purely pragmatic, as it seemed to be the best thing around at
> the time.
>
> It has a fragile governance arrangement, and URIs are not
> de-referenceable.
>
>
>
> [1] http://www.opengeospatial.org/standards/wms version 1.3 clause C.2.
>
> [2] http://www.opengeospatial.org/standards/gml v3.2.1 clause 8.2.3.6
>
> [3] http://unitsofmeasure.org/ucum.html
>
>
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Friday, 1 July 2016 1:46 AM
> *To:* Maik Riechert <m.riechert@reading.ac.uk>; Rob Atkinson <
> rob@metalinkage.com.au>; SDW WG Public List <public-sdw-wg@w3.org>
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
>
>
>
> Thanks Maik,
>
>
>
> If I read this right, this example assumes the client understands qudt -
> then uses the semantics of qudt:symbol to map instances (Cel)  in another
> namespace to this.  UCUM uses
> http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius as the
> id - but the information to map to that is not present. Is "Cel" just a
> dummy example - would you actually want to say "degree-Celsius" - and in
> turn want the OGC redirect to respect that and redirect
>
> http://www.opengis.net/def/uom/UCUM/degree-Celsius to
> http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius?
>
>
>
> What about the original assumption of using QUDT - why not use UCUM or
> another in the first instance? Coming from the outside and trying to
> identify a best practice, what exactly is this example saying?
>
>
>
> If OGC has adopted UCUM as a BP (can someone make a definitive statement
> on this - it should be present in the BP when we talk about vocabulary
> re-use - a list of vocabularies in use in the OGC space) then we should
> start with that perhaps? If we are saying the BP requirement is to allow an
> emerging body of QUDT usage to interoperate then we need perhaps to
> recommend publishing the mappings as a resource - whatever we think is BP
> we need to communicate clearly to the average user who won't have years of
> exposure to the history and details to draw on - and will most likely
> simply want to maximise interoperability of a few cases.
>
>
>
> Cheers
>
> Rob
>
>
>
> On Fri, 1 Jul 2016 at 01:00 Maik Riechert <m.riechert@reading.ac.uk>
> wrote:
>
> Hi Rob,
>
> I just wanted to throw in a slightly different/complementary view on this.
>
> While it is useful to have URIs for any kind of unit, I think it is even
> more useful to have a symbolic coding in a certain coding scheme for those
> units, because then clients with support for that scheme can easily parse
> the unit, and transform it and the associated numbers. One scheme example
> is UCUM (http://unitsofmeasure.org/ucum.html). OGC gave it a URI as well:
> http://www.opengis.net/def/uom/UCUM/
>
> In my opinion you would have something like that (JSON-LD):
>
> {
>   "@context": {
>     "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>     "qudt": "http://qudt.org/schema/qudt#",
>     "skos": "http://www.w3.org/2004/02/skos/core#"
>   },
>   "rdf:value": 27.5, // for example purposes only
>   "qudt:unit": {
>     "@id": "qudt:DegreeCelsius",
>     "skos:prefLabel": { "en": "Degree Celsius" },
>     "qudt:symbol": {
>       "@type": "http://www.opengis.net/def/uom/UCUM/",
>       "@value": "Cel"
>     }
>   }
> }
>
> So the main point is that the value of "qudt:symbol" has a custom data
> type, in this case http://www.opengis.net/def/uom/UCUM/.
>
> Cheers
>
>
> Maik
>
>
>
> Am 30.06.2016 um 15:14 schrieb Rob Atkinson:
>
> Hi,
>
>
>
> I'm looking into the BP aspects around defining data dimensions as a
> framework for evaluating and contributing to various SDW threads. One issue
> seems to cut across them, but I haven't seen an explicit treatment of it: the
> UoM problem. I know I may have missed previous conversations - but I don't see
> any treatment in the current reviewable docs.
>
>
>
> Specifically, if I was to follow the W3C Data on the Web Best Practices I
> would be led via BP #2
>
>
>
> "To express frequency of update an instance from the Content-Oriented
> Guidelines developed as part of the W3C Data Cube Vocabulary efforts was
> used."
>
>
>
> to this statement:
>
> "To express the value of this attribute we would typically use a common
> thesaurus of units of measure. For the sake of this simple example we will
> use the DBpedia resource http://dbpedia.org/resource/Year which
> corresponds to the topic of the Wikipedia page on "Years"."
>
>
>
> If we have a Time ontology - surely we would be pointing to that as a
> recommendation for temporal units of measure.
>
> Likewise, I would have thought that OGC would have an interest in binding
> CRS with their in-built units of measure to spatial dimensions.
>
> One could argue that without interoperability at this level there is a
> question why the OGC would have any involvement in Web standards - but if
> there is a counter-argument then I feel this needs to be front-and-centre
> of the BP to explain to a potential user what they can expect, and where
> they are going to be left with making all the significant decisions.
>
>
>
> If we have Time and CRS UoM, then we may be able to get away with not
> specifying a vocabulary for other UoM for measurements. Are there any
> obvious dimensions that need UoM vocabularies?
>
>
>
> When I specify O&M profiles, (my driving use case), I'll need to specify
> the UoM for measurements - is there any recommendation regarding which
> vocabulary to choose?   And for CRS based dimensions?
>
>
>
> Rob Atkinson
>
>
>
>
>
>
>
>
Received on Saturday, 2 July 2016 03:32:34 UTC
