- From: Rob Atkinson <rob@metalinkage.com.au>
- Date: Thu, 07 Jul 2016 21:56:24 +0000
- To: Linda van den Brink <l.vandenbrink@geonovum.nl>, Rob Atkinson <rob@metalinkage.com.au>, Jon Blower <j.d.blower@reading.ac.uk>, "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, "m.riechert@reading.ac.uk" <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
- Message-ID: <CACfF9LxsoVuDDT2ou3Ncm2tsKPHWU+dRDQkttD7Wp9_jdVcbBg@mail.gmail.com>
I'll put the conversation into this format. I'll add some placeholders for volunteers to fill in with worked examples of what they think are BP implementations and important, illustrative exemplar cases.

I think, however, that this is another example where no practice could be recommended that does not include model/profile negotiation (distinct from content negotiation, which has been given a very narrow scope). The reason is that there is no perfect, well-governed and agreed model, or list of possible units (two separate requirements), and that both need to co-exist. So any practice has to build in the mechanism either to migrate to an emerging standard or to allow support for multiple competing solutions.

Or, to put it another way: all the incredibly hard problems around different UoM systems and finding a BP recommendation are simplified by a BP that allows for content models. If we are going to have a general statement about this in the wider BP, the UoM case can reference it. We don't need to overspecify the mechanism here, but warning people that such a capability is a longer-term requirement can usefully guide implementation.

Rob

On Fri, 8 Jul 2016 at 00:14 Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:

> Hi – just trying to get through the SDW email.
>
> When I apply the template we use in the BP it would be like this:
>
> Name of the BP: **Use a URI identifier for UoM** (or a bit better worded)
>
> **why** … a problem description I could probably get somewhere from this thread
>
> **Intended Outcome** data user can look up the URI and get information about the UoM
>
> **possible approach to implementation** recommended representations include QUDT, SKOS, UCUM, OWL-class?, any standard relevant to the community of practice.
>
> I would very much appreciate it if starters of threads would make summaries like the above…
>
> Content negotiation is a neat subject but not specific to spatial.
> I don’t think we should tackle this problem in the BP, or am I missing something?
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Tuesday 5 July 2016 00:33
> *To:* Jon Blower; Simon.Cox@csiro.au; rob@metalinkage.com.au; m.riechert@reading.ac.uk; public-sdw-wg@w3.org
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
>
> Thanks for the insights Simon.
>
> It will take some care to turn this into a best practice recipe that doesn't get broken immediately IMHO.
>
> We can get out of jail from an engineering perspective by saying you should use a URI identifier for UoM that allows content negotiation to access one or more representations.
>
> Recommended representations include:
>
> 1) QUDT structural description
> 2) SKOS as a canonical means to describe labels and provide links to alternative codes
> 3) UCUM specification if relevant for the UoM
> 4) OWL-class ?
> 5) Any representations defined by standards organisations relevant to the community of practice
>
> (Content negotiation can be driven by MIME type in headers or by explicit view parameters. We need a separate BP around this that encompasses the UK and other LDA examples. It's a pattern that generally allows us to take on a de facto option and migrate to a de jure standard when it evolves, which we see as the most common pattern just about everywhere. We also need either to specify a set of views and their corresponding OWL models, or a way to bind any view to its relevant OWL model in a general way.)
>
> We can further recommend the UCUM URI structure.
>
> If necessary we can deploy such representations - I don't mind taking on the deployment using the URI redirection machinery I have deployed at resources.opengeospatial.org. I would prefer someone to provide some endorsed representations - HTML, JSON-LD, RDF - for QUDT, SKOS and OWL-class.
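
As a rough illustration of the negotiation pattern described above (MIME type in the headers, or an explicit view parameter), here is a minimal Python sketch. The media-type registry, the view names (`qudt`, `skos`, `ucum`) and the function names are all assumptions made for the example; nothing here reflects an agreed set of views or any deployed service.

```python
# Hypothetical registry: media type -> serialisation name (assumed for this sketch).
REPRESENTATIONS = {
    "text/html": "html",
    "application/ld+json": "json-ld",
    "text/turtle": "turtle",
}
VIEWS = {"qudt", "skos", "ucum"}  # assumed view (model/profile) names
DEFAULT_VIEW = "qudt"


def parse_accept(header):
    """Return the acceptable media types from an Accept header, best q-value first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            ranked.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(accept_header, view=None):
    """Pick (view, serialisation) for a unit URI; an explicit known view wins."""
    chosen_view = view if view in VIEWS else DEFAULT_VIEW
    for media_type in parse_accept(accept_header):
        if media_type in REPRESENTATIONS:
            return chosen_view, REPRESENTATIONS[media_type]
        if media_type == "*/*":
            break
    return chosen_view, "html"  # fall back to a human-readable page


print(negotiate("text/turtle;q=0.8, application/ld+json", view="skos"))
```

Keeping the view (which model) separate from the serialisation (which syntax) is the point of the sketch: a de facto model can be served today and a de jure one added later without changing the unit URI.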
>
> Minimum would be for some examples (simple, derived with UCUM equivalent, derived without UCUM equivalent). A complete set would be just as easy to deploy.
>
> On Mon, 4 Jul 2016 at 19:23 Jon Blower <j.d.blower@reading.ac.uk> wrote:
>
> > Ideally we would have a reliable set of URIs for UOMs which could leverage the UCUM algorithm to build the URI, and which would resolve to a QUDT-based representation of the unit of measure.
>
> +1
>
> Is it possible to use the UCUM symbol for the UoM as the URI suffix? Or are there problems like character-encoding issues?
>
> Cheers,
> Jon
>
> *From:* "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>
> *Date:* Monday, 4 July 2016 01:13
> *To:* "rob@metalinkage.com.au" <rob@metalinkage.com.au>, Jon Blower <sgs02jdb@reading.ac.uk>, Maik Riechert <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
> *Subject:* RE: Units of Measure (BP, Coverages, SSN,Time?)
>
> Let's be clear about what QUDT and UCUM actually offer.
>
> QUDT -
>
> - primarily provides a model for descriptions of units of measure, and of quantity-kinds (a.k.a. qualities, or “observable properties”); the model is formalized using OWL, and thus provides an RDF-based syntax for description of a uom or a quantity-kind
> - also provides some lists (called ‘vocabularies’) of individual unit and quantity-kind descriptions, but these are very idiosyncratic and incomplete (they include a whole bunch of currencies!)
> - there are no rules for how the labels or symbols for units are built in the QUDT vocabularies; they are not aligned with the ISO or SI standards (e.g.
>   the label for the unit of length is spelled ‘Meter’, and the symbol for the unit of temperature is ‘degC’), capitalization is inconsistent, and use of non-ASCII characters is variable
> - the maintenance arrangements for QUDT are private (TopQuadrant + NASA) and the publication arrangements are flaky (QUDT v2.0 has been ‘on the way’ for about 3 years, and even though it is linked from the qudt.org website, it has been 404 for over a year).
>
> UCUM –
>
> - Focuses on a rule for how to generate a symbol for a ‘derived uom’
> - uses a rigorous algorithm based on a theory of quantities and dimensional analysis, which starts from any base set of units in a rational system (SI, MKS, cgs, even pounds-feet-seconds if you want!)
> - UCUM provides a base set of symbols corresponding essentially with SI, plus symbols for the standard power-of-ten prefixes (micro/milli/kilo/mega etc). The base set has some fudging to get around the anomaly that the SI base unit for mass (kg) already has a power-of-ten prefix built in.
> - The algorithm and base set of symbols are such that symbols generated following UCUM are aligned with conventional usage, and with ISO 1000
> - There is some additional notation using {} and [] to allow for annotations and ‘conventional’ units, which I always get confused about.
>
> My assessment is that the QUDT Ontology v1.1 is good enough (I was on an Ontolog telecon with Pat Hayes, Ralph Hodgson, Gary Berg-Cross a couple of years ago where that was the clear consensus), but the QUDT vocabularies are not. So we need another set of URIs denoting uoms, with the expectation that dereferencing one of these would result in a QUDT-based representation.
>
> Ideally we would have a reliable set of URIs for UOMs which could leverage the UCUM algorithm to build the URI, and which would resolve to a QUDT-based representation of the unit of measure.
> These representations should be built on-the-fly using the UCUM engine.
>
> Note that, using QUDT, a uom description is an OWL _*individual*_ (not a class), but with complete semantics, still supporting some reasoning. Rob – going with individuals doesn’t mean you have to use SKOS and certainly doesn’t lose semantic precision - probably best not to casually suggest that!
>
> Simon
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Saturday, 2 July 2016 1:32 PM
> *To:* Jon Blower <j.d.blower@reading.ac.uk>; Rob Atkinson <rob@metalinkage.com.au>; Cox, Simon (L&W, Clayton) <Simon.Cox@csiro.au>; m.riechert@reading.ac.uk; public-sdw-wg@w3.org
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
>
> Hi Jon
>
> The encoding scheme issue raises a duality between class and instance - any UoM could be expressed as either an instance (with SKOS encoding as a natural default) or a Class - RDFS or OWL being the default options. In addition a meta-model of UoM could be defined in RDFS or OWL and used to drive encodings of instances.
>
> Personally, I think that in the Web we should specify that a URI is used if one is available - and that an encoding of its details may be used as annotation. In the case of an "anonymous" UoM, the encoding will still probably need to reference base units using URIs.
>
> The wrinkles are whether URIs are explicit, or encoded as items in a namespace - and whether any encoding scheme (model) may be used or one is recommended, and whether the model itself needs to be explicitly referenced (presumably this applies to JSON-LD, RDFa etc., as RDF will always use URIs to specify the model elements anyway).
>
> A worked example set with:
>
> 1) just a URI from a well-known vocabulary (UCUM)
> 2) an encoded UoM with one URI, and a simple label
> 3) ditto, with a more complex set of details
> 4) ditto with more than one URI (e.g.
>    UCUM and QUDT)
> 5) a blank/anonymous encoded UoM with base measures.
>
> Would we go so far as to recommend QUDT as the meta-model (as per the example provided?) - or simply list a few in use and provide a couple of examples?
>
> This will cover the "follow-your-nose" cases - however there is the case of a data encoding where the UoM is specified in metadata. The question here then is defining a BP for this metadata.
>
> One option - we can use RDF-QB to define data structures and relevant UoM. I'm not sure there is an obvious alternative to ad-hoc metadata models and UoM specified in whatever non-interoperable way emerges.
>
> This option then speaks directly to the coverages metadata perspective (encoding of data using RDF-QB becomes a trivial case - we simply state that if RDF encoding is used, then BP would be to use RDF-QB encoding consistent with the RDF-QB metadata for the set, and the interesting and more generally useful case is describing an existing or compact encoding usefully)
>
> Rob
>
> On Sat, 2 Jul 2016 at 02:20 Jon Blower <j.d.blower@reading.ac.uk> wrote:
>
> Hi Rob – yes, I think those are the missing bits, but, just to reiterate, it may not be (just) a “vocabulary” that we need (in the sense of a set of URIs), but a serialisation scheme for any unit.
>
> For concrete examples, we should look at where we need to use units. I think we have:
>
> 1. As part of coordinate systems and coordinate reference systems
> 2. As part of measured quantities (e.g. the range of a coverage), linked to observed properties etc
> 3. …
>
> My last paragraph wasn’t very clear, sorry. I was trying to say that the different uses (coordinate systems, observed properties) might actually have different best practices in terms of the encoding of their units.
> We could feasibly decide that coordinate system units are best expressed as URIs, but the units of observed properties are better expressed as strings in a named serialisation scheme (like UCUM). Maybe, I don’t know – just raising the possibility.
>
> Cheers,
> Jon
>
> *From:* Rob Atkinson <rob@metalinkage.com.au>
> *Date:* Friday, 1 July 2016 14:39
> *To:* Jon Blower <sgs02jdb@reading.ac.uk>, Rob Atkinson <rob@metalinkage.com.au>, "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, Maik Riechert <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
>
> This is the type of recommendation I think we need. Let's refine... the missing bits are:
>
> 1 guidance on what vocabulary.. even noting that different communities use different ones and naming them is a help.
> 2 provision of mappings if you want to interoperate across community choice here.. do you embed multiple URIs, or provide some sort of sameAs service?
> 3 concrete examples
>
> I don't quite follow the final paragraph and the implications for what the encoding would look like?
>
> Rob
>
> On Fri, 1 Jul 2016 11:12 am Jon Blower <j.d.blower@reading.ac.uk> wrote:
>
> Just to add a little to this – units of measure are very tricky in general. The overall requirement, I think, is to have an unambiguous serialisation scheme for units, including both base units (the easy cases) and the infinite number of derived units (the hard cases) – that is to say, a spec for serialising units to ASCII strings. This allows clients to convert between units, which is a primary use case for having “strongly typed” units.
>
> In terms of serialisations, I’m aware of UCUM and UDUNITS (the latter is used extensively in climate/met/ocean and is connected with CF). I don’t think either is perfect in terms of governance, and I’m not even sure that UDUNITS has a formal spec.
>
> Then there are URIs. QUDT has URIs for a lot of base and derived units, but it can’t possibly have them all, hence the need for a scheme that allows any unit to be serialised. So there will always be gaps, but I note that QUDT covers a lot of the common cases I can think of – so it’s not clear to me how important the gaps are.
>
> Typical clients will just want to display the symbol for the unit, so we should make sure that, if we use URIs, we also transmit the symbol, as I doubt that a typical web client will want to resolve the URI and look up the symbol. This is effectively what Maik is doing, by transmitting the symbol plus a URI for the unit **scheme** rather than a URI for the unit itself.
>
> (Question – does QUDT use UCUM as a means of generating the unit symbol?)
>
> There are a few tricky cases in science – e.g. salinity, which strictly has no units and is a very weird kind of quantity – and sometimes these tricky cases lead to poor practice in real data files – i.e. expressing units incorrectly or inconsistently. (And of course, poor practice can happen in real-world data files anywhere.)
>
> I think an overall BP recommendation would be:
>
> 1. Express units unambiguously if possible, using a named unit serialisation scheme or URI.
> 2. Give the unit symbol, and perhaps a longer explanatory text string (e.g. an rdfs:label), to help simple clients understand the unit, even if they don’t want to resolve the full unit description.
> 3. Also allow users to record “ad hoc” unit strings for fallback cases that don’t fit well with existing serialisation or URI schemes, making it clear that these are not really machine-understandable
>
> There may be cases where we can refine this further depending on the use case. For example, in CRS definitions, which tend to use simple units, it’s probably desirable to use well-known URIs to represent units.
> For recording the units of a measured quantity (e.g. the range of the coverage), I like methods like the one Maik suggested, as this maps more neatly to common practice in my community.
>
> Cheers,
> Jon
>
> *From:* Rob Atkinson <rob@metalinkage.com.au>
> *Date:* Friday, 1 July 2016 08:46
> *To:* "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>, "rob@metalinkage.com.au" <rob@metalinkage.com.au>, Maik Riechert <m.riechert@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
> *Resent-From:* <public-sdw-wg@w3.org>
> *Resent-Date:* Friday, 1 July 2016 08:47
>
> Perfect Simon - thanks.
>
> It's not that obvious, trawling the docs, what the pragmatic aspects are.
>
> So I would suggest then that a BP endorsed by OGC would have a minimum requirement that a mapping to UCUM is provided for any vocabulary used for UoM, to provide for compatibility with existing recommendations (can we call these BP?)
>
> If it helps I could set up an OGC resource for UCUM - with redirects for specific terms - instead of to the containing spec (that's the way UCUM works) - or to a SKOS resource with skos:exactMatch relationships to the UCUM terms. I can also deploy a crosswalk to UCUM from another UoM vocab if we decide to recommend it.
>
> The ongoing governance of such a resource in the context of the BP can be taken up as an action from the SDW to the OGC (what is the appropriate point of contact here? NA, OAB, TC, PC?)
>
> Rob
>
> On Fri, 1 Jul 2016 at 16:10 <Simon.Cox@csiro.au> wrote:
>
> > If OGC has adopted UCUM as a BP (can someone make a definitive statement on this …
>
> OGC’s endorsement of UCUM comes from
>
> 1. It is recommended in WMS [1]
> 2. Ditto GML [2]
> 3.
>    There is a branch of the www.opengis.net/def/ URI set for UCUM - http://www.opengis.net/def/uom/UCUM/ but it just redirects to the UCUM spec [3]
>
> But that is purely pragmatic, as it seemed to be the best thing around at the time.
>
> It has a fragile governance arrangement, and URIs are not de-referenceable.
>
> [1] http://www.opengeospatial.org/standards/wms version 1.3 clause C.2.
> [2] http://www.opengeospatial.org/standards/gml v3.2.1 clause 8.2.3.6
> [3] http://unitsofmeasure.org/ucum.html
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Friday, 1 July 2016 1:46 AM
> *To:* Maik Riechert <m.riechert@reading.ac.uk>; Rob Atkinson <rob@metalinkage.com.au>; SDW WG Public List <public-sdw-wg@w3.org>
> *Subject:* Re: Units of Measure (BP, Coverages, SSN,Time?)
>
> Thanks Maik,
>
> If I read this right, this example assumes the client understands qudt - then uses the semantics of qudt:symbol to map instances (Cel) in another namespace to this. UCUM uses http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius as the id - but the information to map to that is not present. Is "Cel" just a dummy example - would you actually want to say "degree-Celsius" - and in turn want the OGC redirect to respect that and redirect http://www.opengis.net/def/uom/UCUM/degree-Celsius to http://purl.oclc.org/NET/muo/ucum/unit/temperature/degree-Celsius?
>
> What about the original assumption of using QUDT - why not use UCUM or another in the first instance? Coming from the outside and trying to identify a best practice, what exactly is this example saying?
>
> If OGC has adopted UCUM as a BP (can someone make a definitive statement on this - it should be present in the BP when we talk about vocabulary re-use - a list of vocabularies in use in the OGC space) then we should start with that perhaps?
> If we are saying the BP requirement is to allow an emerging body of QUDT usage to interoperate then we need perhaps to recommend publishing the mappings as a resource - whatever we think is BP we need to communicate clearly to the average user who won't have years of exposure to the history and details to draw on - and will most likely simply want to maximise interoperability of a few cases.
>
> Cheers
> Rob
>
> On Fri, 1 Jul 2016 at 01:00 Maik Riechert <m.riechert@reading.ac.uk> wrote:
>
> Hi Rob,
>
> I just wanted to throw in a slightly different/complementary view on this.
>
> While it is useful to have URIs for any kind of unit, I think it is even more useful to have a symbolic coding in a certain coding scheme for those units, because then clients with support for that scheme can easily parse the unit, and transform it and the associated numbers. One scheme example is UCUM (http://unitsofmeasure.org/ucum.html). OGC gave it a URI as well: http://www.opengis.net/def/uom/UCUM/
>
> In my opinion you would have something like this (JSON-LD; the value 27.5 is for example purposes only):
>
>     {
>       "@context": {
>         "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>         "qudt": "http://qudt.org/schema/qudt#",
>         "skos": "http://www.w3.org/2004/02/skos/core#"
>       },
>       "rdf:value": 27.5,
>       "qudt:unit": {
>         "@id": "qudt:DegreeCelsius",
>         "skos:prefLabel": { "en": "Degree Celsius" },
>         "qudt:symbol": {
>           "@type": "http://www.opengis.net/def/uom/UCUM/",
>           "@value": "Cel"
>         }
>       }
>     }
>
> So the main point is that the value of "qudt:symbol" has a custom data type, in this case http://www.opengis.net/def/uom/UCUM/.
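
To make the "clients with support for that scheme" idea concrete, here is a minimal Python sketch of a client that checks the datatype on `qudt:symbol` before interpreting the symbol. It reads the example as plain JSON rather than running a JSON-LD processor, and the two-entry conversion table is a hand-written stand-in; a real client would presumably delegate to a UCUM library. The function and table names are hypothetical.

```python
import json

UCUM_DATATYPE = "http://www.opengis.net/def/uom/UCUM/"

# The example above, without its @context (the sketch reads the compact keys directly).
DOC = json.loads("""
{
  "rdf:value": 27.5,
  "qudt:unit": {
    "@id": "qudt:DegreeCelsius",
    "skos:prefLabel": {"en": "Degree Celsius"},
    "qudt:symbol": {
      "@type": "http://www.opengis.net/def/uom/UCUM/",
      "@value": "Cel"
    }
  }
}
""")

# Hand-written stand-in for a UCUM engine: UCUM code -> conversion to kelvin.
TO_KELVIN = {
    "Cel": lambda value: value + 273.15,
    "K": lambda value: value,
}


def to_kelvin(doc):
    """Convert the measured value to kelvin if the symbol is typed as UCUM."""
    symbol = doc["qudt:unit"]["qudt:symbol"]
    if symbol.get("@type") != UCUM_DATATYPE:
        raise ValueError("unit symbol is not declared as a UCUM string")
    convert = TO_KELVIN.get(symbol["@value"])
    if convert is None:
        raise ValueError("unsupported UCUM code: " + symbol["@value"])
    return convert(doc["rdf:value"])


print(round(to_kelvin(DOC), 2))
```

The client never dereferences `qudt:DegreeCelsius`; the datatype alone tells it how to parse the symbol, which is the behaviour the message argues for.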
>
> Cheers
> Maik
>
> On 30.06.2016 at 15:14, Rob Atkinson wrote:
>
> Hi,
>
> I'm looking into the BP aspects around defining data dimensions as a framework for evaluating and contributing to various SDW threads. One which seems to cut across them, but which I haven't seen treated explicitly, is the UoM problem. I know I may have missed previous conversations - but I don't see any treatment in the current reviewable docs.
>
> Specifically, if I was to follow the W3C Data on the Web Best Practices I would be led via BP #2
>
> "To express frequency of update an instance from the Content-Oriented Guidelines developed as part of the W3C Data Cube Vocabulary efforts was used."
>
> to this statement:
>
> "To express the value of this attribute we would typically use a common thesaurus of units of measure. For the sake of this simple example we will use the DBpedia resource http://dbpedia.org/resource/Year which corresponds to the topic of the Wikipedia page on "Years"."
>
> If we have a Time ontology - surely we would be pointing to that as a recommendation for temporal units of measure.
>
> Likewise, I would have thought that OGC would have an interest in binding CRS with their in-built units of measure to spatial dimensions.
>
> One could argue that without interoperability at this level there is a question why the OGC would have any involvement in Web standards - but if there is a counter-argument then I feel this needs to be front-and-centre of the BP to explain to a potential user what they can expect, and where they are going to be left with making all the significant decisions.
>
> If we have Time and CRS UoM, then we may be able to get away with not specifying a vocabulary for other UoM for measurements. Are there any obvious dimensions that need UoM vocabularies?
>
> When I specify O&M profiles (my driving use case), I'll need to specify the UoM for measurements - is there any recommendation regarding which vocabulary to choose? And for CRS based dimensions?
>
> Rob Atkinson
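
On the question raised in this thread about using the UCUM symbol as the URI suffix: UCUM codes can contain characters such as `/`, `[`, `]`, `{`, `}` and `%`, which are not safe in a URI path segment, so some percent-encoding step would be needed. The Python sketch below shows one possible mapping. It assumes, purely for illustration, that unit URIs are formed by appending the encoded symbol to http://www.opengis.net/def/uom/UCUM/; as noted earlier in the thread, that URI currently just redirects to the UCUM spec, so these per-unit URIs are hypothetical.

```python
from urllib.parse import quote, unquote

# Base taken from the thread; per-unit URIs under it are an assumption of this sketch.
BASE = "http://www.opengis.net/def/uom/UCUM/"


def ucum_uri(symbol):
    """Build a unit URI by percent-encoding the UCUM symbol as one path segment."""
    return BASE + quote(symbol, safe="")


def ucum_symbol(uri):
    """Recover the UCUM symbol from a URI built by ucum_uri()."""
    if not uri.startswith(BASE):
        raise ValueError("not a UCUM unit URI: " + uri)
    return unquote(uri[len(BASE):])


for symbol in ["Cel", "m/s", "[degF]", "%"]:
    uri = ucum_uri(symbol)
    print(symbol, "->", uri, "->", ucum_symbol(uri))
```

Passing `safe=""` matters: by default `quote` leaves `/` unescaped, which would turn a derived unit like `m/s` into two path segments.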
Received on Thursday, 7 July 2016 21:57:10 UTC