- From: Little, Chris <chris.little@metoffice.gov.uk>
- Date: Thu, 23 Nov 2017 14:09:57 +0000
- To: "public-sdwig@w3.org" <public-sdwig@w3.org>, Bill Roberts <bill@swirrl.com>
Dear Bill and Stats BP colleagues,

1. Many container data formats, and even service APIs and protocols, have controlled lists/taxonomies of parameters/observations/variables/measurements. These values of interest may be scalar, vector or even tensor valued, e.g. surface atmospheric pressure, sub-surface ocean current velocity or wind stress (used to forecast ocean waves), respectively. In Meteorology and Oceanography, these lists have been maintained globally, in multiple languages, for decades.

Three major container formats that use these kinds of lists are:

NetCDF - a generic format with a large ecosystem of tools and applications, and several conventions for metadata, such as CF http://cfconventions.org/Data/cf-standard-names/46/build/cf-standard-name-table.html and COARDS;

GRIB - a similar, more compact operational format for multidimensional gridded data, with tightly controlled lists/tables managed by WMO, see http://www.wmo.int/pages/prog/www/WMOCodes/WMO306_vI2/LatestVERSION/WMO306_vI2_GRIB2_CodeFlag_en.pdf Code Table 4.2;

BUFR - another WMO operational format, suitable for point, line and polygon-like features, with thousands of entries in its controlled lists, see http://www.wmo.int/pages/prog/www/WMOCodes/WMO306_vI2/LatestVERSION/WMO306_vI2_BUFRCREX_TableB_en.pdf .

To keep these lists manageable, and to avoid combinatorial explosions of possibilities, attributes or qualifiers have been constructed so that various derived statistics of the parameters, such as mean, median, standard deviation, variance, etc., can be indicated in the metadata without creating new entries.

These schemes are incomplete: second and higher order statistics, such as quartiles, quintiles, deciles and even percentiles of a parameter distribution, are routinely used, but there is no standard scheme for creating and applying these qualifiers.
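To make the qualifier idea concrete, here is a minimal Python sketch of one possible shape for such a scheme. It is an illustration only: the class names `StatQualifier` and `QualifiedParameter`, the helper `quantile`, and the label syntax are all hypothetical and are not taken from CF, GRIB, BUFR or any other existing convention. The point it tries to show is that a single base entry from a controlled list (here `wind_speed`) can be combined with a generic qualifier, so quartiles, deciles and percentiles need no new entries of their own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatQualifier:
    """Hypothetical statistical qualifier (not part of any existing standard)."""
    kind: str                    # e.g. "mean", "median", "quantile"
    order: Optional[int] = None  # number of equal parts, e.g. 4 for quartiles
    index: Optional[int] = None  # which cut point, from 1 to order - 1

    def label(self) -> str:
        if self.kind == "quantile":
            return f"quantile_{self.index}_of_{self.order}"
        return self.kind


@dataclass(frozen=True)
class QualifiedParameter:
    """A base parameter from a controlled list plus an optional qualifier."""
    base: str
    qualifier: Optional[StatQualifier] = None

    def label(self) -> str:
        if self.qualifier is None:
            return self.base
        return f"{self.base}[{self.qualifier.label()}]"


def quantile(order: int, index: int) -> StatQualifier:
    """Build a quantile qualifier, rejecting cut points that cannot exist."""
    if order < 2 or not 1 <= index < order:
        raise ValueError(f"no cut point {index} when dividing into {order} parts")
    return StatQualifier("quantile", order, index)


# One base entry, several qualified forms, no extra list entries.
instantaneous = QualifiedParameter("wind_speed")
mean_speed = QualifiedParameter("wind_speed", StatQualifier("mean"))
upper_quartile = QualifiedParameter("wind_speed", quantile(4, 3))
percentile_90 = QualifiedParameter("wind_speed", quantile(100, 90))

for p in (instantaneous, mean_speed, upper_quartile, percentile_90):
    print(p.label())
```

Expressing every quantile as a pair (number of parts, cut point) is one design choice among several; it lets the same qualifier cover quartiles, quintiles, deciles and percentiles, and makes equivalent forms such as the median and the second quartile easy to detect.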
It is best practice in meteorology and oceanography to forecast a range of values, known as an ensemble, for a parameter of interest, and then extract various statistics and threshold values. The ensembles typically have 50-100 members.

The various schemes and the controlled lists are also inconsistent. For example, one strategic policy has been to generate extra entries for commonly used statistics of parameters, so the registries may contain both (instantaneous) wind speed and mean wind speed.

The use case, or more precisely, a requirement, is to have a standard statistical scheme that allows the consistent and rigorous generation of a variety of statistical qualifiers, to create useful and machine-readable metadata to qualify lists of parameters in a variety of domains.

Chris

Chris Little
Chair, OGC Meteorology & Oceanography Domain Working Group
Member OGC Architecture Board
IT Fellow - Operational Infrastructures
Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom
Tel: +44(0)1392 886278 Fax: +44(0)1392 885681 Mobile: +44(0)7753 880514
E-mail: chris.little@metoffice.gov.uk http://www.metoffice.gov.uk
I am normally at work Tuesday, Wednesday and Thursday each week
Received on Thursday, 23 November 2017 14:10:36 UTC