Re: Absence of key scientific spatial data formats within common formats to implementation of Best Practices [SEC=UNCLASSIFIED]

Hi Lewis

Thanks for the very useful input on scientific data formats.  I don't
remember exactly the background to the list of spatial formats that was the
starting point for this thread, but although those formats like GML,
GeoJSON etc can encode 'data', I (and perhaps others?) tend to think of
those as formats for 'geometry' whereas I think of NetCDF as a format for
'data'.

Anyway, I don't want to get into that, I just want to note that I'm one of
the editors of the Coverages sub-group and the points you have raised in
recent emails are very relevant for that.  The main issues we are
discussing in that group relate to 'web-friendly' formats for coverage data
(whatever web-friendly turns out to mean in that context!) and approaches
to identifying and retrieving extracts (aka subsets) of Coverage data.

For both of those, it would be great to get your input and to make sure we
take due consideration of use cases that are important to JPL, NASA etc.

Cheers

Bill



On 30 March 2016 at 07:38, lewis john mcgibbney <lewismc@apache.org> wrote:

> Hi Bruce,
> Replies inline
>
> On Mon, Mar 28, 2016 at 7:21 PM, Bruce Bannerman <B.Bannerman@bom.gov.au>
> wrote:
>
>>
>>
>> Hi Lewis,
>>
>> More inline below.
>>
>> Bruce
>>
>>
>> Regarding describing our datasets:
>>
>>    - We don’t do this as well as we could. We will begin addressing this
>>    in the near future. Much of our current work is either internally focussed,
>>    or at a much too granular level.
>>
>>
> Understood. This also seems to be quite a common observation from the
> datasets (and the platforms which make this data available) I work with so
> I acknowledge the point.
>
>>
>>    - We intend describing our data sets using ISO 19115 with support for
>>    several profiles, including WMO and ANZLIC.
>>
> I believe the OCO2 products (and many others) I've worked with also came
> with ISO 19115 metadata within the data product. These products were HDF5.
> I am familiar with the ISO standard(s) as well.
> http://oco.jpl.nasa.gov/science/ProductInfo/#
>
>
>>
>>    - I can’t see us moving away from this paradigm, but there is
>>    certainly potential for LinkedData approaches as alternate methods of
>>    discovering our data.
>>
> Agreed. This is where my work in this group is (again) justified.
>
>
>> But this is also only part of the issue:
>>
>>    - We also need a mechanism to better understand the context of our
>>    observations (e.g. What sensor; what model; when was it last calibrated;
>>    maintained; what sensor maintenance process and responsible party; what
>>    observation process etc). We will be using the new WMO WIGOS Observations
>>    Metadata standard to support this concept.
>>
> Interesting. Have you looked at encoding this into any linked data
> approach as of yet?
>
>
>>
>>    - As discussed before on this list (and in the SDWWG Climate data
>>    related use case), there is also the issue of data provenance.
>>
> Yes there sure is. We make heavy use of the PROV-ES specification here
> @JPL for such requirements.
>
>>
>>    - Data Quality and IP issues will also become a big issue,
>>    particularly with the increasing use of mixed Bureau and 3rd party
>>    observations and the subsequent derived products that we create from these
>>    observations.
>>
> Derived and value added products are certainly in high demand for new(er)
> data products which are made available, however without an expansion on
> this topic I see this more as a process issue rather than one relating to
> the spatial data format itself. This is absolutely OK though. If you feel
> like expanding then I am all ears. I see that such topics feature heavily
> within the CDMS spec you posted below. These are relevant topics indeed
> however I'll state that I am not sure they feature on the current agenda
> for this WG.
>
>
>>
>>
>>
>>> Further, I expect that we'll need to go further and work with our peers
>>> to agree on semantic definitions of the content that we portray for each
>>> relevant domain and its inter-relationships with other domains.
>>>
>>
>> This sounds like the next step... the issues we're discussing above seem
>> like the precursor. Am I correct?
>>
>>
>> Not necessarily, consider the work that has been undertaken on GeoSciML,
>> WaterML etc.
>>
>> A lot of this is based on communities of a common interest getting
>> together and agreeing on and using common terms and concepts.
>>
>> It takes many, many years of community building to reach the required
>> consensus.
>>
>
> OK, I was just trying to bring it back to what we can achieve within the
> maneuverability and scope of this WG.
>
>
>>
>>
>> Do you have any examples from the field of Meteorology? I would be
>> interested to see if I could pick out any examples more familiar to other
>> aspects of Earth Science, Physical Oceanography or something else a bit
>> closer to 'home' for my current working agenda.
>>
>>
>>
>> The closest that I can point to at the moment is the work that we have
>> been doing in WMO on WMO #1131, Climate Data Management System
>> Specifications:
>> http://library.wmo.int/opac/index.php?lvl=notice_display&id=16300
>>
>
> Wow this is a meaty, very substantial document. It will take me a while to
> read as it's the first time I've seen it. I undertook a preliminary search
> for 'data access' and 'access' and it returned a few results so I will
> scope them out and see what interesting content I can muse over.
>
>
>>
>> There is also related work, e.g.:
>>
>>    - Foundation data governance and data modelling work within WMO that
>>    Jeremy Tandy is leading
>>    - Foundation work that has been undertaken by Australia’s CSIRO over
>>    many years: https://www.seegrid.csiro.au/wiki/Siss/WebHome
>>    - And to be honest, much of the underpinning OGC standards efforts
>>    that we build on top of.
>>
>>
>> This is really laying the groundwork, and it will take many years to get
>> there with truly federated data and data services.
>>
>
> So what are your thoughts then about how this all fits in with one or more
> of the aims of this WG? When worded like it has been above, this
> scientific data angle (which you, I and a few others are coming from)
> seems to be somewhat different from the other working group members. It is
> certainly a different conversation we are having here from what I have seen
> or heard going on elsewhere in this WG. I've also checked the WG mailing
> list archives and there is very little conversation at all about scientific
> data formats within the overall context of this WG.
>
> Thanks. I am glad to see that this thread is now picking up some traction.
> Lewis
>

Received on Wednesday, 30 March 2016 08:42:53 UTC