- From: Ed Parsons <eparsons@google.com>
- Date: Tue, 29 Mar 2016 08:55:36 +0000
- To: Linda van den Brink <l.vandenbrink@geonovum.nl>, Bruce Bannerman <B.Bannerman@bom.gov.au>, lewis john mcgibbney <lewismc@apache.org>
- Cc: SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CAHrFjcn-5an8H-cdfi2Z7gALNTwoFYU=g2E+jQghDxQQ+4+nmA@mail.gmail.com>
Hello all,

This is worth discussing in the context of the intended audience for the
BP document. My personal view is that the formats you list are of interest
to a predominantly small (in relative terms) scientific audience?

Ed

On Tue, 29 Mar 2016, 09:11 Linda van den Brink, <l.vandenbrink@geonovum.nl>
wrote:

> Hi Bruce, Lewis,
>
> The scope and purpose of the common formats list in the BP hasn’t been
> discussed exhaustively. What’s currently in the BP is a first draft, or
> rather two: one list by Ed Parsons and one by Clemens Portele. Including
> scientific spatial data formats in these lists hasn’t come up yet. It
> could be argued that scientific formats weren’t considered ‘common’
> formats, but as I said it hasn’t come up yet. Both lists only list vector
> data formats.
>
> I have created an issue about this so that we don’t forget to address
> this.
>
> https://github.com/w3c/sdw/issues/237
>
> Linda
>
> *From:* Bruce Bannerman [mailto:B.Bannerman@bom.gov.au]
> *Sent:* Tuesday 29 March 2016 04:21
> *To:* lewis john mcgibbney
> *Cc:* SDW WG Public List
> *Subject:* Re: Absence of key scientific spatial data formats within
> common formats to implementation of Best Practices [SEC=UNCLASSIFIED]
>
> Hi Lewis,
>
> More inline below.
>
> Bruce
>
> *From:* lewis john mcgibbney <lewismc@apache.org>
> *Date:* Saturday, 26 March 2016 at 08:24
> *To:* Bruce Bannerman <B.Bannerman@bom.gov.au>
> *Cc:* SDW WG Public List <public-sdw-wg@w3.org>
> *Subject:* Re: Absence of key scientific spatial data formats within
> common formats to implementation of Best Practices [SEC=UNCLASSIFIED]
>
> Hi Bruce,
>
> Thanks for your response. I'll make some comments below.
>
> On Wed, Mar 23, 2016 at 1:26 PM, Bruce Bannerman <B.Bannerman@bom.gov.au>
> wrote:
>
> Hi Lewis,
>
> I have still to find the time to review the latest document and provide
> comment.
>
> OK.
> I would like to see your comments when you do, and I suppose I will
> have some replies.
>
> However regarding this issue, we have no intention of moving away from the
> scientific data formats that we use within our large data holdings.
>
> Same here. This is where I see some value in addressing the following area
> in order to still drive value from the spatial data encoded within the
> dataset(s). Many (virtually all) of our dataset landing pages, believe it
> or not, still do not have any kind of semantic markup and hence are
> relatively undiscovered outside of the NASA Distributed Active Archive
> Center (DAAC) portals. An example is the landing page at [0], which
> describes the SeaWinds on QuikSCAT Enhanced Resolution Regionally Gridded
> Sigma-0 (BYU, D. Long) dataset. When I extract the implicit semantic
> markup from within this page (using Apache Any23 [1]) I get very few
> meaningful relationships which I can utilize programmatically. The
> extracted result in JSON [2] shows you that.
>
> I do however also think that moving towards a hypermedia-based mechanism
> for describing the data granules behind these dataset landing pages is
> useful. I found Linda's recent post on the Dutch crawling task very
> interesting in this regard.
>
> What are your thoughts here? Do you describe your datasets in any
> meaningful way? I think that there is a HUGE amount of work to be done
> here to improve programmatic interpretation of the underlying scientific
> data.
>
> [0]
> http://podaac.jpl.nasa.gov/dataset/QUIKSCAT_BYU_L3_OW_SIGMA0_ENHANCED?ids=Measurement&values=Sea%20Ice
> [1] http://any23.apache.org
> [2] https://paste.apache.org/lphl
>
> Regarding describing our datasets:
>
>    - We don’t do this as well as we could. We will begin addressing this
>    in the near future. Much of our current work is either internally
>    focussed, or at a much too granular level.
>    - We intend to describe our data sets using ISO 19115 with support for
>    several profiles, including WMO and ANZLIC.
>    - I can’t see us moving away from this paradigm, but there is
>    certainly potential for LinkedData approaches as alternate methods of
>    discovering our data.
>
> But this is also only part of the issue:
>
>    - We also need a mechanism to better understand the context of our
>    observations (e.g. what sensor; what model; when it was last calibrated
>    and maintained; what sensor maintenance process and responsible party;
>    what observation process, etc.). We will be using the new WMO WIGOS
>    Observations Metadata standard to support this concept.
>    - As discussed before on this list (and in the SDWWG Climate data
>    related use case), there is also the issue of data provenance.
>    - Data quality and IP will also become big issues, particularly with
>    the increasing use of mixed Bureau and 3rd party observations and the
>    subsequent derived products that we create from these observations.
>
> If anything, I expect that we will need to work with our peers to define
> formal data format definitions that are consistent with modern spatial
> requirements, e.g. full support for Spatial Reference Systems and other
> CRS definitions, and that don't constrain our ability to adequately
> portray the complexity of our data.
>
> I agree here.
>
> I expect that we'll probably need to do this via OGC processes. We want to
> ensure that the data that we collect and archive now will still be
> accessible for our key stakeholders who have not yet been born.
>
> This view is consistent across the entire NASA data archival spectrum as
> well... and as long term data stewards this is a logical viewpoint.
>
> Further, I expect that we'll need to go further and work with our peers to
> agree on semantic definitions of the content that we portray for each
> relevant domain and its inter-relationships with other domains.
>
> This sounds like the next step... the issues we're discussing above seem
> like the precursor. Am I correct?
>
> Not necessarily; consider the work that has been undertaken on GeoSciML,
> WaterML etc.
>
> A lot of this is based on communities of a common interest getting
> together and agreeing on and using common terms and concepts.
>
> It takes many, many years of community building to reach the required
> consensus.
>
> This is similar in concept to what the hydrology community have done with
> WaterML 2, but I expect that we'll need to take it further, particularly
> the inter-domain relationships.
>
> Yes, I really like the ongoing work on hydrology with WaterML2 and this is
> an excellent point. It is however again, in my own opinion, something
> which follows on from the above.
>
> When we are trying to understand global systems and their interaction with
> other systems, and we are doing this with our peers in distributed data
> collections and services, the need for formal data definitions becomes
> critical. This is especially so if we want global, federated, data sets
> ***and dynamic services*** describing specific phenomena.
>
> Do you have any examples from the field of Meteorology? I would be
> interested to see if I could pick out any examples more familiar to other
> aspects of Earth Science, Physical Oceanography or something else a bit
> closer to 'home' for my current working agenda.
>
> The closest that I can point to at the moment is the work that we have
> been doing in WMO on WMO #1131, Climate Data Management System
> Specifications:
> http://library.wmo.int/opac/index.php?lvl=notice_display&id=16300
>
> There is also related work, e.g.:
>
>    - Foundation data governance and data modelling work within WMO that
>    Jeremy Tandy is leading
>    - Foundation work that has been undertaken by Australia’s CSIRO over
>    many years: https://www.seegrid.csiro.au/wiki/Siss/WebHome
>    - And to be honest, much of the underpinning OGC standards efforts
>    that we build on top of.
>
> This is really laying the groundwork, and it will take many years to get
> there with truly federated data and data services.
>
> It will allow us to spend much less wasted time in getting data prepared
> for global analysis and much more time on the actual analysis and
> understanding the implications of the results.
>
> Agreed!
>
> Thank you for the very meaningful conversation. Looking forward to any
> follow up if you have it.
>
> In the meantime, I come back to my main question. Is there any reason from
> across the group why the current matrix of spatial data formats doesn't
> include formats such as GRIB, HDF4, HDF5, netCDF3, netCDF4, etc? These are
> used pervasively throughout the sciences and I am very surprised to see
> them absent.
>
> Thanks folks.
>
> Lewis

--
*Ed Parsons* FRGS
Geospatial Technologist, Google

Google Voice +44 (0)20 7881 4501
www.edparsons.com @edparsons
Received on Tuesday, 29 March 2016 08:56:14 UTC