Re: use case: NetCDF data

Jeremy, sounds great.

Eric

On Wed, Feb 26, 2014 at 2:54 AM, Tandy, Jeremy
<jeremy.tandy@metoffice.gov.uk> wrote:
> OK - now you're talking :-)
>
> At the Met Office (& many of the organisations we collaborate with), NetCDF is in common usage.
>
> I had thought that you wanted to get into working directly with the binary ... in which case I was going to say NetCDF3, NetCDF4-Classic or NetCDF-Extended :-) ... I think I know too much (at least in this corner of the data-sphere).
>
> For what it's worth, our files get pretty large (I suspect yours do too!), so I don't see much use of ncdump to generate a text format. Instead, we encourage people to work directly with the binary format using appropriate tools like Iris <http://scitools.org.uk/iris/> that layer on top of the netCDF libraries to work with data conforming to the CF (Climate and Forecast) metadata conventions <http://cf-pcmdi.llnl.gov/>.
>
> On the mailing list <http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0157.html>, danbri talked about "interesting and useful work that can be done for _all_ tables, at a broad brush level of granularity" without getting stuck into the innards of the file. Just having access to this summary and/or structural metadata would help discovery.
>
> So I think the use case could follow these lines:
>
> - big scientific tabular dataset published in netCDF
> - publish summary and structural metadata about the "tabular data" for LOD discovery purposes
> - download manageable subsets of the tabular dataset in tabular-text format - perhaps paginated?
>
> This is not dissimilar to use case #7 <https://www.w3.org/2013/csvw/wiki/Use_Cases#A_local_archive_of_metadata_for_a_collection_of_journal_articles> in which a tabular result set is described but presented in manageable "pages" to the user one CSV file at a time.
>
> As for #7, we need to be wary of getting into PROTOCOL DESIGN for accessing subsets of data ... or at least do so with our eyes open!
>
> What do you think, Eric ... could you turn this into a narrative-style use case?
>
> Jeremy
>
> -----Original Message-----
> From: Eric Stephan [mailto:ericphb@gmail.com]
> Sent: 26 February 2014 10:25
> To: Stasinos Konstantopoulos
> Cc: Ivan Herman; Tandy, Jeremy; W3C CSV on the Web Working Group
> Subject: Re: use case: NetCDF data
>
> Sorry, still trying to keep the gates of hell open a bit. :-) I think
> I'm on board.
>
> Instead of "tabular text files", I'd prefer to view the constraint as "tabular text data"
>
> From a LOD perspective a given NetCDF resource could be accessible:
>     * in its native format
>     * expressed in its text form through the ncdump utility.
>
> The native NetCDF form provides the scientific community with self-describing data.
> The text form could be used within the CSVW context as a means for Star 4 and Star 5 discovery.
>
> There are lots of scientific binary formats out there that represent n-dimensional data blocks, but each usually provides a means to dump the data in a textual form or to express it in an alternative format that can be loaded into a spreadsheet for analysis.
>
> It's somewhat similar in concept to thinking of a relational database table or triple store dumped in a tabular form.
>
> Sound okay?
>
> Eric
>
>
>
> On Wed, Feb 26, 2014 at 1:05 AM, Stasinos Konstantopoulos <konstant@iit.demokritos.gr> wrote:
>> On 26 February 2014 10:40, Ivan Herman <ivan@w3.org> wrote:
>>>
>>> On 25 Feb 2014, at 23:18 , Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:
>>>
>>>> Hi Stasinos ... thanks for the use cases you've provided so far.
>>>>
>>>> Looking at your NetCDF data use case, I wonder if non-textual tabular data is in scope. The discussion on the "Scoping Question" thread in the mailing list seemed to suggest that we would focus on textual tabular data.
>>>>
>>>> Before progressing, I wanted to get your thoughts and gather input from the other WG participants.
>>>
>>> Well... I did not know NetCDF before, so I peeked around a bit. I may have missed some details, but the impression is that this is, primarily, a set of utilities in various programming languages to handle tabular data that is in some internal format. They do have some ways of dumping data in terms of text:
>>>
>>> http://www.narccap.ucar.edu/data/ascii-howto.html
>>>
>>> and, as far as I could see from some of the examples there, the output is 'simply' CSV (well, probably TSV or 'SSV', i.e., 'space separated values').
>>>
>>> I would support Jeremy's formulation, that we focus on 'textual tabular data'.
>>
>> That's alright, and it will keep hell's gates slightly less widely
>> open. We still have Eric's examples of metadata headers.
>>
>> NetCDF also foresees a single metadata description covering multiple
>> data files, but I believe this to be a more general concern as there
>> are many instances of homogeneous CSV data files that can better be
>> described in one shot.
>>
>> Best,
>> s
>>

Received on Wednesday, 26 February 2014 13:00:34 UTC