Re: Beginnings of a draft report

Hi Phil,

Thanks for looking at the spec. Some things to be aware of:
- It is not finished and still largely experimental, especially the 
sections after section 6.
- Yesterday I did a brain dump of all outstanding issues into GitHub 
issues: 
https://github.com/Reading-eScience-Centre/coverage-restapi/issues 
Please have a look (no, really!): it gives you a good overview of the 
spec's current state and lists things that you, as a non-expert, may 
have solutions to, since some of it is just webby stuff (#10, #11, 
#12, #3, ..).
- The spec describes how I would like data to be made available, but it 
is not the current reality (except for basic things like section 2), 
especially when looking at OGC W*S services, which won't disappear any 
time soon.

Now, an attempt at preliminarily answering some of your questions:

"- do the defined methods successfully and robustly address the issue of 
scale and subsetting?"

I assume that by "scale" you mean data volume.
There are many variables here when talking about subsetting. The first 
question is: along which dimensions? How do you semantically define a 
dimension? What if you have two time dimensions? (see #5) I defined my 
own custom terms for subsetting (targeted at MELODIES use cases, which 
evolve over the project):
http://coverageapi.org/ns#subsetBbox (with an OpenSearch Geo bounding 
box as value)
http://coverageapi.org/ns#subsetTimeStart (an OpenSearch Time "start" 
time as value)
...
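As a made-up example (the host, coverage name, and query parameter 
names are all hypothetical; how the terms map to actual URL parameters 
would be defined by a URL template), a subsetting request using these 
terms could look like this:

   http://example.org/coverages/sst?subsetBbox=-10,50,5,60&subsetTimeStart=2015-06-01T00:00:00Z

Here the bounding box is an OpenSearch Geo "box" (west,south,east,north 
in WGS84) and the start time is an RFC 3339 instant.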
So I wouldn't call that robust. And I already see issues here (again 
#5), because OpenSearch Geo & Time is restricted to WGS84 and Gregorian 
RFC 3339 times (4-digit years), which is not enough in some cases.
I think OpenSearch is not the best basis for subsetting. But again, for 
some cases its limitations don't apply, so it may work just fine.


"- Is the approach taken directly mappable to Linked Data techniques 
and, if so, what might be gained by doing so?"

The question is too vague for me, but I'll try to answer anyway.
If you mean giving URIs to things and linking between them, then 
probably yes, but it happens on different levels (HTTP Link headers, 
links embedded within the data via JSON-LD or other means), and often 
not on all levels simultaneously. The recommendations I make in the 
spec have the goal that no matter at which URL you start, you can 
discover the rest (subsetting, parent dataset, etc.) by some means. I 
would call that linked data.
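For illustration (the URL is made up), a subset resource could point 
back to the full coverage with a plain HTTP Link header:

   Link: <http://example.org/coverages/sst>; rel="collection"

("collection" is a registered IANA link relation; the spec may end up 
using different relation types.)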
In a stricter sense, the spec promotes JSON-LD/RDF and the draft Hydra 
vocabulary for cases where Link headers are just not powerful enough 
(mostly URL templates). If there were common terms (and definitions) 
for the URL parts used in coverage filtering and subsetting that could 
be used in URL templates, then it would be easier to develop clients 
against various coverage APIs. This is analogous to the OpenSearch Geo 
& Time effort.
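To make the Hydra idea concrete, a coverage resource could advertise a 
subsetting URL template roughly like this (just a sketch using my 
custom terms from above; the exact JSON-LD shape may differ from what 
the spec ends up recommending):

   {
     "@context": "http://www.w3.org/ns/hydra/context.jsonld",
     "@id": "http://example.org/coverages/sst",
     "search": {
       "@type": "IriTemplate",
       "template": "http://example.org/coverages/sst{?bbox,timeStart}",
       "mapping": [
         {
           "@type": "IriTemplateMapping",
           "variable": "bbox",
           "property": "http://coverageapi.org/ns#subsetBbox"
         },
         {
           "@type": "IriTemplateMapping",
           "variable": "timeStart",
           "property": "http://coverageapi.org/ns#subsetTimeStart"
         }
       ]
     }
   }

A generic client that knows the two property URIs could then fill in 
the template without hard-coding anything about this particular API.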


"- The SDW WG’s BPs talk about exposing the contents of WxS services as 
Web pages that can be indexed. Coverage Data Rest API addresses this up 
to a point by talking about human readable Web pages for the metadata - 
can we go further and talk about exposing subsets of coverages as 
crawlable (i.e. discoverable) Web pages?"

I'm not sure I understand this one. If you have a binary GeoTIFF file, 
how would you dump that into a web page? I suppose you mean linking to 
it? I need more details here.


"- Just because a coverage is serialised in JSON does that mean it is 
accessible to non-specialists? What more can/should be done?"

Absolutely not, although it may help. In my opinion the problem with 
complex and binary formats is that they scare people off when they are 
not a standard format people know, like image, video, PDF, etc. I would 
say GeoTIFF is the least scary one.
Our motivation with CoverageJSON 
(https://github.com/Reading-eScience-Centre/coveragejson/blob/master/spec.md) 
is that it should be fun to create and easy to understand. Have a look 
at http://reading-escience-centre.github.io/leaflet-coverage-demo/ and 
play around (also press the "Direct Input" button in the bottom left), 
and then tell me how it feels. I guess it still feels a bit alien, 
since there may be unfamiliar concepts like domain, range, or axes. But 
at least you can directly grab it, modify the JSON, and see what 
changes, basically learning by doing, which would be a lot more 
complicated with formats like netCDF, HDF, GeoTIFF, etc. So in my mind 
it is also a format for education, where you can teach coverage 
concepts (if the model that CoverageJSON uses fits your purposes, e.g. 
the split between domain and range; see the sketch below). The goal is 
a low entry barrier for browsers and web developers, while still being 
useful for many bigger-data cases when used wisely.
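To give a flavour of the domain/range split, here is a minimal sketch 
of a gridded coverage (trimmed down: CRS referencing and parameter 
metadata are omitted, and field names may not match the current draft 
exactly; the spec is authoritative):

   {
     "type": "Coverage",
     "domain": {
       "type": "Domain",
       "domainType": "Grid",
       "axes": {
         "x": { "values": [-10, 0, 10] },
         "y": { "values": [40, 50] },
         "t": { "values": ["2015-06-01T00:00:00Z"] }
       }
     },
     "ranges": {
       "temperature": {
         "type": "NdArray",
         "dataType": "float",
         "axisNames": ["t", "y", "x"],
         "shape": [1, 2, 3],
         "values": [12.5, 13.1, 12.9, 11.0, 11.4, 11.8]
       }
     }
   }

The domain says where the values live (a 3x2 lon/lat grid at one time 
step), and the range is just the flat array of values in the order that 
axisNames and shape dictate.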


About standardization: we are trying to make CoverageJSON the standard 
JSON output format of OGC WCS, but we're not there yet. About the REST 
API spec, I haven't thought about that yet, since it is still a very 
loose document and needs more work; maybe you can suggest a direction 
to go in. I think it would be more useful to first solve some of the 
more fundamental problems that the API spec has to depend on; I think 
the GitHub issues cover that.

That's all for now,
Maik

Am 29.01.2016 um 17:41 schrieb Phil Archer:
> Dear all,
>
> In an effort to show progress and to ensure that our meeting in 
> Beijing on 28 February is as productive as possible, I have made a 
> tentative start on the document that I hope will evolve into our final 
> report (attached*).
>
> It begins by re-stating some of the stuff from our initial meeting as 
> captured in the report [1] and then makes a quick summary of the work 
> Maik has been doing in Reading.
>
> @Maik - from an inexpert and all too quick review of your Coverage 
> Data REST API Core Specification [2] it seems to me that you have 
> already answered a lot of our key questions. I have added some 
> questions to the doc that I suggest we can look at - I'd be delighted 
> if there are ready answers to some or all of them and would be 
> surprised if there weren't more that can be added.
>
> I am aware that this work has been done in one specific context and 
> that it is crucial, of course, to see how this works in the contexts 
> in which our Chinese colleagues work.
>
> @Maik - what is your/Reading/MELODIES' plan for this doc and its 
> companions? Are you looking for it to become a formal standard? If so, 
> we seem well placed to help ;-) You define some relationship types. 
> The next iteration of the BP doc that Jeremy is co-editing will, I 
> think, include the standard spatial and temporal relationships and, 
> via that route, can be added to the Link Registry.
>
> I am painfully aware that as I write, China is shutting up shop ready 
> for the Spring Festival. Beihang is now closed until 17th February and 
> I imagine that will be the case for CAS as well.
>
> Comments, questions and additions to the doc please.
>
> Thanks
>
> Phil
>
>
> * I considered doing this on GitHub or Google Docs but, for now at 
> least, a Word doc seems most likely to be most convenient to the 
> greatest number of people. But I do hope we can move to an online 
> shared space quickly as passing round Word docs is a recipe for extra 
> workload.
>
> [1] https://www.w3.org/2015/ceo-ld/kom
>
> [2] 
> https://github.com/Reading-eScience-Centre/coverage-restapi/blob/master/spec.md#coverage-data-rest-api-core-specification
>

Received on Monday, 1 February 2016 11:37:09 UTC