- From: Bill Roberts <bill@swirrl.com>
- Date: Mon, 13 Mar 2017 18:39:06 +0000
- To: "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>, Dmitry Brizhinev <dmitry.brizhinev@anu.edu.au>, Sam Toyer <u5568237@anu.edu.au>, Kerry Taylor <kerry.taylor@anu.edu.au>
- Message-ID: <CAMTVsunjQjdmbJ5FY5s2PdGjDUMXCJ9SBtfAaqRcDEkEAHuqnA@mail.gmail.com>
Hi all,

I've had a detailed look through the editor's draft of EO-QB. Overall I think it's looking good, but I made a few comments as I went through. I've made a few suggested changes to the wording here and there in this pull request: https://github.com/w3c/sdw/pull/609. Nothing really significant, but I hope it might make things clearer and a little more precise in some places. There are some other comments or questions below that it might be interesting to address and perhaps discuss via the mailing list or in the next call.

Hope that's useful.

Best regards
Bill

Example 4 - declares the range of the measure property to be xsd:anyURI, but the example actually has a string as the value of that property. Maybe use <http://www.example.org/led-example-image-R000> instead?

Section 3.2 - What do you mean by: "With sufficiently advanced middleware, SPARQL queries over the dataset could be served just as if the data were stored in RDF, but for a fraction of the storage cost"? I can't see a query against pixel values working in any reasonable amount of time if the middleware has to 'unpack' each image to look inside it in order to answer the query. There is a balance of speed vs data size here, and if you optimise for data size, then you lose a lot of speed. So "The publisher can thus leverage the full power of Linked Data." seems a rash and unjustified claim here. Probably the less exciting-sounding "The publisher can thus leverage some of the power of Linked Data" :-)

"The RDF Data Cube provides only for “slices”" - it's true that the RDF Data Cube defines a mechanism for 'materialising' a slice and linking all the observations to it, so if you want all the values in a slice, there is an easy-to-evaluate SPARQL query that can get them. In practice, though, a SPARQL query can just as easily get all observations where the value of a dimension is equal to a chosen value (i.e. a 'slice'), so most people don't bother pre-defining slices.
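To show what I mean, the ad-hoc 'slice' is just an equality pattern. A sketch, with made-up eg: dimension and measure names (only the qb: terms are real):

```sparql
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX eg: <http://www.example.org/def/>

# All observations in a dataset where one dimension is fixed to a
# chosen value - an ad-hoc 'slice', with no qb:Slice resource needed.
# eg:dataset-1, eg:refArea, eg:refPeriod and eg:value are invented
# names standing in for whatever the cube actually defines.
SELECT ?obs ?time ?value
WHERE {
  ?obs a qb:Observation ;
       qb:dataSet   eg:dataset-1 ;
       eg:refArea   eg:area-canberra ;   # the fixed dimension
       eg:refPeriod ?time ;              # the free dimension
       eg:value     ?value .             # the measure
}
```

Every triple pattern there is an equality, so a store can answer it with index look-ups alone - which is why materialised slices don't buy you much.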
It just makes more triples for not much extra value - at least if you are serving data using SPARQL rather than pre-canning a lot of RDF files.

If you wanted to query for all observations with a location inside a bounding box, then your query would have to do some inequality evaluation, which is a fair bit slower than an index look-up - simple to write but slower to evaluate. You could perhaps do something like the 'tile' equivalent of a slice, by making a triple that links an observation to a rectangular area. So you might not be able to answer "all pixels within 10km of Canberra" quickly, but you could make it quick to find "all pixels in the 10km x 10km area that contains Canberra". This kind of thing sounds like a good match to the DGGS approach.

Section 4.1 - the description here of how a typical triple store works may be doing a disservice to the implementers of those databases! In general, a lot of those bindings will be evaluated by index look-ups. Most triple stores have some kind of 'explain query plan' method that shows what the database is going to do, if you want to investigate the details. I'm certainly not an expert on this, but this is quite an interesting article on how Stardog does it: https://blog.stardog.com/how-to-read-stardog-query-plans/. Other RDF databases are probably broadly similar. Add a reference for the 'virtual graphs' approach?

Section 5.3 - "The working group intends to standardize better properties which allow the use of other CRSs" - does that refer to work on updating GeoSPARQL? I'm not sure what we'll actually be able to achieve in this area.
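Coming back to the bounding-box point above, the difference between the FILTER version and the 'tile' version might look something like this - again a sketch only, with invented eg:lat, eg:long and eg:tile property names:

```sparql
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX eg: <http://www.example.org/def/>

# Variant 1: bounding box via FILTER. The inequalities mean the store
# has to test candidate bindings rather than just look them up.
SELECT ?obs ?value
WHERE {
  ?obs a qb:Observation ;
       eg:lat   ?lat ;
       eg:long  ?long ;
       eg:value ?value .
  FILTER (?lat  >= -35.35 && ?lat  <= -35.25 &&
          ?long >= 149.05 && ?long <= 149.15)
}

# Variant 2: a pre-computed 'tile' link, analogous to a materialised
# slice - back to a single equality pattern, i.e. an index look-up.
#
# SELECT ?obs ?value
# WHERE {
#   ?obs a qb:Observation ;
#        eg:tile  eg:tile-canberra-10km ;
#        eg:value ?value .
# }
```

The tile variant only answers questions that line up with the pre-chosen rectangles, which is the trade-off I mean - and why a DGGS-style hierarchy of cells seems a natural fit.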
Received on Monday, 13 March 2017 18:39:40 UTC