- From: Bill Roberts <bill@swirrl.com>
- Date: Mon, 13 Mar 2017 09:29:13 +0000
- To: Joshua Lieberman <jlieberman@tumblingwalls.com>
- Cc: Jeremy Tandy <jeremy.tandy@gmail.com>, Andrea Perego <andrea.perego@ec.europa.eu>, Linda van den Brink <l.vandenbrink@geonovum.nl>, SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CAMTVsu=1hA=25VGoJC3Dbthtn6HcD7TkS4X+6J7ERmxtgJLFrg@mail.gmail.com>
yes, not entirely sure what circumstances your colleague had in mind,
Jeremy, about the 'flat' stuff, but as Josh was saying, I think it might be
a case of 'it depends' - depends on various aspects of the shape of the data
and technology stack, so it's probably too involved a discussion to try to
summarise in a best practice.

I'm adding a comment on addresses, though I would agree with Josh that when
possible it's best for data publishers to do the geocoding and check that
it's right.

On 11 March 2017 at 15:24, Joshua Lieberman <jlieberman@tumblingwalls.com> wrote:

> “Flat” is not simple per se. If it is used to represent naturally
> hierarchical or graph-structured information it often becomes a mess. Not
> clear what this has to do with Elastic Search either, which is a full text
> search engine. Preference for “flat” usually has to do with using
> relational databases or tools supporting relational structures where
> anything other than uniform record lists becomes unwieldy. The Simple
> Features Profile has more to do with mitigating the flexibility of GML for
> client developers where that flexibility isn't really needed. It’s fine, I
> think, to point out anywhere in the BP doc that there is always a trade-off
> between choice and interoperability, but data structures should reflect
> what is being modeled. As long as some tools are available to work with
> them, e.g. graph stores, this will generally be less complicated to work
> with. It may also be an issue that more is being modeled than is needed for
> a particular application, e.g. single positions should not be the only
> locator considered, but are frequently good enough for many applications.
>
> Interesting question about addresses, because they are frequently the only
> locators available in datasets, but they are not reliable spatial
> positions. Geocoding ends up being an art that frequently sends people to
> the wrong place, and an even less exact art for addresses in unfamiliar
> systems, e.g. found on the Web. This may be another case where there is a
> practice (locating data records with addresses) that isn’t really a best
> practice. Wherever possible, data producers should do the geocoding
> themselves and get it right, rather than leaving others to rely on various
> geocoding services with uncertain positional reliability and restrictive
> terms of use.
>
> --Josh
>
> Joshua Lieberman
> Principal, Tumbling Walls
> jlieberman*tumblingwalls.com
> +1 617 431 6431
>
> On Mar 11, 2017, at 5:24 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> Hah! Perhaps just identify that simple, flat structures are easier for
> users to work with, so only add complexity where you need it ... and
> reference GML Simple Features Profile as an example of how "complexity" can
> be managed?
>
> On Sat, 11 Mar 2017 at 10:21 Bill Roberts <bill@swirrl.com> wrote:
>
>> interesting - though I think that's going to be too detailed to get into
>> in the BP - unless you want BP10 to be 20 pages long!
>>
>> On 11 March 2017 at 10:05, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill - just one more thing (again!) ...
>>
>> I was talking to a colleague of mine earlier this week about how he's
>> publishing spatial data on the Web; making use of GeoJSON, elastic-search,
>> open layers etc. All good "modern" webby stuff. One of the bits of advice
>> he gave was:
>>
>> "keep your data structures FLAT (avoid nesting/embedded objects; as per
>> OGC GML Simple Features Profile) - this makes it easier for users to work
>> with in existing tools (e.g. ElasticSearch)"
>>
>> He refers to the structures in GeoJSON [1] "properties" object (see 3.2
>> Feature Object [2]) and (I would assume) any "foreign members" [3]. This
>> makes it easier to import the GeoJSON documents into elastic search etc.
>> (I think that's what he said)
>>
>> The OGC's GML Simple Features Profile [4] defines three levels of
>> compliance: SF-0, SF-1 and SF-2 - progressively less restrictive profiles
>> from 0 to 2. Above 2 you're using everything that GML has; kitchen sink
>> and all! I wonder if these notions of profiling for interoperability
>> might be a useful inclusion in BP10? Section "2.1 Introduction" provides
>> a good starting point (but then I suppose that's the point).
>>
>> Jeremy
>>
>> [1]: https://tools.ietf.org/html/rfc7946
>> [2]: https://tools.ietf.org/html/rfc7946#section-3.2
>> [3]: https://tools.ietf.org/html/rfc7946#section-6.1
>> [4]: http://portal.opengeospatial.org/files/?artifact_id=42729
>>
>> On Sat, 11 Mar 2017 at 09:29 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Thanks Bill.
>>
>> On Sat, 11 Mar 2017 at 09:18 Bill Roberts <bill@swirrl.com> wrote:
>>
>> Hi Jeremy
>>
>> Good idea - I think it would be good to include something about addresses
>> and geocodes as a way of encoding location. I'll try to incorporate
>> something on that.
>>
>> On 11 March 2017 at 09:08, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill.
>>
>> Given that Andrea is talking about _geometries_ in BP8, we seem to have a
>> gap with regard to _other_ mechanisms to describe location; e.g. addresses
>> and geocodes (postal codes etc., geohashes [1] and, I think worth
>> mentioning explicitly, W3W [2]).
>>
>> In your discussion of “how to encode spatial data” I think it is worth
>> calling these mechanisms out specifically, and referring to Andrea’s work
>> on geometries in BP8.
>>
>> Given Andrea's involvement with the ISA Programme Location Core
>> Vocabulary [3] (which defines locn:Address), he may have some useful
>> contributions here too.
>>
>> Addresses are mentioned in the following use cases:
>>
>> - 4.5 Harvesting of Local Search Content
>> - 4.9 Enabling publication, discovery and analysis of spatiotemporal
>>   data in the humanities
>> - 4.13 Publication of air quality data aggregations
>>
>> Strangely, we don’t have any requirements that mention addresses.
>>
>> I’m also reminded of the Discrete Global Grid System (DGGS) standard
>> being prepared by OGC [4] which will … For example, HEALPix (“Hierarchical
>> Equal Area isoLatitude Pixelization”) grids, an indexing system used for
>> DGGS, are useful for EO data because each cell is uniquely identified and
>> has equal-area (at that level in the grid) so that you don’t need to
>> re-sample when comparing cell properties; the value of each cell is
>> directly comparable. DGGS and HEALPix are (were?) referenced in the EO-QB
>> work of our group.
>>
>> That said, I don’t think the DGGS is formally approved as a standard, so
>> it may only warrant a note - or no mention at all. I doubt it meets our
>> criteria for “best practice in the wild”. It also looks a little complex
>> from my quick scan of the OGC doc.
>>
>> There are also clearly a large number of other coding systems for
>> geographical and administrative areas & places. I’ll try to cover referring
>> to these types of things in BP14 concerning linking.
>>
>> Given the short amount of time available before our intended “freeze” (on
>> Wed 15-Mar) of the BP doc for next WD release, I’d be content to push these
>> changes into the work plan for the next sprint.
>>
>> Jeremy
>>
>> [1]: https://en.wikipedia.org/wiki/Geohash
>> [2]: http://what3words.com
>> [3]: https://www.w3.org/ns/locn#
>> [4]: public draft: OGC #15-104r3 https://portal.opengeospatial.org/files/66643
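
As an illustration of the "keep your data structures FLAT" advice quoted in the thread, here is a minimal Python sketch of flattening nested objects inside a GeoJSON Feature's "properties" member into dotted keys. The function names (`flatten`, `flatten_feature`), the dotted-key convention and the sample feature are all hypothetical, chosen only for illustration; nothing here comes from the colleague's actual setup, and, as noted above, whether flattening helps depends on the shape of the data and the tools in use.

```python
import json


def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`.

    Lists and scalar values are copied through unchanged.
    (Hypothetical helper, not part of any GeoJSON library.)
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested objects.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def flatten_feature(feature):
    """Return a copy of a GeoJSON Feature whose "properties" are flat.

    The geometry is left untouched; only "properties" is rewritten.
    """
    out = dict(feature)
    out["properties"] = flatten(feature.get("properties") or {})
    return out


# Made-up sample data, for illustration only.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-3.47, 50.73]},
    "properties": {
        "name": "Example station",
        "address": {"street": "1 Example Road", "postcode": "EX1 1AA"},
    },
}

flat_feature = flatten_feature(feature)
print(json.dumps(flat_feature["properties"], indent=2))
```

The nested "address" object becomes the keys "address.street" and "address.postcode", giving a uniform record that relational-style tools can ingest more easily. The cost is the one Josh points out: naturally hierarchical information loses its structure.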
Received on Monday, 13 March 2017 09:29:47 UTC