Re: Inclusion of non-geometric ways to describe location (e.g. address and geocode) in BP10?

Yes, I'm not entirely sure what circumstances your colleague had in mind,
Jeremy, with the 'flat' advice, but as Josh was saying, I think it is a case
of 'it depends': it depends on the shape of the data and on the technology
stack, so it's probably too involved a discussion to try to summarise in a
best practice.

I'm adding a comment on addresses, though I would agree with Josh that, where
possible, it's best for data publishers to do the geocoding themselves and
check that it's right.
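
One way a publisher might sanity-check geocoding results before release is to
confirm that each point falls inside the area the dataset is known to cover.
The sketch below is only an illustration: the records, the bounding box and
the within_bbox helper are made-up assumptions, and a bounding-box test
catches gross errors only, not a point placed on the wrong street.

```python
# Hypothetical bounding box (min_lon, min_lat, max_lon, max_lat) for the
# area the dataset is supposed to cover; the values are made up.
EXPECTED_BBOX = (-4.0, 50.0, -3.0, 51.0)

# Hypothetical geocoded records, as a publisher might hold them.
records = [
    {"address": "1 Sample Street", "lon": -3.53, "lat": 50.72},
    {"address": "2 Sample Road", "lon": 3.53, "lat": 50.72},  # sign error
]


def within_bbox(lon, lat, bbox):
    """Return True if the point lies inside (or on the edge of) bbox."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Collect the addresses whose geocoded position looks implausible.
suspect = [
    r["address"]
    for r in records
    if not within_bbox(r["lon"], r["lat"], EXPECTED_BBOX)
]
print(suspect)
```

Records flagged this way would still need a human (or a second source) to
decide what the correct position is.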



On 11 March 2017 at 15:24, Joshua Lieberman <jlieberman@tumblingwalls.com>
wrote:

> “Flat” is not simple per se. If it is used to represent naturally
> hierarchical or graph-structured information it often becomes a mess. Not
> clear what this has to do with Elastic Search either, which is a full text
> search engine. Preference for “flat” usually has to do with using
> relational databases or tools supporting relational structures where
> anything other than uniform record lists becomes unwieldy. The Simple
> Features Profile has more to do with mitigating the flexibility of GML for
> client developers where that flexibility isn't really needed. It’s fine, I
> think, to point out anywhere in the BP doc that there is always a trade-off
> between choice and interoperability, but data structures should reflect
> what is being modeled. As long as tools are available to work with such
> structures, e.g. graph stores, they will generally be less complicated to
> work with. It may also be an issue that more is being modeled than is needed for
> a particular application, e.g. single positions should not be the only
> locator considered, but are frequently good enough for many applications.
>
> Interesting question about addresses, because they are frequently the only
> locators available in datasets, but they are not reliable spatial
> positions. Geocoding ends up being an art that frequently sends people to
> the wrong place, and an even less exact art for addresses in unfamiliar
> systems, e.g. found on the Web. This may be another case where there is a
> practice (locating data records with addresses) that isn’t really a best
> practice. Wherever possible, data producers should do the geocoding
> themselves and get it right, rather than leaving others to rely on various
> geocoding services with uncertain positional reliability and restrictive
> terms of use.
>
> —Josh
>
> Joshua Lieberman
> Principal, Tumbling Walls
> jlieberman*tumblingwalls.com
> +1 617 431 6431
>
> On Mar 11, 2017, at 5:24 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> Hah! Perhaps just identify that simple, flat structures are easier for
> users to work with, so only add complexity where you need it ... and
> reference GML Simple Features Profile as an example of how "complexity" can
> be managed?
>
> On Sat, 11 Mar 2017 at 10:21 Bill Roberts <bill@swirrl.com> wrote:
>
>> interesting - though I think that's going to be too detailed to get into
>> in the BP - unless you want BP10 to be 20 pages long!
>>
>>
>>
>> On 11 March 2017 at 10:05, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill - just one more thing (again!) ...
>>
>> I was talking to a colleague of mine earlier this week about how he's
>> publishing spatial data on the Web; making use of GeoJSON, elastic-search,
>> open layers etc. All good "modern" webby stuff. One of the bits of advice
>> he gave was:
>>
>> "keep your data structures FLAT (avoid nesting/embedded objects; as per
>> OGC GML Simple Features Profile) - this makes it easier for users to work
>> with in existing tools (e.g. ElasticSearch)"
>>
>> He refers to the structures in the GeoJSON [1] "properties" object (see 3.2
>> Feature Object [2]) and (I would assume) any "foreign members" [3]. This
>> makes it easier to import the GeoJSON documents into elastic search etc. (I
>> think that's what he said)
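
The "flat" advice quoted above can be illustrated with a small sketch that
collapses a nested GeoJSON "properties" object into a single level of keys.
The feature, its property names and the flatten helper are hypothetical, and
whether flattening actually helps depends on the tools involved.

```python
import json


def flatten(obj, prefix="", sep="_"):
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical feature with a nested "address" object in its properties.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-3.53, 50.72]},
    "properties": {
        "name": "Example site",
        "address": {"locality": "Exampletown", "postcode": "XX1 1XX"},
    },
}

# Only "properties" is flattened; the geometry keeps its GeoJSON structure.
flat_feature = {**feature, "properties": flatten(feature["properties"])}
print(json.dumps(flat_feature["properties"], indent=2))
```

The nested "address" object becomes the keys "address_locality" and
"address_postcode"; the hierarchy survives only in the key names.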
>>
>> The OGC's GML Simple Features Profile [4] defines three levels of
>> compliance: SF-0, SF-1 and SF-2, which are progressively less restrictive
>> from 0 to 2. Beyond SF-2 you're using everything that GML has; kitchen
>> sink and all! I wonder if these notions of profiling for interoperability
>> might be a useful inclusion in BP10? Section "2.1 Introduction" provides a
>> good starting point (but then I suppose that's the point).
>>
>> Jeremy
>>
>> [1]: https://tools.ietf.org/html/rfc7946
>> [2]: https://tools.ietf.org/html/rfc7946#section-3.2
>> [3]: https://tools.ietf.org/html/rfc7946#section-6.1
>> [4]: http://portal.opengeospatial.org/files/?artifact_id=42729
>>
>> On Sat, 11 Mar 2017 at 09:29 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Thanks Bill.
>>
>> On Sat, 11 Mar 2017 at 09:18 Bill Roberts <bill@swirrl.com> wrote:
>>
>> Hi Jeremy
>>
>> Good idea - I think it would be good to include something about addresses
>> and geocodes as a way of encoding location.  I'll try to incorporate
>> something on that.
>>
>>
>>
>> On 11 March 2017 at 09:08, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill.
>>
>> Given that Andrea is talking about _geometries_ in BP8, we seem to have a
>> gap with regard to _other_ mechanisms to describe location; e.g. addresses
>> and geocodes (postal codes etc., geohashes [1] and, I think worth
>> mentioning explicitly, W3W [2]).
>>
>> In your discussion of “how to encode spatial data” I think it is worth
>> calling these mechanisms out specifically, and referring to Andrea’s work
>> on geometries in BP8.
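
For readers unfamiliar with geohashes [1], the encoding can be sketched in a
few lines: the longitude and latitude ranges are halved alternately, each
halving contributes one bit, and every five bits become one base-32
character. This is a minimal illustration of the public geohash scheme, not a
reference implementation; the function name geohash_encode is hypothetical.

```python
# Geohash base-32 alphabet (digits plus lowercase letters, minus a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a WGS84 latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0          # number of bits collected for the current character
    value = 0         # the bits themselves, as an integer in 0..31
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = value * 2 + 1
            rng[0] = mid   # keep the upper half
        else:
            value = value * 2
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

A shorter geohash is a prefix of a longer one for the same point, so
truncating the string gives a coarser cell containing the original position.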
>>
>> Given Andrea's involvement with the ISA Programme Location Core
>> Vocabulary [3] (which defines locn:Address), he may have some useful
>> contributions here too.
>>
>>
>> Addresses are mentioned in the following use cases:
>>
>>    - 4.5 Harvesting of Local Search Content
>>    - 4.9 Enabling publication, discovery and analysis of spatiotemporal
>>    data in the humanities
>>    - 4.13 Publication of air quality data aggregations
>>
>>
>> Strangely, we don’t have any requirements that mention addresses.
>>
>> I’m also reminded of the Discrete Global Grid System (DGGS) standard
>> being prepared by OGC [4] which will … For example, HEALPix (“Hierarchical
>> Equal Area isoLatitude Pixelization”) grids, an indexing system used for
>> DGGS, are useful for EO data because each cell is uniquely identified and
>> has equal area (at that level in the grid) so that you don’t need to
>> re-sample when comparing cell properties; the value of each cell is
>> directly comparable. DGGS and HEALPix are (were?) referenced in the EO-QB
>> work of our group.
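
The equal-area property mentioned above follows from how HEALPix subdivides
the sphere: a grid with resolution parameter N_side has 12 * N_side^2 cells,
all of the same area. A rough sketch, assuming N_side doubles at each level
of the hierarchy (the function names are hypothetical):

```python
import math


def healpix_cell_count(level):
    """Number of cells at a given level, assuming N_side = 2**level."""
    nside = 2 ** level
    return 12 * nside * nside


def healpix_cell_area_sr(level):
    """Area of every cell at this level, in steradians (unit sphere)."""
    return 4 * math.pi / healpix_cell_count(level)


for level in range(4):
    print(level, healpix_cell_count(level), healpix_cell_area_sr(level))
```

Because every cell at a given level has the same area, values attached to
cells at that level can be compared directly, which is the point made above
about not needing to re-sample.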
>>
>> That said, I don’t think the DGGS is formally approved as a standard, so
>> it may only warrant a note - or no mention at all. I doubt it meets our
>> criteria for “best practice in the wild”. It also looks a little complex
>> from my quick scan of the OGC doc.
>>
>> There are also clearly a large number of other coding systems for
>> geographical and administrative areas & places. I’ll try to cover referring
>> to these types of things in BP14 concerning linking.
>>
>> Given the short amount of time available before our intended “freeze” (on
>> Wed 15-Mar) of the BP doc for next WD release, I’d be content to push these
>> changes into the work plan for the next sprint.
>>
>> Jeremy
>>
>>
>> [1]: https://en.wikipedia.org/wiki/Geohash
>> [2]: http://what3words.com
>> [3]: https://www.w3.org/ns/locn#
>>
>> [4]: public draft: OGC #15-104r3 https://portal.opengeospatial.org/files/66643
>>
>

Received on Monday, 13 March 2017 09:29:47 UTC