Re: Inclusion of non-geometric ways to describe location (e.g. address and geocode) in BP10?

From: Bill Roberts <bill@swirrl.com>
Date: Mon, 13 Mar 2017 09:29:13 +0000
Message-ID: <CAMTVsu=1hA=25VGoJC3Dbthtn6HcD7TkS4X+6J7ERmxtgJLFrg@mail.gmail.com>
To: Joshua Lieberman <jlieberman@tumblingwalls.com>
Cc: Jeremy Tandy <jeremy.tandy@gmail.com>, Andrea Perego <andrea.perego@ec.europa.eu>, Linda van den Brink <l.vandenbrink@geonovum.nl>, SDW WG Public List <public-sdw-wg@w3.org>
Yes, I'm not entirely sure what circumstances your colleague had in mind,
Jeremy, with the 'flat' advice, but as Josh was saying, I think it is a case
of 'it depends': it depends on the shape of the data and on the technology
stack, so it's probably too involved a discussion to try to summarise in a
best practice.
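
To make the 'flat' advice quoted further down the thread a bit more
concrete, here is a minimal Python sketch of what flattening the
"properties" object of a GeoJSON Feature could look like. It is an
illustration only, not a description of what Jeremy's colleague actually
does: the function names (flatten_properties, flatten_feature), the
underscore separator and the sample feature with its address values are all
made up for the example. Arrays are deliberately left untouched, which is
one of the places where 'it depends' starts to bite.

```python
import json


def flatten_properties(properties, parent_key="", sep="_"):
    """Collapse nested dicts into one level, joining keys with `sep`.

    Hypothetical helper for illustration; lists and scalar values are
    copied through unchanged.
    """
    flat = {}
    for key, value in properties.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into embedded objects and merge their flattened keys.
            flat.update(flatten_properties(value, name, sep))
        else:
            flat[name] = value
    return flat


def flatten_feature(feature):
    """Return a shallow copy of a GeoJSON Feature with flat properties."""
    flat = dict(feature)
    flat["properties"] = flatten_properties(feature.get("properties") or {})
    return flat


# Made-up sample feature; the names and address values are placeholders.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-3.47, 50.73]},
    "properties": {
        "name": "Example site",
        "address": {"street": "1 Example Road", "postcode": "EX1 1AA"},
    },
}

flat_feature = flatten_feature(feature)
print(json.dumps(flat_feature["properties"], indent=2))
```

The geometry is passed through as-is; only the nested "address" object is
collapsed into "address_street" and "address_postcode". Whether that is an
improvement depends on the tools at the other end, which is rather the
point of the discussion.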

I'm adding a comment on addresses, though I would agree with Josh that,
where possible, it's best for data publishers to do the geocoding themselves
and check that it's right.



On 11 March 2017 at 15:24, Joshua Lieberman <jlieberman@tumblingwalls.com>
wrote:

> “Flat” is not simple per se. If it is used to represent naturally
> hierarchical or graph-structured information it often becomes a mess. Not
> clear what this has to do with Elastic Search either, which is a full text
> search engine. Preference for “flat” usually has to do with using
> relational databases or tools supporting relational structures where
> anything other than uniform record lists becomes unwieldy. The Simple
> Features Profile has more to do with mitigating the flexibility of GML for
> client developers where that flexibility isn't really needed. It’s fine, I
> think, to point out anywhere in the BP doc that there is always a trade-off
> between choice and interoperability, but data structures should reflect
> what is being modeled. As long as some tools are available to work with
> them, e.g. graph stores, this will generally be less complicated to work
> with. It may also be an issue that more is being modeled than is needed for
> a particular application, e.g. single positions should not be the only
> locator considered, but are frequently good enough for many applications.
>
> Interesting question about addresses, because they are frequently the only
> locators available in datasets, but they are not reliable spatial
> positions. Geocoding ends up being an art that frequently sends people to
> the wrong place, and an even less exact art for addresses in unfamiliar
> systems, e.g. found on the Web. This may be another case where there is a
> practice (locating data records with addresses) that isn’t really a best
> practice. Wherever possible, data producers should do the geocoding
> themselves and get it right, rather than leaving others to rely on various
> geocoding services with uncertain positional reliability and restrictive
> terms of use.
>
> —Josh
>
> Joshua Lieberman
> Principal, Tumbling Walls
> jlieberman*tumblingwalls.com
> +1 617 431 6431
>
> On Mar 11, 2017, at 5:24 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> Hah! Perhaps just identify that simple, flat structures are easier for
> users to work with, so only add complexity where you need it ... and
> reference GML Simple Features Profile as an example of how "complexity" can
> be managed?
>
> On Sat, 11 Mar 2017 at 10:21 Bill Roberts <bill@swirrl.com> wrote:
>
>> interesting - though I think that's going to be too detailed to get into
>> in the BP - unless you want BP10 to be 20 pages long!
>>
>>
>>
>> On 11 March 2017 at 10:05, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill - just one more thing (again!) ...
>>
>> I was talking to a colleague of mine earlier this week about how he's
>> publishing spatial data on the Web; making use of GeoJSON, elastic-search,
>> open layers etc. All good "modern" webby stuff. One of the bits of advice
>> he gave was:
>>
>> "keep your data structures FLAT (avoid nesting/embedded objects; as per
>> OGC GML Simple Features Profile) - this makes it easier for users to work
>> with in existing tools (e.g. ElasticSearch)"
>>
>> He refers to the structures in GeoJSON [1] "properties" object (see 3.2
>> Feature Object [2]) and (I would assume) any "foreign members" [3]. This
>> makes it easier to import the GeoJSON documents into elastic search etc. (I
>> think that's what he said)
>>
>> The OGC's GML Simple Features Profile [4] defines three levels of
>> compliance: SF-0, SF-1 and SF-2, which are progressively less
>> restrictive profiles from 0 to 2. Above 2 you're using everything that GML
>> has; kitchen sink and all! I wonder if these notions of profiling for
>> interoperability might be a useful inclusion in BP10? Section "2.1
>> Introduction" provides a good starting point (but then I suppose that's the
>> point).
>>
>> Jeremy
>>
>> [1]: https://tools.ietf.org/html/rfc7946
>> [2]: https://tools.ietf.org/html/rfc7946#section-3.2
>> [3]: https://tools.ietf.org/html/rfc7946#section-6.1
>> [4]: http://portal.opengeospatial.org/files/?artifact_id=42729
>>
>> On Sat, 11 Mar 2017 at 09:29 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Thanks Bill.
>>
>> On Sat, 11 Mar 2017 at 09:18 Bill Roberts <bill@swirrl.com> wrote:
>>
>> Hi Jeremy
>>
>> Good idea - I think it would be good to include something about addresses
>> and geocodes as a way of encoding location.  I'll try to incorporate
>> something on that.
>>
>>
>>
>> On 11 March 2017 at 09:08, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Hi Bill.
>>
>> Given that Andrea is talking about _geometries_ in BP8, we seem to have a
>> gap with regard to _other_ mechanisms to describe location; e.g. addresses
>> and geocodes (postal codes etc., geohashes [1] and, I think worth
>> mentioning explicitly, W3W [2]).
>>
>> In your discussion of “how to encode spatial data” I think it is worth
>> calling these mechanisms out specifically, and referring to Andrea’s work
>> on geometries in BP8.
>>
>> Given Andrea's involvement with the ISA Programme Location Core
>> Vocabulary [3] (which defines locn:Address), he may have some useful
>> contributions here too.
>>
>>
>> Addresses are mentioned in the following use cases:
>>
>>    - 4.5 Harvesting of Local Search Content
>>    - 4.9 Enabling publication, discovery and analysis of spatiotemporal
>>    data in the humanities
>>    - 4.13 Publication of air quality data aggregations
>>
>>
>> Strangely, we don’t have any requirements that mention addresses.
>>
>> I’m also reminded of the Discrete Global Grid System (DGGS) standard
>> being prepared by OGC [4] which will … For example, HEALPix (“Hierarchical
>> Equal Area isoLatitude Pixelization”) grids, an indexing system used for
>> DGGS, are useful for EO data because each cell is uniquely identified and
>> has equal area (at that level in the grid) so that you don’t need to
>> re-sample when comparing cell properties; the value of each cell is
>> directly comparable. DGGS and HEALPix are (were?) referenced in the EO-QB
>> work of our group.
>>
>> That said, I don’t think the DGGS is formally approved as a standard, so
>> it may only warrant a note - or no mention at all. I doubt it meets our
>> criteria for “best practice in the wild”. It also looks a little complex
>> from my quick scan of the OGC doc.
>>
>> There are also clearly a large number of other coding systems for
>> geographical and administrative areas & places. I’ll try to cover referring
>> to these types of things in BP14 concerning linking.
>>
>> Given the short amount of time available before our intended “freeze” (on
>> Wed 15-Mar) of the BP doc for next WD release, I’d be content to push these
>> changes into the work plan for the next sprint.
>>
>> Jeremy
>>
>>
>> [1]: https://en.wikipedia.org/wiki/Geohash
>> [2]: http://what3words.com
>> [3]: https://www.w3.org/ns/locn#
>>
>> [4]: public draft: OGC #15-104r3 https://portal.opengeospatial.org/files/66643
Received on Monday, 13 March 2017 09:29:47 UTC
