Re: Decision required: BP 17 "How to work with crowd sourced observations"

From: Jeremy Tandy <jeremy.tandy@gmail.com>
Date: Fri, 02 Sep 2016 11:24:59 +0000
Message-ID: <CADtUq_3+c65+CiEdvd6cs5Tzy3berc04u=SG7HM7WZYqbsB8UA@mail.gmail.com>
To: Joshua Lieberman <jlieberman@tumblingwalls.com>
Cc: SDW WG Public List <public-sdw-wg@w3.org>

So I think you're saying that the aspect which is 'special' for crowd-sourced
spatial data is data quality? ... and that the quality can be characterised
through "observation" metadata (ref. "humans as sensors").

Looking at OGC #15-057r2 "Testbed-11 Incorporating Social Media in
Emergency Response Engineering Report", it seems that the pertinent
metadata being captured for each social media entry (?) is username (and by
inference, the human originator) and time.

[ I guess that for "time" it's possible in some cases to capture both
phenomenon time and result time; e.g. a digital photo may include the
datestamp when the photo was taken within the EXIF metadata which would be
the phenomenon time, while result time would be when it was uploaded to the
social media platform ]
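
As a rough illustration of the two times mentioned above, a minimal sketch follows. The class name, field names and sample values are hypothetical and are not taken from OGC #15-057r2 or ISO 19156; they only show how username, phenomenon time and result time might sit together on one "human as sensor" record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CrowdObservation:
    """Hypothetical record for one social media entry treated as an observation."""

    username: str              # stands in for the human originator
    phenomenon_time: datetime  # when the phenomenon was captured, e.g. the EXIF datestamp
    result_time: datetime      # when the result became available, e.g. the upload time

    def reporting_delay_seconds(self) -> float:
        """Seconds elapsed between capturing the phenomenon and publishing the result."""
        return (self.result_time - self.phenomenon_time).total_seconds()


# Made-up sample values, for illustration only.
photo = CrowdObservation(
    username="example_user",
    phenomenon_time=datetime(2016, 8, 31, 9, 0, tzinfo=timezone.utc),
    result_time=datetime(2016, 8, 31, 9, 45, tzinfo=timezone.utc),
)
print(photo.reporting_delay_seconds())
```

Keeping the two times separate lets a consumer judge how stale a report was when it was published, which is one input to an assessment of its quality.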

This seems to be related to "value assignment", as described in ISO
19109:2015 §7.4.10 ...

"The feature type OM_Observation (ISO 19156:2011), and the classes
LI_ProcessStep (ISO 19115-1:2014) and MI_Event (ISO 19115-2:2009) may be
interpreted as instances of ValueAssignment (Figure 6). The description of
a specific observation can provide an evaluation of the likely error in a
property value."

ISO 19109:2015 also notes that other examples of value assignment are:
assertion (in which a property value is assigned by a competent authority),
inheritance (from a parent feature in which a property value is duplicated
from a well defined source) and derivation (in which a property value is
derived from a number of input parameters based on a defined ruleset).

Is this another example where we are interested in the metadata about a
particular spatial property value? This seems to be related to
@robatkinson's concern about "requirements for units of measure, precision
and accuracy" [1] where he states:

IMHO there is a _requirement_ for a canonical mechanism to state aspects of
spatial properties.

I also think it's a requirement for that mechanism to be common with broader
requirements for publishing metadata.  It's not a uniquely spatial concern -
but spatial is typified by this being relevant at dataset, service, feature
and property levels of granularity and the need for users to be able to
discover and interpret this information regardless of where it is best

In the case of crowd-sourced spatial data, the likely error in a property
value is one of the "aspects of spatial properties" ... and expressing the
observation metadata ("human as sensor") is one way to capture this.

BTW, does this only apply to social media applications - ref. OGC #15-057r2
"Testbed-11 Incorporating Social Media in Emergency Response Engineering
Report" - or does it also apply to volunteer geographic information
such as OpenStreetMap? In this case OSM appears to retain metadata about
who made the change, when the change was made, which changeset it was part
of ... the 'value assignment' metadata is present, but it's not in the form
of an "observation".
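
To make the distinction concrete, here is a minimal sketch of "value assignment" metadata attached directly to a property value rather than wrapped in an observation. The class names, the describe helper and the sample values are hypothetical; they mirror the who / when / changeset information mentioned above, not any actual OSM schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ValueAssignment:
    """Hypothetical metadata about how a property value came to be assigned."""

    user: str            # who made the change
    timestamp: datetime  # when the change was made
    changeset: int       # which changeset the change was part of


@dataclass(frozen=True)
class PropertyValue:
    """A property value that carries its own value-assignment metadata."""

    name: str
    value: Any
    assigned_by: ValueAssignment


def describe(prop: PropertyValue) -> str:
    """Summarise a property value together with its provenance."""
    meta = prop.assigned_by
    return f"{prop.name}={prop.value} (set by {meta.user} in changeset {meta.changeset})"


# Made-up sample values, for illustration only.
edit = ValueAssignment(
    user="example_mapper",
    timestamp=datetime(2016, 9, 1, 12, 0, tzinfo=timezone.utc),
    changeset=1,
)
highway = PropertyValue(name="highway", value="residential", assigned_by=edit)
print(describe(highway))
```

The same ValueAssignment record could equally be attached at feature or dataset level; the sketch only shows the property-value case.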

Thinking more broadly, is the best practice about capturing the "value
assignment" metadata for spatial data? Depending on the granularity of the
data, this might be expressed at the dataset, feature or property-value
level. Use of the Observation (meta) model is one example of how to achieve

All that said, I'm left wondering about how this might be incorporated into
the BP document ... is it a data quality issue (for §10.2 "Spatial Data
Quality") or is it a metadata issue (for §10.1 "Spatial Metadata")?

Am I making sense?


[1]: https://lists.w3.org/Archives/Public/public-sdw-wg/2016Aug/0251.html

On Wed, 31 Aug 2016 at 14:25 Joshua Lieberman <jlieberman@tumblingwalls.com> wrote:

> Jeremy,
> The experience of several OGC activities has been that it is valuable and
> perhaps not widely enough recognized to treat crowd-sourced “spatial data”
> as observations. This allows the input to be utilized, but recognizes there
> may be ambiguities in the sensor capabilities and measurement quality,
> as well as just what the features of interest and sampling features really
> are.
> A case could be made for a best practice that recommends this, even if a
> validation procedure subsequently builds feature data from those social
> observations.
> Josh
> On Aug 31, 2016, at 8:59 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
> Hi -
> BP 17 "How to work with crowd sourced observations" [1] is an 'orphaned'
> best practice that is not well aligned with DWBP.
> In the BP sub-group call on 24-Aug (minutes [2]), we were unable to decide
> whether this candidate best practice is appropriate.
> Concerns were raised that there is not anything inherently "spatial" about
> crowd sourced data - and, moreover, that introducing crowd sourcing as a
> topic is a "can of worms" (particularly relating to governance of that
> information etc.).
> The feeling on the call was that we should remove the BP and use crowd
> sourced observations as an example of spatial data within the Nieuwhaven
> flooding scenario.
> I took an action to consider if there was anything "special" that we
> needed to capture ...
> Having thought about this further, I conclude that the governance
> arrangements associated with Volunteer Geographic Information (VGI) and
> other crowd sourced spatial data are out of scope. However, we should
> examine how aggregators of such information, such as social media
> platforms, may choose to expose this data on the web.
> In particular, users of social media do not routinely use URIs to identify
> spatial things; relying instead on addresses or geocodes (e.g. What3Words).
> Address, geocode, geographic position (e.g. latitude and longitude) must
> all be considered attributes of some spatial thing. If, for example, an
> address is provided without an explicit relationship to a spatial thing
> then we must infer that a spatial thing exists, and the (social media)
> platform provider should mint an identifier for it and relate it to the
> address.
> Such inferred spatial things may be reconciled with other spatial things
> if one can be sure that they are indeed the same; either immediately by the
> data curator or later by any sufficiently knowledgeable party. However,
> such reconciliation is complex (a long time ago, @eparsons said "there be
> dragons").
> We may even be able to help individuals using social media platforms
> improve their spatial data by selecting a particular spatial thing … e.g.
> twitter provides choice of location based on geocoding and sometimes a
> specific spatial thing (based on foursquare).
> So ...
> PROPOSAL: we remove BP 17, cover spatial data in social media as an
> example and treat social media / crowd-sourced data platform providers as a
> source of spatial data on the web that should follow our BPs. (more or
> less).
> Voting:
> +1
> Your thoughts please.
> Jeremy
> [1]: http://w3c.github.io/sdw/bp/#crowd-obs
> [2]: http://www.w3.org/2016/08/24-sdwbp-minutes.html
Received on Friday, 2 September 2016 11:25:41 UTC

