Re: Decision required: BP 17 "How to work with crowd sourced observations" from Scott Simmons on 2016-09-02 (public-sdw-wg@w3.org from September 2016)

From: Scott Simmons <ssimmons@opengeospatial.org>
Date: Fri, 2 Sep 2016 07:54:36 -0600
To: Rob Atkinson <rob@metalinkage.com.au>
Cc: Jeremy Tandy <jeremy.tandy@gmail.com>, Joshua Lieberman <jlieberman@tumblingwalls.com>, SDW WG Public List <public-sdw-wg@w3.org>
Message-Id: <F92AD35C-064E-431C-978C-1526C9DC4CE7@opengeospatial.org>
Makes a lot of sense to me and, yes, this would be a good topic for further exploration in a Testbed.

Scott

> On Sep 2, 2016, at 6:54 AM, Rob Atkinson <rob@metalinkage.com.au> wrote:
> 
> Yep - I think this is well summarised - but we do need to lurch towards something useful to say. 
> 
> Less is more, but it does feel like the requirements are that the user can access (and recognise) relevant metadata - but that metadata may be handled at different levels of detail.
> 
> Therefore I would summarise the relevant sub-requirements as:
> 1) for a given piece of data that is part of a larger collection, metadata may be provided at either the data record or the dataset level, and a link to the dataset metadata should be provided from data records.
> 2) consideration should be given to using the Observations and Measurements model for handling relevant aspects relating to value assignment.
> 3) the relationship between the formalisms used to describe spatial attributes of data in different parts of the system (dataset, service, feature and property) must be explicit and accessible to the user, and this is best facilitated by using the same formalisms in each case.
> 
> I think there is perhaps one level of indirection for metadata I missed though - and that is indirect through the value assignment process - i.e. implicit in social media data. I still think this is perhaps best expressed as dataset metadata.
> 
> I notice the POWDER spec neatly supports third party provision of metadata. This would be a great testbed topic for OGC :-)
> 
> rob
> 
> 
> 
> 
> On Fri, 2 Sep 2016 at 21:24 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
> So I think you're saying that the aspect which is 'special' for crowd-sourced spatial data is data quality? ... and that the quality can be characterised through "observation" metadata (ref. "humans as sensors").
> 
> Looking at OGC #15-057r2 "Testbed-11 Incorporating Social Media in Emergency Response Engineering Report", it seems that the pertinent metadata being captured for each social media entry (?) is username (and by inference, the human originator) and time.
> 
> [ I guess that for "time" it's possible in some cases to capture both phenomenon time and result time; e.g. a digital photo may include the datestamp when the photo was taken within the EXIF metadata which would be the phenomenon time, while result time would be when it was uploaded to the social media platform ]
> 
> This seems to be related to "value assignment", as described in ISO 19109:2015 §7.4.10 ... 
> 
> "The feature type OM_Observation (ISO 19156:2011), and the classes LI_ProcessStep (ISO 19115-1:2014) and MI_Event (ISO 19115-2:2009) may be interpreted as instances of ValueAssignment (Figure 6). The description of a specific observation can provide an evaluation of the likely error in a property value."
> 
> ISO 19109:2015 also notes that other examples of value assignment are: assertion (in which a property value is assigned by a competent authority), inheritance (from a parent feature in which a property value is duplicated from a well defined source) and derivation (in which a property value is derived from a number of input parameters based on a defined ruleset).
> 
> Is this another example where we are interested in the metadata about a particular spatial property value? This seems to be related to @robatkinson's concern about "requirements for units of measure, precision and accuracy" [1] where he states:
> IMHO there is a _requirement_ for a canonical mechanism to state aspects of
> spatial properties.
> I also think it's a requirement for that mechanism to be common with broader
> requirements for publishing metadata.  It's not a uniquely spatial concern -
> but spatial is typified by this being relevant at dataset, service, feature
> and property levels of granularity and the need for users to be able to
> discover and interpret this information regardless of where it is best
> placed.
> 
> In the case of crowd-sourced spatial data, the likely error in a property value is one of the "aspects of spatial properties" ... and expressing the observation metadata ("human as sensor") is one way to capture this.
> 
> BTW, does this only apply to social media applications - ref. OGC #15-057r2 "Testbed-11 Incorporating Social Media in Emergency Response Engineering Report" - or does it also apply to volunteer geographic information such as OpenStreetMap? In this case OSM appears to retain metadata about who made the change, when the change was made, which changeset it was part of ... the 'value assignment' metadata is present, but it's not in the form of an "observation".
> 
> Thinking more broadly, is the best practice about capturing the "value assignment" metadata for spatial data? Depending on the granularity of the data, this might be expressed at the dataset, feature or property-value level, with use of the Observation (meta) model being one example of how to achieve this.
> 
> All that said, I'm left wondering about how this might be incorporated into the BP document ... is it a data quality issue (for §10.2 "Spatial Data Quality") or is it a metadata issue (for §10.1 "Spatial Metadata")?
> 
> Am I making sense?
> 
> Jeremy
> 
> [1]: https://lists.w3.org/Archives/Public/public-sdw-wg/2016Aug/0251.html
> 
> On Wed, 31 Aug 2016 at 14:25 Joshua Lieberman <jlieberman@tumblingwalls.com> wrote:
> Jeremy,
> 
> The experience of several OGC activities has been that it is valuable, and perhaps not widely enough recognized, to treat crowd-sourced “spatial data” as observations. This allows the input to be utilized, but recognizes there may be ambiguities in the sensor capabilities and measurement quality, as well as just what the features of interest and sampling features really are.
> 
> A case could be made for a best practice that recommends this, even if a validation procedure subsequently builds feature data from those social observations.
> 
> Josh
> 
>> On Aug 31, 2016, at 8:59 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>> 
>> Hi -
>> 
>> BP 17 "How to work with crowd sourced observations" [1] is an 'orphaned' best practice that is not well aligned with DWBP.
>> 
>> In the BP sub-group call on 24-Aug (minutes [2]), we were unable to decide whether this candidate best practice is appropriate.
>> 
>> Concerns were raised that there is not anything inherently "spatial" about crowd sourced data - and, moreover, that introducing crowd sourcing as a topic is a "can of worms" (particularly relating to governance of that information etc.).
>> 
>> The feeling on the call was that we should remove the BP and use crowd sourced observations as an example of spatial data within the Nieuwhaven flooding scenario.
>> 
>> I took an action to consider if there was anything "special" that we needed to capture ... 
>> 
>> Having thought about this further, I conclude that the governance arrangements associated with Volunteer Geographic Information (VGI) and other crowd sourced spatial data are out of scope. However, we should examine how aggregators of such information, such as social media platforms, may choose to expose this data on the web.
>> 
>> In particular, users of social media do not routinely use URIs to identify spatial things, relying instead on addresses or geocodes (e.g. What3Words).
>> Address, geocode, geographic position (e.g. latitude and longitude) must all be considered attributes of some spatial thing. If, for example, an address is provided without an explicit relationship to a spatial thing then we must infer that a spatial thing exists, and the (social media) platform provider should mint an identifier for it and relate it to the address.
>> 
>> Such inferred spatial things may be reconciled with other spatial things if one can be sure that they are indeed the same; either immediately by the data curator or later by any sufficiently knowledgeable party. However, such reconciliation is complex (a long time ago, @eparsons said "there be dragons").
>> 
>> We may even be able to help individuals using social media platforms improve their spatial data by selecting a particular spatial thing … e.g. Twitter provides a choice of location based on geocoding and sometimes a specific spatial thing (based on Foursquare).
>> 
>> So ... 
>> 
>> PROPOSAL: we remove BP 17, cover spatial data in social media as an example and treat social media / crowd-sourced data platform providers as a source of spatial data on the web that should follow our BPs (more or less).
>> 
>> Voting:
>> 
>> +1
>> 
>> 
>> 
>> Your thoughts please.
>> 
>> Jeremy
>> 
>> 
>> 
>> 
>> [1]: http://w3c.github.io/sdw/bp/#crowd-obs
>> [2]: http://www.w3.org/2016/08/24-sdwbp-minutes.html
>> 
>
Received on Friday, 2 September 2016 13:55:05 UTC