Re: Decision required: BP 17 "How to work with crowd sourced observations" from Jeremy Tandy on 2016-09-02 (public-sdw-wg@w3.org from September 2016)

From: Jeremy Tandy <jeremy.tandy@gmail.com>
Date: Fri, 02 Sep 2016 14:32:50 +0000
To: Scott Simmons <ssimmons@opengeospatial.org>, Rob Atkinson <rob@metalinkage.com.au>
Cc: Joshua Lieberman <jlieberman@tumblingwalls.com>, SDW WG Public List <public-sdw-wg@w3.org>
Message-ID: <CADtUq_1RnbWFmcznXOZu=0dvyxMrS62SPSrMA9XM+B7Vwp_uNQ@mail.gmail.com>
Hi Rob-

maybe it's Friday, maybe it's because I'm trying to understand several
discussion threads in parallel ... but I don't understand your point (3).
Can you illustrate with an example or two?

Thanks, Jeremy

On Fri, 2 Sep 2016 at 14:54 Scott Simmons <ssimmons@opengeospatial.org>
wrote:

> Makes a lot of sense to me and, yes, this would be a good topic for
> further exploration in a Testbed.
>
> Scott
>
> On Sep 2, 2016, at 6:54 AM, Rob Atkinson <rob@metalinkage.com.au> wrote:
>
> Yep - I think this is well summarised - but we do need to lurch towards
> something useful to say.
>
> Less is more, but it does feel like the requirements are that the user can
> access (and recognise) relevant metadata - but that metadata may be handled
> at different levels of detail.
>
> Therefore I would summarise the relevant sub-requirements as:
> 1) for a given piece of data that is part of a larger collection, metadata
> may be provided at either the data record or the data set level, and a link
> to the dataset metadata should be provided from data records.
> 2) consideration should be given to using the Observations and Measurements
> model for handling relevant aspects relating to value assignment.
> 3) the relationship between the formalisms used to describe spatial
> attributes of data in different parts of the system (dataset, service,
> feature and property) must be explicit and accessible to the user, and this
> is best facilitated by using the same formalisms in each case.
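
Sub-requirement (1) can be read as a lookup rule: use record-level metadata when it exists, otherwise follow the record's link to the dataset metadata. The following is a minimal sketch of that reading, not something proposed in the thread. The `DATASETS` table, the `example.org` identifier, the `positional_accuracy_m` key and the `resolve_metadata` helper are all hypothetical names chosen for illustration.

```python
from typing import Any, Optional

# Hypothetical dataset-level metadata, keyed by a dataset identifier.
DATASETS = {
    "https://example.org/dataset/flood-reports": {
        "crs": "http://www.opengis.net/def/crs/EPSG/0/4326",
        "positional_accuracy_m": 50.0,
    }
}


def resolve_metadata(record: dict, key: str) -> Optional[Any]:
    """Return record-level metadata if present, else follow the dataset link."""
    own = record.get("metadata", {})
    if key in own:
        return own[key]
    # Fall back to the dataset that the record links to, if any.
    dataset = DATASETS.get(record.get("dataset"), {})
    return dataset.get(key)


# A record that relies entirely on dataset-level metadata.
coarse = {
    "dataset": "https://example.org/dataset/flood-reports",
    "lat": 51.9,
    "lon": 4.5,
}
# A record that overrides one item at the record level.
precise = {**coarse, "metadata": {"positional_accuracy_m": 5.0}}

print(resolve_metadata(coarse, "positional_accuracy_m"))
print(resolve_metadata(precise, "positional_accuracy_m"))
```

Because both levels use the same keys, the user does not need to know where a given item is stored, which is the point of sub-requirement (3).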
>
> I think there is perhaps one level of indirection for metadata that I
> missed, though: metadata that is indirect through the value assignment
> process, i.e. implicit in social media data. I still think this is perhaps
> best expressed as dataset metadata.
>
> I notice the POWDER spec neatly supports third party provision of
> metadata. This would be a great testbed topic for OGC :-)
>
> rob
>
>
>
>
> On Fri, 2 Sep 2016 at 21:24 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
>> So I think you're saying that the aspect which is 'special' for crowd-sourced
>> spatial data is data quality? ... and that the quality can be characterised
>> through "observation" metadata (ref. "humans as sensors").
>>
>> Looking at OGC #15-057r2 "Testbed-11 Incorporating Social Media in
>> Emergency Response Engineering Report", it seems that the pertinent
>> metadata being captured for each social media entry (?) is username (and by
>> inference, the human originator) and time.
>>
>> [ I guess that for "time" it's possible in some cases to capture both
>> phenomenon time and result time; e.g. a digital photo may include the
>> datestamp when the photo was taken within the EXIF metadata which would be
>> the phenomenon time, while result time would be when it was uploaded to the
>> social media platform ]
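
To make the two timestamps concrete, here is a minimal sketch of a photo treated as an observation. It assumes the EXIF `DateTimeOriginal` value is available as a string and that the camera clock was set to UTC (EXIF timestamps in this tag carry no time zone). The `PhotoObservation` class and `observation_from_photo` function are hypothetical names, not part of any cited report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Layout of the EXIF DateTimeOriginal tag, e.g. "2016:08:31 09:15:00".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


@dataclass(frozen=True)
class PhotoObservation:
    username: str              # the human originator ("human as sensor")
    phenomenon_time: datetime  # when the photo was taken
    result_time: datetime      # when it was uploaded to the platform


def observation_from_photo(
    username: str, exif_datetime_original: str, uploaded_at: datetime
) -> PhotoObservation:
    # Assumption: the camera clock was set to UTC.
    taken = datetime.strptime(exif_datetime_original, EXIF_FORMAT)
    taken = taken.replace(tzinfo=timezone.utc)
    return PhotoObservation(username, taken, uploaded_at)


obs = observation_from_photo(
    "example_user",
    "2016:08:31 09:15:00",
    datetime(2016, 8, 31, 10, 2, tzinfo=timezone.utc),
)
# Delay between the event and its publication.
lag: timedelta = obs.result_time - obs.phenomenon_time
print(lag)
```

Keeping both times lets a consumer judge how stale a report was when it was published.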
>>
>> This seems to be related to "value assignment", as described in ISO
>> 19109:2015 §7.4.10 ...
>>
>> "The feature type OM_Observation (ISO 19156:2011), and the classes
>> LI_ProcessStep (ISO 19115-1:2014) and MI_Event (ISO 19115-2:2009) may be
>> interpreted as instances of ValueAssignment (Figure 6). The description of
>> a specific observation can provide an evaluation of the likely error in a
>> property value."
>>
>> ISO 19109:2015 also notes that other examples of value assignment are:
>> assertion (in which a property value is assigned by a competent authority),
>> inheritance (from a parent feature in which a property value is
>> duplicated from a well defined source) and derivation (in which a
>> property value is derived from a number of input parameters based on a
>> defined ruleset).
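
The four kinds of value assignment named above could be carried alongside a property value roughly as follows. This is only an illustrative sketch: `ValueAssignmentKind`, `AssignedValue`, `describe` and the example property are hypothetical names and do not come from ISO 19109.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ValueAssignmentKind(Enum):
    OBSERVATION = "observation"  # value estimated by an observation
    ASSERTION = "assertion"      # value assigned by a competent authority
    INHERITANCE = "inheritance"  # value duplicated from a parent feature
    DERIVATION = "derivation"    # value computed from inputs by a ruleset


@dataclass(frozen=True)
class AssignedValue:
    property_name: str
    value: Any
    kind: ValueAssignmentKind
    source: str  # who or what assigned the value


def describe(av: AssignedValue) -> str:
    """Render a property value together with how it was assigned."""
    return f"{av.property_name}={av.value!r} ({av.kind.value}, source: {av.source})"


depth = AssignedValue(
    "water_depth_m", 0.4, ValueAssignmentKind.OBSERVATION, "example_user"
)
print(describe(depth))
```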
>>
>> Is this another example where we are interested in the metadata about a
>> particular spatial property value? This seems to be related to
>> @robatkinson's concern about "requirements for units of measure, precision
>> and accuracy" [1] where he states:
>>
>> IMHO there is a _requirement_ for a canonical mechanism to state aspects of
>> spatial properties.
>>
>> I also think it's a requirement for that mechanism to be common with broader
>> requirements for publishing metadata.  It's not a uniquely spatial concern -
>> but spatial is typified by this being relevant at dataset, service, feature
>> and property levels of granularity and the need for users to be able to
>> discover and interpret this information regardless of where it is best
>> placed.
>>
>> In the case of crowd-sourced spatial data, the likely error in a
>> property value is one of the "aspects of spatial properties" ... and
>> expressing the observation metadata ("human as sensor") is one way to
>> capture this.
>>
>> BTW, does this only apply to social media applications - ref. OGC #15-057r2
>> "Testbed-11 Incorporating Social Media in Emergency Response Engineering
>> Report" - or does it also apply to volunteer geographic information
>> such as OpenStreetMap? In this case OSM appears to retain metadata about
>> who made the change, when the change was made, which changeset it was part
>> of ... the 'value assignment' metadata is present, but it's not in the form
>> of an "observation".
>>
>> Thinking more broadly, is the best practice about capturing the "value
>> assignment" metadata for spatial data? Depending on the granularity of the
>> data, this might be expressed at the dataset, feature or property-value
>> level, and use of the Observation (meta) model would be one example of how
>> to achieve this.
>>
>> All that said, I'm left wondering about how this might be incorporated
>> into the BP document ... is it a data quality issue (for §10.2 "Spatial
>> Data Quality") or is it a metadata issue (for §10.1 "Spatial Metadata")?
>>
>> Am I making sense?
>>
>> Jeremy
>>
>> [1]: https://lists.w3.org/Archives/Public/public-sdw-wg/2016Aug/0251.html
>>
>>
>> On Wed, 31 Aug 2016 at 14:25 Joshua Lieberman <
>> jlieberman@tumblingwalls.com> wrote:
>>
>>> Jeremy,
>>>
>>> The experience of several OGC activities has been that it is valuable
>>> and perhaps not widely enough recognized to treat crowd-sourced “spatial
>>> data” as observations. This allows the input to be utilized, but recognizes
>>> there may be ambiguities in the sensor capabilities and measurement
>>> quality, as well as just what the features of interest and sampling
>>> features really are.
>>>
>>> A case could be made for a best practice that recommends this, even if a
>>> validation procedure subsequently builds feature data from those social
>>> observations.
>>>
>>> Josh
>>>
>>> On Aug 31, 2016, at 8:59 AM, Jeremy Tandy <jeremy.tandy@gmail.com>
>>> wrote:
>>>
>>> Hi -
>>>
>>> BP 17 "How to work with crowd sourced observations" [1] is an 'orphaned'
>>> best practice that is not well aligned with DWBP.
>>>
>>> In the BP sub-group call on 24-Aug (minutes [2]), we were unable to
>>> decide whether this candidate best practice is appropriate.
>>>
>>> Concerns were raised that there is not anything inherently "spatial"
>>> about crowd sourced data - and, moreover, that introducing crowd sourcing
>>> as a topic is a "can of worms" (particularly relating to governance of that
>>> information etc.).
>>>
>>> The feeling on the call was that we should remove the BP and use crowd
>>> sourced observations as an example of spatial data within the Nieuwhaven
>>> flooding scenario.
>>>
>>> I took an action to consider if there was anything "special" that we
>>> needed to capture ...
>>>
>>> Having thought about this further, I conclude that the governance
>>> arrangements associated with Volunteer Geographic Information (VGI) and
>>> other crowd sourced spatial data are out of scope. However, we should
>>> examine how aggregators of such information, such as social media
>>> platforms, may choose to expose this data on the web.
>>>
>>> In particular, users of social media do not routinely use URIs to
>>> identify spatial things; relying instead on addresses or geocodes (e.g.
>>> What3Words).
>>>
>>> Address, geocode, geographic position (e.g. latitude and longitude) must
>>> all be considered attributes of some spatial thing. If, for example, an
>>> address is provided without an explicit relationship to a spatial thing
>>> then we must infer that a spatial thing exists, and the (social media)
>>> platform provider should mint an identifier for it and relate it to the
>>> address.
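
The minting step described above might look something like the sketch below. It assumes a platform-owned namespace and a fresh random identifier per inferred spatial thing, so that two mentions of the same address are not silently treated as the same thing. `BASE_URI`, `SpatialThing` and `mint_spatial_thing` are hypothetical names, and the address is made up.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace owned by the platform provider.
BASE_URI = "https://example.org/id/spatial-thing/"


@dataclass(frozen=True)
class SpatialThing:
    uri: str      # the minted identifier
    address: str  # the address is an attribute of the thing, not the thing itself


def mint_spatial_thing(address: str) -> SpatialThing:
    """Mint a new identifier for the spatial thing implied by an address."""
    # A random UUID avoids asserting that two uses of an address refer to
    # the same spatial thing; reconciliation is left as a later step.
    return SpatialThing(uri=BASE_URI + str(uuid.uuid4()), address=address.strip())


a = mint_spatial_thing("1 Example Street, Nieuwhaven")
b = mint_spatial_thing("1 Example Street, Nieuwhaven")
print(a.uri != b.uri)
```

Deriving the identifier deterministically from the address text would be an alternative design, but it would amount to automatic reconciliation.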
>>>
>>> Such inferred spatial things may be reconciled with other spatial things
>>> if one can be sure that they are indeed the same; either immediately by the
>>> data curator or later by any sufficiently knowledgeable party. However,
>>> such reconciliation is complex (a long time ago, @eparsons said "there be
>>> dragons").
>>>
>>> We may even be able to help individuals using social media platforms
>>> improve their spatial data by selecting a particular spatial thing … e.g.
>>> Twitter provides a choice of location based on geocoding and sometimes a
>>> specific spatial thing (based on Foursquare).
>>>
>>> So ...
>>>
>>> PROPOSAL: we remove BP 17, cover spatial data in social media as an
>>> example and treat social media / crowd-sourced data platform providers as a
>>> source of spatial data on the web that should follow our BPs (more or
>>> less).
>>>
>>> Voting:
>>>
>>> +1
>>>
>>>
>>> Your thoughts please.
>>>
>>> Jeremy
>>>
>>>
>>>
>>> [1]: http://w3c.github.io/sdw/bp/#crowd-obs
>>> [2]: http://www.w3.org/2016/08/24-sdwbp-minutes.html
>>>
>>>
>>>
>>>
>
Received on Friday, 2 September 2016 14:33:30 UTC