Re: Best Way to Extend the Geo Vocabulary to include an "error" or "extent" radius in meters

Hi Paul,

Thanks for bringing up these relevant issues.

The geo vocabulary assumes the lat and long are in WGS84.

One of the problems we were seeing is records that were georeferenced
to the center point of a Canadian province, where it was not clear whether
that was actually where the species was observed. Was it at that specific
point, or did someone simply geocode it later to the center of the
province? Part of this relates to your comment on provenance.
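
As a rough sketch of how such records might be flagged, one could compare
each record against a list of known centroids and mark anything that falls
suspiciously close to one. The function names, the 100 m tolerance, and the
placeholder centroid below are all assumptions for illustration, not real
reference data.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def looks_like_centroid(lat, lon, centroids, tolerance_m=100.0):
    """Return the name of the first centroid within tolerance_m, else None."""
    for name, (clat, clon) in centroids.items():
        if haversine_m(lat, lon, clat, clon) <= tolerance_m:
            return name
    return None


# Hypothetical placeholder coordinates, not an authoritative centroid.
CENTROIDS = {"example-province": (55.0, -115.0)}
```

A match does not prove the record was geocoded after the fact; it only
marks it as worth a second look.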

I had hoped to incorporate both the extent ("the actual area that was
sampled") and the error in the GPS reading into one all-encompassing
radius. Sometimes people take a GPS reading and then observe butterflies
or plants that are not exactly at that point but within 50 to 100 meters
of it.
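
One simple way to get a single radius, assuming both values are given in
meters, is to add them. This is only a sketch: the function name is
hypothetical, and summing is a conservative choice rather than an agreed
convention.

```python
def combined_radius_m(extent_m, gps_error_m):
    """Conservative single radius covering sampling extent plus GPS error.

    Assumption: the organism lies within extent_m of the recorded reading,
    and the reading lies within gps_error_m of the true position, so by the
    triangle inequality the sum bounds the total offset.
    """
    if extent_m < 0 or gps_error_m < 0:
        raise ValueError("radii must be non-negative")
    return extent_m + gps_error_m
```

For example, a 100 m sampling extent with a 10 m GPS error would give a
radius of 110 m.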

I am hoping that the issues you mention with New York etc. would be
handled by a separate set of statements that use some standard vocabulary
like geonames to indicate the county or state where the observation was
made.
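
To make the idea concrete, here is a minimal sketch that writes the point,
the radius, and the containing region as separate N-Triples statements. Only
the geo lat and long properties are real; the radiusInMeters and inRegion
properties, the example.org namespace, and the URIs passed in are
hypothetical placeholders.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/vocab#"  # hypothetical namespace, not a real vocabulary


def observation_triples(obs_uri, lat, lon, radius_m, region_uri):
    """Return N-Triples lines for one observation.

    The point uses geo:lat / geo:long; the radius and the region link use
    made-up properties that stand in for whatever terms are finally chosen.
    """
    return [
        f'<{obs_uri}> <{GEO}lat> "{lat}" .',
        f'<{obs_uri}> <{GEO}long> "{lon}" .',
        f'<{obs_uri}> <{EX}radiusInMeters> "{radius_m}"^^<{XSD}decimal> .',
        f'<{obs_uri}> <{EX}inRegion> <{region_uri}> .',
    ]
```

Keeping the region as its own statement means a consumer that only
understands points can ignore it without losing the lat and long.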

Thanks Again,

- Pete

On Mon, Oct 11, 2010 at 9:47 AM, Paul Houle <ontology2@gmail.com> wrote:

> On Thu, Oct 7, 2010 at 5:28 PM, Peter DeVries <pete.devries@gmail.com> wrote:
>
>> Hi LOD'ers,
>>
>> There was some discussion about ways to record species observations using
>> the geo vocabulary at a recent biodiversity informatics meeting.
>>
>> Some see the advantages of using the geo standard, but we really need to
>> have a way to incorporate an error or extent in meters.
>>
>>
>
>     For Ookaboo I've worked out an internal data model for points;  Ookaboo
> also knows about real shapes,  but the fact is that most people out there
> will throw points at you and only know how to consume points.
>
>     Here are a few bits of extra data that are useful to add to a point
>
> (1) provenance
> (2) datum (I try to stick to WGS84,  but points from freebase occasionally
> have a Datum attached,  so I store it)
> (3) circular error (the accuracy of the determination of the point,  for
> instance the technical limitation of a GPS receiver)
> (4) scale length of feature (how accurate do we have to be?  it's not worth
> getting into an edit war over the exact point that represents,  say,
> Finland.)
> (5) an overall quality rating (so if we've got ten points we can pick the
> best)
>
>     Note that (3) and (4) are weakly exclusive of each other.  If,  for
> instance,  I'm representing a GPS point that the camera was at when a photo
> was taken,  that's literally just a point,  and (4) is zero.  On the other
> hand,  (4) >> (3) if the point represents "New York City",  because you
> can't really fix the location of something that size to better than a few km.
>
>     This distinction is really important in doing data cleaning work.  If
> several sources disagree about the location of NYC by a few km,  it's best
> to let them "agree to disagree" -- either pick the one you like the best
> (Wikipedia centers NYC at the intersection north of the Port Authority and
> the NY Times building,  sweet...) or you take the centroid of them.  No
> matter what you do,  you want to use a cheap heuristic and not spend
> expensive resources on this non-problem.
>
>     On the other hand,  if coordinates for the "Statue of Liberty" were off
> by a km from different sources,   that indicates a real problem,  and
> some action ought to be taken.
>
>
>



-- 
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------

Received on Monday, 11 October 2010 16:52:41 UTC