Re: Clarification required: BP6 "use HTTP URIs for spatial things"

From: Jeremy Tandy <jeremy.tandy@gmail.com>
Date: Thu, 01 Sep 2016 22:24:58 +0000
Message-ID: <CADtUq_03+L2GBQ0pCY_w=KLgvx2Fu-P9Aftf2LHE1N509WTxKA@mail.gmail.com>
To: Rob Atkinson <rob@metalinkage.com.au>, janowicz@ucsb.edu, Joshua Lieberman <jlieberman@tumblingwalls.com>
Cc: Frans Knibbe <frans.knibbe@geodan.nl>, SDW WG Public List <public-sdw-wg@w3.org>
@roba:

> Circling in on a resolution here I hope

Feels that way to me.

> we need to make a very clear statement that geo "Features" are things
> that have comparable URIs - but instances of "FeatureTypes" are
> representations that are related

So, put another way, "instances of FeatureTypes" are [like] graphs of
information about a given Feature?

> Do we recommend [use of <foaf:isPrimaryTopicOf>], or leave it free

This is just one of many properties that might be used. Another is the
describedby Link Relation defined by POWDER-DR [1]. Given that POWDER-DR is
a W3C REC, this would give it the edge for me ... and it's available for
use beyond the realms of RDF given its inclusion in the IANA Link Relations
registry [2].
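
As a rough sketch of what describedby could look like outside RDF, the
snippet below builds and parses an HTTP Link header in Python. The URIs and
the helper names (make_link_header, parse_link_header) are made up for
illustration; only the relation name "describedby" comes from the IANA
registry, and the parser only handles the simple form shown here.

```python
import re

# Hypothetical identifiers, not real resources.
THING_URI = "http://example.org/id/lighthouse/eddystone"
DOC_URI = "http://example.org/doc/lighthouse/eddystone"


def make_link_header(target, rel):
    """Format one HTTP Link header value: <target>; rel="..."."""
    return f'<{target}>; rel="{rel}"'


def parse_link_header(value):
    """Map each relation name to its target URI (simple values only)."""
    links = {}
    for target, rel in re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', value):
        # A rel attribute may carry several space-separated relation names.
        for name in rel.split():
            links[name] = target
    return links


# A response for THING_URI could point at the document that describes it.
header = make_link_header(DOC_URI, "describedby")
links = parse_link_header(header)
print("Link:", header)
print("described by:", links["describedby"])
```

A client that only cares about the spatial thing can ignore the header
entirely, which is part of the appeal.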

There may be others that the working group prefer.

All that said, I see the majority of folks being happy to work with the
(indirect) identifiers for their spatial things / features without
concerning themselves with identified representations.
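
To illustrate, here is a small Python sketch that treats the two lighthouse
URIs from the GML example quoted further down as owl:sameAs and pools their
properties. The dictionary layout and the merge_same_as helper are
assumptions for illustration, not anything defined in the BP doc. Conflicting
values (the two point geometries) are kept side by side, because picking the
"best" one still needs human judgement.

```python
from collections import defaultdict

# Property values taken from the GML example in this thread.
records = {
    "http://example.com/sar/features/vo/EDY": {
        "name": "Eddystone",
        "height_m": 41,
        "pos": (50.184, -4.268),
    },
    "http://example.org/maritime/navaid/2650253": {
        "name": "Eddystone Lighthouse",
        "pos": (50.2, -4.3),
    },
}

# Pairs of identifiers asserted to denote the same spatial thing.
same_as = [
    ("http://example.com/sar/features/vo/EDY",
     "http://example.org/maritime/navaid/2650253"),
]


def merge_same_as(records, same_as):
    """Pool properties of URIs linked by owl:sameAs (hypothetical helper)."""
    parent = {uri: uri for uri in records}

    def find(uri):
        # Follow parent links to the representative URI of the group.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            uri = parent[uri]
        return uri

    for a, b in same_as:
        parent[find(b)] = find(a)

    merged = defaultdict(lambda: defaultdict(list))
    for uri, props in records.items():
        for key, value in props.items():
            values = merged[find(uri)][key]
            if value not in values:
                values.append(value)
    return {root: dict(props) for root, props in merged.items()}


merged = merge_same_as(records, same_as)
for key, values in merged["http://example.com/sar/features/vo/EDY"].items():
    print(key, values)
```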

Jeremy

[1]: https://www.w3.org/TR/powder-dr/#appD
[2]: http://www.iana.org/assignments/link-relations/link-relations.xhtml

On Thu, 1 Sep 2016 at 23:06 Rob Atkinson <rob@metalinkage.com.au> wrote:

>
> Circling in on a resolution here I hope :-)
>
> from what I am seeing -
>
> we need to make a very clear statement that geo "Features" are things that
> have comparable URIs - but instances of "FeatureTypes" are representations
> that are related. These representations are properties of the Feature that
> may be combined using owl:sameAs between the Features, but not the
> representation (FeatureType) instances
>
> There is practice "in the wild" to use foaf as a vocabulary for the
> relationship.  Do we recommend this, or leave it free?  Do we specify that
> whatever relationship is used is a subProperty of foaf:isPrimaryTopicOf?
>
> And finally, there is probably no established best practice for providing
> discovery of available bindings - and we should flag this as something that
> should be addressed - a missing BP against requirements
>
> There is evidence it's at least feasible while conforming to the vocabulary
> reuse BP - for example a graph based mainly on VoiD can be made available as
> an extra representation using the IANA "alternates" relationship, c.f. in
> the SIRF project  (
> http://environment.data.gov.au/water/id/catchment/100862?_view=alternates&_format=html -
> notwithstanding that the resources are woefully maintained now :-( Very sad
> as there was even a link checker that exploited this view available! )
>
> Rob
>
> On Fri, 2 Sep 2016 at 07:42 Krzysztof Janowicz <janowicz@ucsb.edu> wrote:
>
>>
>> Hi,
>>
>>
>> So as representations, these are not “owl:sameAs”.
>>
>>
>>
>> Just for clarification. owl:sameAs is only concerned with the mapping of
>> IRIs to (real world) entities and not 'representations' (leaving aside the
>> fact that everything is a representation in some sense). I.e., it is about
>> 'identity'. To give an extreme example, a URI may refer to the Eddystone
>> Lighthouse which may be classified as /Lighthouse/ in some repository.
>> Another URI established 50 years from now can still refer to this
>> particular (4th) lighthouse and classify it as a /Ruin/. Another 50 years
>> into the future, there may be yet another URI that refers to the fact that
>> at some stage there was a ruin here of the 4th lighthouse called Eddystone
>> while there is nothing physical left of it, and, thus, it is neither
>> classified as /Ruin/ nor /Lighthouse/. In fact, we do not even need to
>> introduce the concept of "real world" here as we can also establish a
>> sameAs relation between two URIs that point to Zeus. Please note that this
>> is different from establishing a sameAs link between a particular statue of
>> Zeus in a particular museum and Zeus as the god of thunder. Finally, the
>> purpose of establishing sameAs links is typically data fusion/conflation
>> (no matter whether this is done ad-hoc, manually, or (offline)
>> computationally).
>>
>> Best,
>> Jano
>>
>>
>>
>> On 08/31/2016 06:38 AM, Joshua Lieberman wrote:
>>
>> Jeremy,
>>
>> So as representations, these are not “owl:sameAs”. We assume that as
>> feature data, each refers to a real world entity, but we don’t assert that
>> this VerticalObstruction is the same individual as this
>> MaritimeNavigationAid. We just are suspecting or asserting that the same
>> real world thing is being discerned in two different ways. Someone may
>> define a lighthouse class as subclassing both, otherwise a slightly
>> specialized relation (e.g. sdwgeo:sameRealWorldEntityAs) would be useful
>> here.
>>
>> Josh
>>
>> On Aug 31, 2016, at 8:41 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> > That still leaves a gap in expressing whether two feature data
>> > entities represent the same real world entity. Perhaps we need a
>> > "sameFeatureAs" predicate to address this.
>>
>> @josh - can we clarify my understanding please?
>>
>> In the BP doc §4 "Spatial things, features and geometry" [1] I use a
>> lighthouse example, so I'll continue with that ...
>>
>> We have one real lighthouse (Eddystone Lighthouse) that is discerned as a
>> different Type by different communities: "VerticalObstruction" and
>> "MaritimeNavigationAid". In ISO 19100 parlance, these are two distinct
>> feature types. The two "Features" might be encoded in GML as follows
>> (forgive any errors in my illustrative example):
>>
>> <VerticalObstruction gml:id="a">
>>     <gml:name>Eddystone</gml:name>
>>     <gml:identifier codeSpace="http://example.com/sar/features/vo/">EDY</gml:identifier>
>>     <geometry>
>>         <gml:Point gml:id="a-p1" srsDimension="2" srsName="EPSG:4326">
>>             <gml:pos>50.184 -4.268</gml:pos>
>>         </gml:Point>
>>     </geometry>
>>     <height uom="m">41</height>
>> </VerticalObstruction>
>>
>> <MaritimeNavigationAid gml:id="b">
>>     <gml:name>Eddystone Lighthouse</gml:name>
>>     <gml:identifier codeSpace="http://example.org/maritime/navaid/">2650253</gml:identifier>
>>     <geo>
>>         <gml:Point gml:id="b-p1" srsDimension="2" srsName="EPSG:4326">
>>             <gml:pos>50.2 -4.3</gml:pos>
>>         </gml:Point>
>>     </geo>
>>     <lightCharacteristic>
>>         ...
>>     </lightCharacteristic>
>> </MaritimeNavigationAid>
>>
>> So we have two Features (which we collectively have agreed are "spatial
>> things"), with identifiers <http://example.com/sar/features/vo/EDY> and
>> <http://example.org/maritime/navaid/2650253>. Respectively, the XML
>> elements that describe these features are identified as "a" and "b" using
>> the @gml:id attribute.
>>
>> If we are using "indirect identification" then _both_
>> <http://example.com/sar/features/vo/EDY> and
>> <http://example.org/maritime/navaid/2650253> are treated as identifiers
>> for the _real_ Eddystone Lighthouse; we simply don't care to differentiate
>> between the real world thing and the information record. In which case,
>> <owl:sameAs> would seem sufficient? The "height" and "lightCharacteristic"
>> properties are both applicable to the real Eddystone Lighthouse. Some
>> judgement would be required to decide which point geometry ("geo" or
>> "geometry" property) is considered "best".
>>
>> The way I think about it, @gml:id is more like the identifier for a named
>> graph; a container for a set of properties ...
>>
>> Am I missing something???
>>
>> Jeremy
>>
>>
>> [1]: http://w3c.github.io/sdw/bp/#spatial-things-features-and-geometry
>>
>> On Wed, 31 Aug 2016 at 12:42 Joshua Lieberman <
>> jlieberman@tumblingwalls.com> wrote:
>>
>>> If we are asserting that spatial data on the Web is "always" feature
>>> data that represents a real world entity, then yes, we don't have the
>>> general Web "is it or isn't it physical" ambiguity and can assume that a
>>> feature data identifier also and indirectly identifies the feature. That
>>> still leaves a gap in expressing whether two feature data entities
>>> represent the same real world entity. Perhaps we need a "sameFeatureAs"
>>> predicate to address this.
>>>
>>> Josh
>>>
>>> Joshua Lieberman, Ph.D.
>>> Principal, Tumbling Walls Consultancy
>>> Tel/Direct: +1 617-431-6431
>>> jlieberman@tumblingwalls.com
>>>
>>> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>>>
>>> Hello,
>>>
>>> As stated before, I don't think the httpRange-14 problem exists in our
>>> domain of discourse. I think (and hope) that confusion can only occur when
>>> the things that are described are digital things, or things that can be
>>> transmitted over a computer network, like web pages or mail boxes. It seems
>>> to me that spatial things are never that type of thing. Therefore there is
>>> no reason to take precautions against possible confusion.
>>>
>>> That probably means +1.
>>>
>>> Greetings,
>>> Frans
>>>
>>>
>>>
>>> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>
>>>> Thanks Rob & Clemens ...
>>>>
>>>> On Wed, 31 Aug 2016 at 08:30, Clemens Portele <
>>>> portele@interactive-instruments.de> wrote:
>>>>
>>>>> +1
>>>>>
>>>>>
>>>>> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com)
>>>>> wrote:
>>>>>
>>>>> Hi. It would be good to close this issue out & include our collective
>>>>> recommendation in the BP doc working draft.
>>>>>
>>>>> PROPOSAL: SDW working group recommends use of "indirect identifiers"
>>>>> for spatial things
>>>>>
>>>>> ... I'll start the voting.
>>>>>
>>>>> +1
>>>>>
>>>>> Jeremy
>>>>>
>>>>> (BTW, to make sense of the PROPOSAL you'll need to read the email
>>>>> thread)
>>>>>
>>>>> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <
>>>>> l.vandenbrink@geonovum.nl> wrote:
>>>>>
>>>>>> So… do we agree we can recommend indirect identifiers, or do we try
>>>>>> to fix the issue with getting the correct identifier as Rob describes?
>>>>>>
>>>>>>
>>>>>> While waiting for this I’ve updated the issue and the text referring
>>>>>> to the issue in BP6.
>>>>>>
>>>>>>
>>>>>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>>>>>> *Sent:* Wednesday 24 August 2016 13:56
>>>>>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>>>>>> *CC:* SDW WG Public List
>>>>>> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for
>>>>>> spatial things"
>>>>>>
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>>
>>>>>> Agree this is a real concern - people can't be blamed for doing the
>>>>>> obvious, if dumb, thing.
>>>>>>
>>>>>>
>>>>>> I think we should take note of best practice in the HTML world -
>>>>>> which is often to include a citable link to a resource in the rendered
>>>>>> view.  Or a "share" or something similar. We can also put fairly explicit
>>>>>> annotation in machine-readable code - stating that the resource is about
>>>>>> the URI - and even notes saying when citing this resource use the URI....
>>>>>>
>>>>>>
>>>>>> I'd also like to see browsers evolve to offer you the original link
>>>>>> or the redirected one when cutting and pasting - how hard can it be!
>>>>>>
>>>>>>
>>>>>> Maybe we can get Ed to ask around the Google Chrome team for
>>>>>> suggestions on how best to handle this :-)
>>>>>>
>>>>>>
>>>>>> Rob
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Yes, I think so ... And we should do so if we are recommending
>>>>>> "indirect identification".
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>>>>>>
>>>>>> Bill's comments also made me think about some of the classic arguments,
>>>>>> such as that a lake doesn't have a last updated date and isn't 435KB
>>>>>> big. Those are true; however, that kind of metadata generally comes from
>>>>>> the server, i.e. the HTTP layer. That's an oversimplification but the
>>>>>> point is that it is relatively easy to avoid deliberately creating
>>>>>> misleading metadata - metadata about the doc rather than the thing it
>>>>>> describes - and it's also generally easy to avoid looking for that
>>>>>> metadata.
>>>>>>
>>>>>> Is there scope for some BP advice there?
>>>>>>
>>>>>> Phil.
>>>>>>
>>>>>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>>>>>> > Thanks Linda. More clear examples where being "correct" (in terms of
>>>>>> > avoiding uri collisions by using two distinct uris) is making things
>>>>>> > worse because users take the wrong one!
>>>>>> >
>>>>>> > So, as a WG, are we content to recommend this "indirect identification"
>>>>>> > pattern where thing & info resource identifiers are conflated?
>>>>>> >
>>>>>> > Bill has added some good points about how to avoid impacts of uri
>>>>>> > collision - by using the (dataset) metadata to talk about licenses and
>>>>>> > creators for the information ...
>>>>>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink
>>>>>> > <l.vandenbrink@geonovum.nl> wrote:
>>>>>> >
>>>>>> >> Experience from the Netherlands: we have the id/doc pattern in our URI
>>>>>> >> strategy, based on the Cool URIs note [8] and the ISA study on
>>>>>> >> persistent identifiers [9].
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> That being said, same as Bill I also notice data users getting confused
>>>>>> >> and generally using the /doc/ URI as that is the one they can copy from
>>>>>> >> their browser address bar. This is not only casual confusion but also
>>>>>> >> ends up in published information resources.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> You see this, for example, all over the CB-NL, which is a vocabulary for
>>>>>> >> the building sector and contains links to other Dutch standards such as
>>>>>> >> IMGeo, an information model and vocabulary for large scale topography.
>>>>>> >> E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo
>>>>>> >> concepts ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other
>>>>>> >> construction) using their /doc/ URIs. If you click on Pand (which
>>>>>> >> doesn’t have its own landing page in CB-NL so I can’t include the link)
>>>>>> >> you will see it includes the /doc/ URI as the identifier of Pand.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> This is an example where it occurs in vocabularies, but I also see it
>>>>>> >> happen with identifiers for data instances.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> [8]: https://www.w3.org/TR/cooluris/
>>>>>> >>
>>>>>> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>>>>> >>
>>>>>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Linda
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>>>>>> >> *Sent:* Tuesday 23 August 2016 20:57
>>>>>> >> *To:* Bill Roberts
>>>>>> >> *CC:* SDW WG Public List
>>>>>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
>>>>>> >> things"
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such as
>>>>>> >> this based on practical experience. Jeremy
>>>>>> >>
>>>>>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>>>>>> >>
>>>>>> >> ah Jeremy, you are a brave man to poke the sleeping beast of
>>>>>> >> httpRange-14.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> But I'll get my thoughts in early, then I can tune out of the ensuing
>>>>>> >> mail avalanche :-)
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> When publishing Linked Data about places we (at Swirrl) generally do the
>>>>>> >> id/doc fandango, but to be honest I think data users either don't
>>>>>> >> notice, or they get confused by it. In the applications we are working
>>>>>> >> with (and I acknowledge that others may have different applications and
>>>>>> >> different experiences), it wouldn't cause any problems to have a single
>>>>>> >> URI, the 'id' URI if you like. We just don't find a need to say anything
>>>>>> >> about the /doc/ URI. If we were starting again, I'd probably ditch the
>>>>>> >> /doc/ and the 303 and rely on context and a little bit of documentation
>>>>>> >> to make it clear what we mean.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> The place where we find a need to talk about creators and licences and
>>>>>> >> modified dates is in metadata about datasets where a dataset might be a
>>>>>> >> collection of information about a bunch of places - and we treat
>>>>>> >> datasets as an 'information resource'. If someone requests a dataset URI
>>>>>> >> we return a status code of 200 and the dataset metadata as the
>>>>>> >> response. That metadata includes info on where to get all the contents
>>>>>> >> of the dataset if you want that.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> By the way, though it's sensible and consistent, I find that the implied
>>>>>> >> and parallel property stuff makes it more rather than less complicated.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Bill
>>>>>> >>
>>>>>> >>
>>>>>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>>>> >>
>>>>>> >> All-
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Linda has done a great job of consolidating the best practices on the
>>>>>> >> use of identifiers. We have just one [1] now.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Reading through just now, it occurred to me that there's still an open
>>>>>> >> issue about identifier assignment ...
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>>>>>> >> single resource" [2] asserts "Assign distinct URIs to distinct
>>>>>> >> resources" in order to avoid URI collisions [2a] which "often imposes a
>>>>>> >> cost in communication due to the effort required to resolve
>>>>>> >> ambiguities". Discussions from earlier years in UK Gov Linked Data
>>>>>> >> working group (and elsewhere) concluded that the "real world thing" and
>>>>>> >> "information resource that describes the real world thing" are separate
>>>>>> >> resources. I think this is based on a (purist?) view when working with
>>>>>> >> RDF of needing to be totally clear on "what's the subject" of each
>>>>>> >> triple ... the thing or the document. This manifests as URIs with `id`
>>>>>> >> or `doc` included somewhere to distinguish between the resources and
>>>>>> >> some RDF triples to clarify that the doc resource is talking about the
>>>>>> >> thing resource etc.
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear
>>>>>> >> trap)
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>>>>>> >> practice in §5.3 "Publishing data" [4]:
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> ```
>>>>>> >>
>>>>>> >> Publishers can help enable more accurate merging of data from different
>>>>>> >> sites if they support URLs for each entity
>>>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may
>>>>>> >> wish to describe, separate from the landing pages
>>>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>>>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>>>>>> >>
>>>>>> >> ```
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect
>>>>>> >> identification"
>>>>>>
>>>>>>
Received on Thursday, 1 September 2016 22:25:39 UTC
