Re: Clarification required: BP6 "use HTTP URIs for spatial things"

From: Krzysztof Janowicz <janowicz@ucsb.edu>
Date: Thu, 1 Sep 2016 20:05:39 -0700
To: Frans Knibbe <frans.knibbe@geodan.nl>, Simon Cox <Simon.Cox@csiro.au>
Cc: Joshua Lieberman <jlieberman@tumblingwalls.com>, Jeremy Tandy <jeremy.tandy@gmail.com>, SDW WG Public List <public-sdw-wg@w3.org>
Message-ID: <b4e1aa46-3b03-2bb5-bf73-f4650695b79a@ucsb.edu>
> One of which could be that there cannot be confusion between the URI of
> a spatial thing and the URI of data describing that spatial thing.

I think we are making this unnecessarily complicated. Do we have a real
(and relevant) use case explaining why all this really matters? Isn't
what really matters that there is some data repository that contains
statements about a URI, and that said URI points us to some entity
(potentially in the real world)? Also, are we all sure that we are not
simply confusing this with content negotiation, namely the fact that
instead of a set of triples we can also receive a human-readable
description?
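
As a rough illustration of the content negotiation point (not taken from
any implementation discussed in this thread): a minimal Python sketch in
which one URI yields either a set of triples or a human-readable page,
depending on the Accept header. The example URI, the REPRESENTATIONS
table and the function names are all assumptions made up for this
sketch, and the matching is deliberately simplified (ranges such as
text/* are not handled).

```python
# Hypothetical store: one URI, two representations of the same description.
REPRESENTATIONS = {
    "text/turtle": "<http://example.org/id/lake/1> a <http://example.org/def/Lake> .\n",
    "text/html": "<html><body><h1>Lake 1</h1></body></html>\n",
}
DEFAULT_TYPE = "text/html"  # assumption: HTML is served when anything is acceptable


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default q-value when none is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(accept_header):
    """Pick a representation for the Accept header, or None if nothing fits."""
    for media_type in parse_accept(accept_header or "*/*"):
        if media_type in REPRESENTATIONS:
            return media_type, REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return DEFAULT_TYPE, REPRESENTATIONS[DEFAULT_TYPE]
    return None  # a server would answer 406 Not Acceptable here
```

In this sketch the identifier stays the same throughout; only the
returned representation changes, which is a separate question from
whether the thing and its description need distinct URIs.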

Krzysztof




On 09/01/2016 12:39 AM, Frans Knibbe wrote:
>
>
> On 1 September 2016 at 00:56, <Simon.Cox@csiro.au> wrote:
>
>     Now I would say the triangle is a spatial thing (not sure if it
>     counts as a real world entity, but I hope we can leave the idea of
>     'real world' out of definitions anyway).
>
>     This is important. In the OGC world we say that a triangle is not
>     a real world entity, since you can’t touch it.
>
>     You can touch something that has the shape of a triangle, but the
>     triangle itself is an abstract thing.
>
>     In contrast, in the GIS world, everything is subclassed from a
>     geometry (point, line, polygon).
>
>     This has some implementation benefits, in particular spatial
>     indexing because you only have one geometry per feature and they
>     can be sorted into three classes.
>
>
> In another thread I posted the following tentative definitions of the 
> two core concepts that seem to be crucial for a spatial ontology:
>
>  1. spatial things: things that have some kind of spatial presence;
>     can have spatial relationships
>  2. geometry: an ordered set of n-dimensional points; can be used to
>     model the spatial presence of a spatial thing
>
> Whether 'some kind of spatial presence' applies to the triangle in the
> example is debatable. But that is not a bad thing. I would like it if
> a spatial ontology is not too strict. For example, it should allow for
> cases where the spatial thing does not exist in reality, or when the
> space the object occupies is a virtual space, as in the case of the
> triangle. If someone wants to share data about a triangle that only
> exists in an SVG file, or about the fictional island Atlantis, and they
> want to model their data as spatial data, I think that should be
> possible. Trying to build impenetrable semantic walls around
> definitions of spatial data is a battle that will have little gain and
> can never be won, I think.
>
> Still, if people use our spatial ontology to model their data,
> they will have to accept the premises that go with it. One of which
> could be that there cannot be confusion between the URI of a spatial
> thing and the URI of data describing that spatial thing.
>
> Regards,
> Frans
>
>     *From:* Frans Knibbe [mailto:frans.knibbe@geodan.nl]
>     *Sent:* Thursday, 1 September 2016 12:21 AM
>     *To:* Joshua Lieberman <jlieberman@tumblingwalls.com>
>     *Cc:* Jeremy Tandy <jeremy.tandy@gmail.com>; SDW WG Public List
>     <public-sdw-wg@w3.org>
>     *Subject:* Re: Clarification required: BP6 "use HTTP URIs for
>     spatial things"
>
>     On 31 August 2016 at 13:42, Joshua Lieberman
>     <jlieberman@tumblingwalls.com> wrote:
>
>         If we are asserting that spatial data on the Web is "always"
>         feature data that represents a real world entity, then yes, we
>         don't have the general Web "is it or isn't it physical"
>         ambiguity and can assume that a feature data identifier also
>         and indirectly identifies the feature.
>
>     I hope we can broaden that assumption, that the assertion still
>     holds even if we are not talking about feature data representing
>     real world entities.
>
>     Let's look at a border case: I am drawing a triangle in Inkscape
>     and I save it as a *.svg file. I publish the file on the web, so
>     it has a URI. Now I would say the triangle is a spatial thing (not
>     sure if it counts as a real world entity, but I hope we can leave
>     the idea of 'real world' out of definitions anyway). The SVG
>     object in the file is the geometry describing the spatial thing. I
>     think that only if we understand the SVG file to be the spatial
>     thing we get into trouble. I might want to state that the file has
>     a certain size and that the triangle has a certain area. It would
>     be funny if I used the same URI for both statements. So I would
>     need to have a different URI for my triangle. Could that be all?
>
>         That still leaves a gap in expressing whether two feature data
>         entities represent the same real world entity. Perhaps we need
>         a "sameFeatureAs" predicate to address this.
>
>     Yes, that is what the Subject equality requirement
>     <http://w3c.github.io/sdw/UseCases/SDWUseCasesAndRequirements.html#SubjectEquality>
>     is about. So the BP document is expected to say something about
>     that.
>
>     Regards,
>
>     Frans
>
>         Josh
>
>         Joshua Lieberman, Ph.D.
>
>         Principal, Tumbling Walls Consultancy
>
>         Tel/Direct: +1 617-431-6431
>
>         jlieberman@tumblingwalls.com
>
>
>         On Aug 31, 2016, at 07:29, Frans Knibbe
>         <frans.knibbe@geodan.nl> wrote:
>
>             Hello,
>
>             As stated before, I don't think the httpRange-14 problem
>             exists in our domain of discourse. I think (and hope) that
>             confusion can only occur when the things that are
>             described are digital things, or things that can be
>             transmitted over a computer network, like web pages or
>             mail boxes. It seems to me that spatial things are never
>             that type of thing. Therefore there is no reason to take
>             precautions against possible confusion.
>
>             That probably means +1.
>
>             Greetings,
>
>             Frans
>
>             On 31 August 2016 at 09:50, Jeremy Tandy
>             <jeremy.tandy@gmail.com> wrote:
>
>                 Thanks Rob & Clemens ...
>
>                 On Wed, 31 Aug 2016 at 08:30, Clemens Portele
>                 <portele@interactive-instruments.de> wrote:
>
>                     +1
>
>                     On 30 August 2016 at 10:10:26, Jeremy Tandy
>                     (jeremy.tandy@gmail.com) wrote:
>
>                         Hi. It would be good to close this issue out &
>                         include our collective recommendation in the
>                         BP doc working draft.
>
>                         PROPOSAL: SDW working group recommends use of
>                         "indirect identifiers" for spatial things
>
>                         ... I'll start the voting.
>
>                         +1
>
>                         Jeremy
>
>                         (BTW, to make sense of the PROPOSAL you'll
>                         need to read the email thread)
>
>                         On Fri, 26 Aug 2016 at 10:12 Linda van den
>                         Brink <l.vandenbrink@geonovum.nl> wrote:
>
>                             So… do we agree we can recommend indirect
>                             identifiers, or do we try to fix the issue
>                             with getting the correct identifier as Rob
>                             describes?
>
>                             While waiting for this I’ve updated the
>                             issue and the text referring to the issue
>                             in BP6.
>
>                             *From:* Rob Atkinson
>                             [mailto:rob@metalinkage.com.au]
>                             *Sent:* Wednesday, 24 August 2016 13:56
>                             *To:* Jeremy Tandy; Phil Archer; Linda
>                             van den Brink; Bill Roberts
>                             *CC:* SDW WG Public List
>                             *Subject:* Re: Clarification required:
>                             BP6 "use HTTP URIs for spatial things"
>
>                             Hi
>
>                             Agree this is a real concern - people can't
>                             be blamed for doing the obvious, if dumb,
>                             thing.
>
>                             I think we should take note of best
>                             practice in the HTML world - which is
>                             often to include a citable link to a
>                             resource in the rendered view, or a
>                             "share" or something similar. We can also
>                             put fairly explicit annotation in
>                             machine-readable code - stating that the
>                             resource is about the URI - and even notes
>                             saying "when citing this resource, use
>                             the URI ...".
>
>                             I'd also like to see browsers evolve to
>                             offer you the original link or the
>                             redirected one when cutting and pasting -
>                             how hard can it be!
>
>                             Maybe we can get Ed to ask around Google
>                             Chrome team for suggestions on how best to
>                             handle this :-)
>
>                             Rob
>
>                             On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy
>                             <jeremy.tandy@gmail.com> wrote:
>
>                                 Yes, I think so ... And we should do
>                                 so if we are recommending "indirect
>                                 identification".
>
>                                 Jeremy
>
>                                 On Wed, 24 Aug 2016 at 09:24, Phil
>                                 Archer <phila@w3.org> wrote:
>
>                                     Bill's comments also made me think about some of the classic arguments,
>                                     such as that a lake doesn't have a last updated date and isn't 435KB
>                                     big. Those are true; however, that kind of metadata generally comes from
>                                     the server, i.e. the HTTP layer. That's an oversimplification but the
>                                     point is that it is relatively easy to avoid deliberately creating
>                                     misleading metadata - metadata about the doc rather than the thing it
>                                     describes - and it's also generally easy to avoid looking for that
>                                     metadata.
>
>                                     Is there scope for some BP advice
>                                     there?
>
>                                     Phil.
>
>                                     On 24/08/2016 08:25, Jeremy Tandy wrote:
>                                     > Thanks Linda. More clear examples where being "correct" (in terms of
>                                     > avoiding uri collisions by using two distinct uris) is making things worse
>                                     > because users take the wrong one!
>                                     >
>                                     > So, as a WG, are we content to recommend this "indirect identification"
>                                     > pattern where thing & info resource identifiers are conflated?
>                                     >
>                                     > Bill has added some good points about how to avoid impacts of uri
>                                     > collision - by using the (dataset) metadata to talk about licenses and
>                                     > creators for the information ...
>                                     > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink <l.vandenbrink@geonovum.nl>
>                                     > wrote:
>                                     >
>                                     >> Experience from the Netherlands: we have the id/doc pattern in our URI
>                                     >> strategy, based on the Cool URIs note [8] and the ISA study on persistent
>                                     >> identifiers [9].
>                                     >>
>                                     >> That being said, same as Bill I also notice data users getting confused
>                                     >> and generally using the /doc/ URI as that is the one they can copy from
>                                     >> their browser address bar. This is not only casual confusion but also ends
>                                     >> up in published information resources.
>                                     >>
>                                     >> You see this, for example, all over the CB-NL which is a vocabulary for
>                                     >> the building sector and contains links to other Dutch standards such as
>                                     >> IMGeo, an information model and vocabulary for large scale topography. E.g.
>                                     >> the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo concepts
>                                     >> ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other construction) using
>                                     >> their /doc/ URIs. If you click on Pand (which doesn’t have its own landing
>                                     >> page in CB-NL so I can’t include the link) you will see it includes the
>                                     >> /doc/ URI as the identifier of Pand.
>                                     >>
>                                     >> This is an example where it occurs in vocabularies, but I also see it
>                                     >> happen with identifiers for data instances.
>                                     >>
>                                     >> [8]: https://www.w3.org/TR/cooluris/
>                                     >>
>                                     >> [9]:
>                                     >> https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>                                     >>
>                                     >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>                                     >>
>                                     >> Linda
>                                     >>
>                                     >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>                                     >> *Sent:* Tuesday, 23 August 2016 20:57
>                                     >> *To:* Bill Roberts
>                                     >> *CC:* SDW WG Public List
>                                     >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
>                                     >> things"
>                                     >>
>                                     >> Thanks Bill. Sounds very coherent ... I hoped for some responses such as
>                                     >> this based on practical experience. Jeremy
>                                     >>
>                                     >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>                                     >>
>                                     >> ah Jeremy, you are a brave man to poke the sleeping beast of httpRange-14.
>                                     >>
>                                     >> But I'll get my thoughts in early, then I can tune out of the ensuing mail
>                                     >> avalanche :-)
>                                     >>
>                                     >> When publishing Linked Data about places we (at Swirrl) generally do the
>                                     >> id/doc fandango, but to be honest I think data users either don't notice,
>                                     >> or they get confused by it. In the applications we are working with (and I
>                                     >> acknowledge that others may have different applications and different
>                                     >> experiences), it wouldn't cause any problems to have a single URI, the 'id'
>                                     >> URI if you like. We just don't find a need to say anything about the /doc/
>                                     >> URI. If we were starting again, I'd probably ditch the /doc/ and the 303
>                                     >> and rely on context and a little bit of documentation to make it clear what
>                                     >> we mean.
>                                     >>
>                                     >> The place where we find a need to talk about creators and licences and
>                                     >> modified dates is in metadata about datasets where a dataset might be a
>                                     >> collection of information about a bunch of places - and we treat datasets
>                                     >> as an 'information resource'. If someone requests a dataset URI we return a
>                                     >> status code of 200 and the dataset metadata as the response. That metadata
>                                     >> includes info on where to get all the contents of the dataset if you want
>                                     >> that.
>                                     >>
>                                     >> By the way, though it's sensible and consistent, I find that the implied
>                                     >> and parallel property stuff makes it more rather than less complicated.
>                                     >>
>                                     >> Bill
>                                     >>
>                                     >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>                                     >>
>                                     >> All-
>                                     >>
>                                     >> Linda has done a great job of consolidating the best practices around use of
>                                     >> identifiers. We have just one [1] now.
>                                     >>
>                                     >> Reading through just now, it occurred to me that there's still an open
>                                     >> issue about identifier assignment ...
>                                     >>
>                                     >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>                                     >> single resource" [2] asserts "Assign distinct URIs to distinct resources"
>                                     >> in order to avoid URI collisions [2a] which "often imposes a cost in
>                                     >> communication due to the effort required to resolve ambiguities".
>                                     >> Discussions from earlier years in UK Gov Linked Data working group (and
>                                     >> elsewhere) concluded that the "real world thing" and "information resource
>                                     >> that describes the real world thing" are separate resources. I think this
>                                     >> is based on a (purist?) view when working with RDF of needing to be totally
>                                     >> clear on "what's the subject" of each triple ... the thing or the document.
>                                     >> This manifests as URIs with `id` or `doc` included somewhere to distinguish
>                                     >> between the resources and some RDF triples to clarify that the doc resource
>                                     >> is talking about the thing resource etc.
>                                     >>
>                                     >> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear
>                                     >> trap)
>                                     >>
>                                     >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>                                     >> practice in §5.3 "Publishing data" [4]:
>                                     >>
>                                     >> ```
>                                     >>
>                                     >> Publishers can help enable more accurate merging of data from different
>                                     >> sites if they support URLs for each entity
>                                     >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may
>                                     >> wish to describe, separate from the landing pages
>                                     >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>                                     >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>                                     >>
>                                     >> ```
>                                     >>
>                                     >> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification"
>                                     >> [5] notes that:
>                                     >>
>                                     >> ```
>                                     >>
>                                     >> To say that the URI "mailto:nadia@example.com" identifies both an
>                                     >> Internet mailbox and Nadia, the person, introduces a URI collision.
>                                     >> However, we can use the URI to indirectly identify Nadia. Identifiers are
>                                     >> commonly used in this way.
>                                     >>
>                                     >> ```
>                                     >>
>                                     >> This is consistent with what I
>                                     recall TimBL saying at TPAC-2015
>                                     in regards
>                                     >> to Vcard; come the finish, no
>                                     one really cares to distinguish
>                                     between the
>                                     >> thing and its associated
>                                     information resource.
>                                     >>
>                                     >>
>                                     >>
>                                     >> ... And in most cases, one can
>                                     use context to determine whether a
>                                     >> statement concerns the thing or
>                                     the information resource. In those
>                                     cases
>                                     >> where you can't, "URLs in Data
>                                     Primer" suggests some mechanisms
>                                     to mitigate
>                                     >> such confusion [6][7].
>                                     >>
>                                     >>
>                                     >>
>                                     >> I think that in our SDW WG
>                                     discussion we have concluded that
>                                     we _are_
>                                     >> content to use "indirect
>                                     identification" - e.g. that we use
>                                     URIs that
>                                     >> conflate the thing and document
>                                     resource.
>                                     >>
>                                     >>
>                                     >>
>                                     >> Please can we confirm this?
>                                     Assuming that indirect
>                                     identification is
>                                     >> "approved" as best practice,
>                                     then it seems prudent to add a
>                                     note to the BP
>                                     >> document saying "don't worry
>                                     about distinguishing between thing and
>                                     >> resource; indirect
>                                     identification is fine" (etc.)
>                                     >>
>                                     >>
>                                     >>
>                                     >> Thanks, Jeremy
>                                     >>
>                                     >>
>                                     >>
>                                     >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>                                     >>
>                                     >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>                                     >>
>                                     >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>                                     >>
>                                     >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>                                     >>
>                                     >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>                                     >>
>                                     >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>                                     >>
>                                     >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>                                     >>
>                                     >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>                                     >>
>                                     >>
>                                     >>
>                                     >>
>                                     >
>
>                                     --
>
>
>                                     Phil Archer
>                                     W3C Data Activity Lead
>                                     http://www.w3.org/2013/data/
>
>                                     http://philarcher.org
>                                     +44 (0)7887 767755
>                                     @philarcher1
>
>


-- 
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net
Received on Friday, 2 September 2016 03:06:14 UTC
