Re: Clarification required: BP6 "use HTTP URIs for spatial things"

From: Krzysztof Janowicz <janowicz@ucsb.edu>
Date: Wed, 31 Aug 2016 10:13:06 -0700
To: Jeremy Tandy <jeremy.tandy@gmail.com>, Joshua Lieberman <jlieberman@tumblingwalls.com>, Frans Knibbe <frans.knibbe@geodan.nl>
Cc: SDW WG Public List <public-sdw-wg@w3.org>
Message-ID: <444ce21e-cc4e-2ffa-b240-c959794ab9b2@ucsb.edu>
Thanks for the clarification.
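
To make the owl:sameAs point in the quoted message below concrete, here is a minimal sketch, assuming plain Python tuples stand in for RDF triples. The two URIs are the ones from the thread; the predicates and values are hypothetical placeholders, and the merge is a simple union-find rather than anything a real reasoner is claimed to do.

```python
# Illustrative sketch only. The two subject URIs come from the thread;
# every predicate URI and value below is a made-up placeholder.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

VO = "http://example.com/sar/features/vo/EDY"          # vertical obstruction view
NAVAID = "http://example.org/maritime/navaid/2650253"  # navigation aid view

triples = [
    (VO, "http://example.com/def/placeholderA", "value-a"),      # hypothetical
    (NAVAID, "http://example.org/def/placeholderB", "value-b"),  # hypothetical
    (VO, OWL_SAME_AS, NAVAID),
]


def merge_same_as(triples):
    """Group subjects linked by owl:sameAs and pool their statements."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: join the identifiers that are declared equal.
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)

    # Second pass: pool the remaining statements under one representative.
    merged = {}
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            merged.setdefault(find(s), set()).add((p, o))
    return merged


merged = merge_same_as(triples)
```

Both statements end up attached to one entity, without defining a class whose members are both vertical obstructions and navigation aids.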

On 08/31/2016 09:22 AM, Jeremy Tandy wrote:
> @josh
>
> > Is the triangle spatial data or a graphic with drawing instructions 
> > that assumes a certain technology? If 100 people print out the SVG, 
> > there is really nothing to indicate that the underlying entity is the 
> > same on each piece of paper, just that the same instructions were 
> > used, unless we want to get into trademark issues.
>
> This seems to be getting away from the main topic. Unless you object, 
> can I pull us back?
>
> @Krzysztof
>
> Apologies if my terminology is confusing.
>
> I was trying to say that <owl:sameAs> indicates that two identifiers 
> (URIs, in this case <http://example.com/sar/features/vo/EDY> and 
> <http://example.org/maritime/navaid/2650253>) refer to the same entity 
> (Eddystone Lighthouse). You said it much better than me.
>
> The term "representation" was drawn from @josh's email text; in which 
> he meant "Eddystone Lighthouse seen as a vertical obstruction" and 
> "Eddystone Lighthouse seen as a maritime navigation aid".
>
> > there is no need for such a class [whose members are both vertical 
> > obstructions and maritime navigation aids] (which you can define if 
> > you really want to, but it could lead to a combinatorial explosion)
>
> I agree. This is what I've seen with Linked Data implementations - 
> which means that "sameRealWorldEntityAs" is not required.
>
> Hmmm. I hope I'm not confusing myself and everyone else.
>
> Jeremy
>
> On Wed, 31 Aug 2016 at 17:02 Joshua Lieberman 
> <jlieberman@tumblingwalls.com> wrote:
>
>>     On Aug 31, 2016, at 10:20 AM, Frans Knibbe
>>     <frans.knibbe@geodan.nl> wrote:
>>
>>
>>
>>     On 31 August 2016 at 13:42, Joshua Lieberman
>>     <jlieberman@tumblingwalls.com> wrote:
>>
>>         If we are asserting that spatial data on the Web is "always"
>>         feature data that represents a real world entity, then yes,
>>         we don't have the general Web "is it or isn't it physical"
>>         ambiguity and can assume that a feature data identifier also
>>         and indirectly identifies the feature.
>>
>>
>>     I hope we can broaden that assumption, that the assertion still
>>     holds even if we are not talking about feature data representing
>>     real world entities.
>>
>>     Let's look at a border case: I am drawing a triangle in Inkscape
>>     and I save it as a *.svg file. I publish the file on the web, so
>>     it has a URI. Now I would say the triangle is a spatial thing
>>     (not sure if it counts as a real world entity, but I hope we can
>>     leave the idea of 'real world' out of definitions anyway). The
>>     SVG object in the file is the geometry describing the spatial
>>     thing. I think that only if we understand the SVG file to be the
>>     spatial thing we get into trouble. I might want to state that the
>>     file has a certain size and that the triangle has a certain area.
>>     It would be funny if I used the same URI for both statements. So
>>     I would need to have a different URI for my triangle. Could that
>>     be all?
>
>     Is the triangle spatial data or a graphic with drawing
>     instructions that assumes a certain technology? If 100 people
>     print out the SVG, there is really nothing to indicate that the
>     underlying entity is the same on each piece of paper, just that
>     the same instructions were used, unless we want to get into
>     trademark issues.
>
>>         That still leaves a gap in expressing whether two feature
>>         data entities represent the same real world entity. Perhaps
>>         we need a "sameFeatureAs" predicate to address this.
>>
>>
>>     Yes, that is what the Subject equality
>>     <http://w3c.github.io/sdw/UseCases/SDWUseCasesAndRequirements.html#SubjectEquality>
>>     requirement is about. So the BP document is expected to say
>>     something about that.
>>
>>     Regards,
>>     Frans
>>
>>
>>         Josh
>>
>>         Joshua Lieberman, Ph.D.
>>         Principal, Tumbling Walls Consultancy
>>         Tel/Direct: +1 617-431-6431
>>         jlieberman@tumblingwalls.com
>>
>>         On Aug 31, 2016, at 07:29, Frans Knibbe
>>         <frans.knibbe@geodan.nl> wrote:
>>
>>>         Hello,
>>>
>>>         As stated before, I don't think the httpRange-14 problem
>>>         exists in our domain of discourse. I think (and hope) that
>>>         confusion can only occur when the things that are described
>>>         are digital things, or things that can be transmitted over a
>>>         computer network, like web pages or mail boxes. It seems to
>>>         me that spatial things are never that type of thing.
>>>         Therefore there is no reason to take precautions against
>>>         possible confusion.
>>>
>>>         That probably means +1.
>>>
>>>         Greetings,
>>>         Frans
>>>
>>>
>>>         On 31 August 2016 at 09:50, Jeremy Tandy
>>>         <jeremy.tandy@gmail.com> wrote:
>>>
>>>             Thanks Rob & Clemens ...
>>>
>>>             On Wed, 31 Aug 2016 at 08:30, Clemens Portele
>>>             <portele@interactive-instruments.de> wrote:
>>>
>>>                 +1
>>>
>>>
>>>                 On 30 August 2016 at 10:10:26, Jeremy Tandy
>>>                 (jeremy.tandy@gmail.com) wrote:
>>>
>>>>                 Hi. It would be good to close this issue out &
>>>>                 include our collective recommendation in the BP doc
>>>>                 working draft.
>>>>
>>>>                 PROPOSAL: SDW working group recommends use of
>>>>                 "indirect identifiers" for spatial things
>>>>
>>>>                 ... I'll start the voting.
>>>>
>>>>                 +1
>>>>
>>>>                 Jeremy
>>>>
>>>>                 (BTW, to make sense of the PROPOSAL you'll need to
>>>>                 read the email thread)
>>>>
>>>>                 On Fri, 26 Aug 2016 at 10:12 Linda van den Brink
>>>>                 <l.vandenbrink@geonovum.nl> wrote:
>>>>
>>>>                     So… do we agree we can recommend indirect
>>>>                     identifiers, or do we try to fix the issue with
>>>>                     getting the correct identifier as Rob describes?
>>>>
>>>>
>>>>                     While waiting for this I’ve updated the issue
>>>>                     and the text referring to the issue in BP6.
>>>>
>>>>
>>>>                     *From:* Rob Atkinson
>>>>                     [mailto:rob@metalinkage.com.au]
>>>>                     *Sent:* Wednesday 24 August 2016 13:56
>>>>                     *To:* Jeremy Tandy; Phil Archer; Linda van den
>>>>                     Brink; Bill Roberts
>>>>
>>>>
>>>>                     *CC:* SDW WG Public List
>>>>
>>>>                     *Subject:* Re: Clarification required: BP6
>>>>                     "use HTTP URIs for spatial things"
>>>>
>>>>
>>>>                     Hi
>>>>
>>>>
>>>>                     Agree this is a real concern - people can't be
>>>>                     blamed for doing the obvious, if dumb, thing.
>>>>
>>>>
>>>>                     I think we should take note of best practice in
>>>>                     the HTML world - which is often to include a
>>>>                     citable link to a resource in the rendered
>>>>                     view.  Or a "share" or something similar. We
>>>>                     can also put fairly explicit annotation in
>>>>                     machine-readable code - stating that the
>>>>                     resource is about the URI - and even notes
>>>>                     saying when citing this resource use the URI....
>>>>
>>>>
>>>>                     I'd also like to see browsers evolve to offer
>>>>                     you the original link or the redirected one when
>>>>                     cutting and pasting - how hard can it be!
>>>>
>>>>
>>>>                     Maybe we can get Ed to ask around Google Chrome
>>>>                     team for suggestions on how best to handle this :-)
>>>>
>>>>
>>>>                     Rob
>>>>
>>>>
>>>>
>>>>
>>>>                     On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy
>>>>                     <jeremy.tandy@gmail.com> wrote:
>>>>
>>>>                         Yes, I think so ... And we should do so if
>>>>                         we are recommending "indirect identification".
>>>>
>>>>                         Jeremy
>>>>
>>>>                         On Wed, 24 Aug 2016 at 09:24, Phil Archer
>>>>                         <phila@w3.org> wrote:
>>>>
>>>>                             Bill's comments also made me think
>>>>                             about some of the classic arguments,
>>>>                             such as that a lake doesn't have a last
>>>>                             updated date and isn't 435KB
>>>>                             big. Both are true; however, that kind
>>>>                             of metadata generally comes from
>>>>                             the server, i.e. the HTTP layer. That's
>>>>                             an oversimplification, but the
>>>>                             point is that it is relatively easy to
>>>>                             avoid deliberately creating
>>>>                             misleading metadata - metadata about
>>>>                             the doc rather than the thing it
>>>>                             describes - and it's also generally
>>>>                             easy to avoid looking for that metadata.
>>>>
>>>>                             Is there scope for some BP advice there?
>>>>
>>>>                             Phil.
>>>>
>>>>                             On 24/08/2016 08:25, Jeremy Tandy wrote:
>>>>                             > Thanks Linda. More clear examples
>>>>                             where being "correct" (in terms of
>>>>                             > avoiding uri collisions by using two
>>>>                             distinct uris) is making things worse
>>>>                             > because users take the wrong one!
>>>>                             >
>>>>                             > So, as a WG, are we content to
>>>>                             recommend this "indirect identification"
>>>>                             > pattern where thing & info resource
>>>>                             identifiers are conflated?
>>>>                             >
>>>>                             > Bill has added some good points about
>>>>                             how to avoid impacts of uri
>>>>                             > collision- by using the (dataset)
>>>>                             metadata to talk about licenses and
>>>>                             > creators for the information ...
>>>>                             > On Wed, 24 Aug 2016 at 07:52, Linda
>>>>                             van den Brink
>>>>                             <l.vandenbrink@geonovum.nl>
>>>>                             > wrote:
>>>>                             >
>>>>                             >> Experience from the Netherlands: we
>>>>                             have the id/doc pattern in our URI
>>>>                             >> strategy, based on the Cool URIs
>>>>                             note [8] and the ISA study on persistent
>>>>                             >> identifiers [9].
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> That being said, same as Bill I also
>>>>                             notice data users getting confused
>>>>                             >> and generally using the /doc/  URI
>>>>                             as that is the one they can copy from
>>>>                             >> their browser address bar. This is
>>>>                             not only casual confusion but also ends
>>>>                             >> up in published information resources.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> You see this, for example, all over
>>>>                             the CB-NL which is a vocabulary for
>>>>                             >> the building sector and contains
>>>>                             links to other Dutch standards such as
>>>>                             >> IMGeo, an information model and
>>>>                             vocabulary for large scale topography. E.g.
>>>>                             >> the CB-NL concept of ‘Gebouw’
>>>>                             (Building) [10]  links to two IMGeo
>>>>                             concepts
>>>>                             >> ‘Pand’ (building part) and ‘Overig
>>>>                             Bouwwerk’ (other construction) using
>>>>                             >> their /doc/ URIs. If you click on
>>>>                             Pand (which doesn’t have its own landing
>>>>                             >> page in CB-NL so I can’t include the
>>>>                             link) you will see it includes the
>>>>                             >> /doc/  URI as the identifier of Pand.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> This is an example where it occurs
>>>>                             in vocabularies, but I also see it
>>>>                             >> happen with identifiers for data
>>>>                             instances.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> [8]: https://www.w3.org/TR/cooluris/
>>>>                             >>
>>>>                             >> [9]:
>>>>                             >>
>>>>                             https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>>>                             >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Linda
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> *From:* Jeremy Tandy
>>>>                             [mailto:jeremy.tandy@gmail.com]
>>>>                             >> *Sent:* Tuesday 23 August
>>>>                             2016 20:57
>>>>                             >> *To:* Bill Roberts
>>>>                             >> *CC:* SDW WG Public List
>>>>                             >> *Subject:* Re: Clarification
>>>>                             required: BP6 "use HTTP URIs for spatial
>>>>                             >> things"
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Thanks Bill. Sounds very coherent
>>>>                             ... I hoped for some responses such as
>>>>                             >> this based on practical experience.
>>>>                             Jeremy
>>>>                             >>
>>>>                             >> On Tue, 23 Aug 2016 at 19:41, Bill
>>>>                             Roberts <bill@swirrl.com> wrote:
>>>>                             >>
>>>>                             >> ah Jeremy, you are a brave man to
>>>>                             poke the sleeping beast of httpRange-14.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> But I'll get my thoughts in early,
>>>>                             then I can tune out of the ensuing mail
>>>>                             >> avalanche :-)
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> When publishing Linked Data about
>>>>                             places we (at Swirrl) generally do the
>>>>                             >> id/doc fandango, but to be honest I
>>>>                             think data users either don't notice,
>>>>                             >> or they get confused by it.  In the
>>>>                             applications we are working with (and I
>>>>                             >> acknowledge that others may have
>>>>                             different applications and different
>>>>                             >> experiences), it wouldn't cause any
>>>>                             problems to have a single URI, the 'id'
>>>>                             >> URI if you like. We just don't find
>>>>                             a need to say anything about the /doc/
>>>>                             >> URI. If we were starting again, I'd
>>>>                             probably ditch the /doc/ and the 303
>>>>                             >> and rely on context and a little bit
>>>>                             of documentation to make it clear what
>>>>                             >> we mean.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> The place where we find a need to
>>>>                             talk about creators and licences and
>>>>                             >> modified dates is in metadata about
>>>>                             datasets where a dataset might be a
>>>>                             >> collection of information about a
>>>>                             bunch of places - and we treat datasets
>>>>                             >> as an 'information resource'. If
>>>>                             someone requests a dataset URI we return a
>>>>                             >> status code of 200 and the dataset
>>>>                             metadata as the response. That metadata
>>>>                             >> includes info on where to get all
>>>>                             the contents of the dataset if you want
>>>>                             >> that.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> By the way, though it's sensible and
>>>>                             consistent, I find that the implied
>>>>                             >> and parallel property stuff makes it
>>>>                             more rather than less complicated.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Bill
>>>>                             >>
>>>>                             >> On 23 August 2016 at 17:37, Jeremy
>>>>                             Tandy <jeremy.tandy@gmail.com> wrote:
>>>>                             >>
>>>>                             >> All-
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Linda has done a great job of
>>>>                             consolidating the best practices around use of
>>>>                             >> identifiers. We have just one [1] now.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Reading through just now, it occurred
>>>>                             to me that there's still an open
>>>>                             >> issue about identifier assignment ...
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> W3C's Architecture of the World Wide
>>>>                             Web constraint "URIs identify a
>>>>                             >> single resource" [2] asserts "Assign
>>>>                             distinct URIs to distinct resources"
>>>>                             >> in order to avoid URI collisions
>>>>                             [2a] which "often imposes a cost in
>>>>                             >> communication due to the effort
>>>>                             required to resolve ambiguities".
>>>>                             >> Discussions from earlier years in UK
>>>>                             Gov Linked Data working group (and
>>>>                             >> elsewhere) concluded that the "real
>>>>                             world thing" and "information resource
>>>>                             >> that describes the real world thing"
>>>>                             are separate resources. I think this
>>>>                             >> is based on a (purist?) view when
>>>>                             working with RDF of needing to be totally
>>>>                             >> clear on "what's the subject" of
>>>>                             each triple ... the thing or the document.
>>>>                             >> This manifests as URIs with `id` or
>>>>                             `doc` included somewhere to distinguish
>>>>                             >> between the resources and some RDF
>>>>                             triples to clarify that the doc resource
>>>>                             >> is talking about the thing resource
>>>>                             etc.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> (dangerously close to "httpRange-14"
>>>>                             [3] here ... let's avoid that bear
>>>>                             >> trap)
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Jeni Tennison's "URLs in Data
>>>>                             Primer" draft TAG note captures this
>>>>                             >> practice in §5.3 "Publishing data" [4]:
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> ```
>>>>                             >>
>>>>                             >> Publishers can help enable more
>>>>                             accurate merging of data from different
>>>>                             >> sites if they support URLs for each
>>>>                             entity
>>>>                             >>
>>>>                             <https://www.w3.org/TR/urls-in-data/#dfn-entity>
>>>>                             they or other sites may
>>>>                             >> wish to describe, separate from the
>>>>                             landing pages
>>>>                             >>
>>>>                             <https://www.w3.org/TR/urls-in-data/#dfn-landing-page>
>>>>                             or records
>>>>                             >>
>>>>                             <https://www.w3.org/TR/urls-in-data/#dfn-record>
>>>>                             that they publish.
>>>>                             >>
>>>>                             >> ```
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Yet Architecture of the World Wide
>>>>                             Web §2.2.3 "Indirect identification"
>>>>                             >> [5] notes that:
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> ```
>>>>                             >>
>>>>                             >> To say that the URI
>>>>                             "mailto:nadia@example.com" identifies
>>>>                             both an
>>>>                             >> Internet mailbox and Nadia, the
>>>>                             person, introduces a URI collision.
>>>>                             >> However, we can use the URI to
>>>>                             indirectly identify Nadia. Identifiers are
>>>>                             >> commonly used in this way.
>>>>                             >>
>>>>                             >> ```
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> This is consistent with what I
>>>>                             recall TimBL saying at TPAC-2015 in regards
>>>>                             >> to Vcard; come the finish, no one
>>>>                             really cares to distinguish between the
>>>>                             >> thing and its associated information
>>>>                             resource.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> ... And in most cases, one can use
>>>>                             context to determine whether a
>>>>                             >> statement concerns the thing or the
>>>>                             information resource. In those cases
>>>>                             >> where you can't, "URLs in Data
>>>>                             Primer" suggests some mechanisms to
>>>>                             mitigate
>>>>                             >> such confusion [6][7].
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> I think that in our SDW WG
>>>>                             discussion we have concluded that we _are_
>>>>                             >> content to use "indirect
>>>>                             identification" - e.g. that we use URIs
>>>>                             that
>>>>                             >> conflate the thing and document
>>>>                             resource.
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Please can we confirm this? Assuming
>>>>                             that indirect identification is
>>>>                             >> "approved" as best practice, then it
>>>>                             seems prudent to add a note to the BP
>>>>                             >> document saying "don't worry about
>>>>                             distinguishing between thing and
>>>>                             >> resource; indirect identification is
>>>>                             fine" (etc.)
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> Thanks, Jeremy
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >> [1]:
>>>>                             http://w3c.github.io/sdw/bp/#globally-unique-ids
>>>>                             >>
>>>>                             >> [2]:
>>>>                             https://www.w3.org/TR/webarch/#pr-uri-collision
>>>>                             >>
>>>>                             >> [2a]:
>>>>                             https://www.w3.org/TR/webarch/#URI-collision
>>>>                             >>
>>>>                             >> [3]:
>>>>                             https://www.w3.org/2001/tag/group/track/issues/14
>>>>                             >>
>>>>                             >> [4]:
>>>>                             https://www.w3.org/TR/urls-in-data/#publishing-data
>>>>                             >>
>>>>                             >> [5]:
>>>>                             https://www.w3.org/TR/webarch/#indirect-identification
>>>>                             >>
>>>>                             >> [6]:
>>>>                             https://www.w3.org/TR/urls-in-data/#documenting-properties
>>>>                             >>
>>>>                             >> [7]:
>>>>                             https://www.w3.org/TR/urls-in-data/#authoring-specifications
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >>
>>>>                             >
>>>>
>>>>                             --
>>>>
>>>>
>>>>                             Phil Archer
>>>>                             W3C Data Activity Lead
>>>>                             http://www.w3.org/2013/data/
>>>>
>>>>                             http://philarcher.org
>>>>                             +44 (0)7887 767755
>>>>                             @philarcher1
>>>>
>>>
>>
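
As a rough illustration of the two options weighed in the quoted thread above, the sketch below contrasts the id/doc pattern (a 303 redirect from the thing URI to a document URI) with indirect identification (one URI answered directly). The `/id/` and `/doc/` path layout, the function names, and the response headers are assumptions made for the example, not any publisher's actual scheme.

```python
# Sketch only: paths, function names and headers are hypothetical.

def respond_id_doc(path):
    """id/doc pattern: the thing URI 303-redirects to a separate document URI."""
    if path.startswith("/id/"):
        # The browser follows this redirect, so the address bar ends up
        # showing the /doc/ URI, which is the one users tend to copy.
        return 303, {"Location": "/doc/" + path[len("/id/"):]}
    if path.startswith("/doc/"):
        return 200, {"Content-Type": "text/html"}
    return 404, {}


def respond_indirect(path):
    """Indirect identification: a single URI, answered directly with a description."""
    if path.startswith("/id/"):
        return 200, {"Content-Type": "text/html"}
    return 404, {}


# One request against each strategy for the same (made-up) spatial thing.
print(respond_id_doc("/id/lake/42"))
print(respond_indirect("/id/lake/42"))
```

With the single-URI variant there is no second address for a user to copy by mistake; statements about licences, creators and modification dates would then live in dataset metadata, as suggested in the thread.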


-- 
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net
Received on Wednesday, 31 August 2016 17:13:43 UTC