Re: Clarification required: BP6 "use HTTP URIs for spatial things"

From: Frans Knibbe <frans.knibbe@geodan.nl>
Date: Wed, 31 Aug 2016 13:29:11 +0200
Message-ID: <CAFVDz40nNV_jJth86SQRnOakrETc5bLT_eZyT37BcLMy+nkQZg@mail.gmail.com>
To: Jeremy Tandy <jeremy.tandy@gmail.com>
Cc: SDW WG Public List <public-sdw-wg@w3.org>
Hello,

As stated before, I don't think the httpRange-14 problem exists in our
domain of discourse. I think (and hope) that confusion can only occur when
the things that are described are digital things, or things that can be
transmitted over a computer network, like web pages or mail boxes. It seems
to me that spatial things are never that type of thing. Therefore there is
no reason to take precautions against possible confusion.

That probably means +1.

Greetings,
Frans



On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:

> Thanks Rob & Clemens ...
>
> On Wed, 31 Aug 2016 at 08:30, Clemens Portele
> <portele@interactive-instruments.de> wrote:
>
>> +1
>>
>>
>> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com)
>> wrote:
>>
>> Hi. It would be good to close this issue out & include our collective
>> recommendation in the BP doc working draft.
>>
>> PROPOSAL: SDW working group recommends use of "indirect identifiers" for
>> spatial things
>>
>> ... I'll start the voting.
>>
>> +1
>>
>> Jeremy
>>
>> (BTW, to make sense of the PROPOSAL you'll need to read the email thread)
>>
>> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <
>> l.vandenbrink@geonovum.nl> wrote:
>>
>>> So… do we agree we can recommend indirect identifiers, or do we try to
>>> fix the issue with getting the correct identifier as Rob describes?
>>>
>>>
>>>
>>> While waiting for this I’ve updated the issue and the text referring to
>>> the issue in BP6.
>>>
>>>
>>>
>>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>>> *Sent:* Wednesday 24 August 2016 13:56
>>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>>> *CC:* SDW WG Public List
>>> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
>>> things"
>>>
>>>
>>>
>>> Hi
>>>
>>>
>>>
>>> Agree this is a real concern - people can't be blamed for doing the
>>> obvious, if dumb, thing.
>>>
>>>
>>>
>>> I think we should take note of best practice in the HTML world - which
>>> is often to include a citable link to a resource in the rendered view.  Or
>>> a "share" or something similar. We can also put fairly explicit annotation
>>> in machine-readable code - stating that the resource is about the URI - and
>>> even notes saying when citing this resource use the URI....
>>>
>>>
>>>
>>> I'd also like to see browsers evolve to offer you the original link or
>>> the redirected one when cutting and pasting - how hard can it be!
>>>
>>>
>>>
>>> Maybe we can get Ed to ask around Google Chrome team for suggestions on
>>> how best to handle this :-)
>>>
>>>
>>>
>>> Rob
>>>
>>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com>
>>> wrote:
>>>
>>> Yes, I think so ... And we should do so if we are recommending "indirect
>>> identification".
>>>
>>> Jeremy
>>>
>>> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>>>
>>> Bill's comments also made me think about some of the classic arguments,
>>> such as that a lake doesn't have a last updated date and isn't 435KB
>>> big. Which are true, however, that kind of metadata generally comes from
>>> the server, i.e. the HTTP layer. That's an over simplification but the
>>> point is that it is relatively easy to avoid deliberately creating
>>> misleading metadata - metadata about the doc rather than the thing it
>>> describes - and it's also generally easy to avoid looking for that
>>> metadata.
>>>
>>> Is there scope for some BP advice there?
>>>
>>> Phil.
>>>
>>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>>> > Thanks Linda. More clear examples where being "correct" (in terms of
>>> > avoiding uri collisions by using two distinct uris) is making things
>>> > worse because users take the wrong one!
>>> >
>>> > So, as a WG, are we content to recommend this "indirect identification"
>>> > pattern where thing & info resource identifiers are conflated?
>>> >
>>> > Bill has added some good points about how to avoid impacts of uri
>>> > collision - by using the (dataset) metadata to talk about licenses and
>>> > creators for the information ...
>>> >
>>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink
>>> > <l.vandenbrink@geonovum.nl> wrote:
>>> >
>>> >> Experience from the Netherlands: we have the id/doc pattern in our URI
>>> >> strategy, based on the Cool URIs note [8] and the ISA study on
>>> >> persistent identifiers [9].
>>> >>
>>> >>
>>> >>
>>> >> That being said, same as Bill I also notice data users getting
>>> >> confused and generally using the /doc/ URI as that is the one they can
>>> >> copy from their browser address bar. This is not only casual confusion
>>> >> but also ends up in published information resources.
>>> >>
>>> >>
>>> >>
>>> >> You see this, for example, all over the CB-NL, which is a vocabulary
>>> >> for the building sector and contains links to other Dutch standards
>>> >> such as IMGeo, an information model and vocabulary for large scale
>>> >> topography. E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links to
>>> >> two IMGeo concepts, ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other
>>> >> construction), using their /doc/ URIs. If you click on Pand (which
>>> >> doesn’t have its own landing page in CB-NL so I can’t include the link)
>>> >> you will see it includes the /doc/ URI as the identifier of Pand.
>>> >>
>>> >>
>>> >>
>>> >> This is an example where it occurs in vocabularies, but I also see it
>>> >> happen with identifiers for data instances.
>>> >>
>>> >>
>>> >>
>>> >> [8]: https://www.w3.org/TR/cooluris/
>>> >>
>>> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>> >>
>>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>> >>
>>> >>
>>> >>
>>> >> Linda
>>> >>
>>> >>
>>> >>
>>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>>> >> *Sent:* Tuesday 23 August 2016 20:57
>>> >> *To:* Bill Roberts
>>> >> *CC:* SDW WG Public List
>>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
>>> >> things"
>>> >>
>>> >>
>>> >>
>>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such
>>> >> as this based on practical experience. Jeremy
>>> >>
>>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>>> >>
>>> >> ah Jeremy, you are a brave man to poke the sleeping beast of
>>> >> httpRange-14.
>>> >>
>>> >> But I'll get my thoughts in early, then I can tune out of the ensuing
>>> >> mail avalanche :-)
>>> >>
>>> >>
>>> >>
>>> >> When publishing Linked Data about places we (at Swirrl) generally do
>>> >> the id/doc fandango, but to be honest I think data users either don't
>>> >> notice, or they get confused by it.  In the applications we are working
>>> >> with (and I acknowledge that others may have different applications and
>>> >> different experiences), it wouldn't cause any problems to have a single
>>> >> URI, the 'id' URI if you like.  We just don't find a need to say
>>> >> anything about the /doc/ URI.  If we were starting again, I'd probably
>>> >> ditch the /doc/ and the 303 and rely on context and a little bit of
>>> >> documentation to make it clear what we mean.
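
The two options described here can be sketched roughly as follows. This is a minimal illustration, not Swirrl's implementation: the `/id/` and `/doc/` path prefixes, the `Response` type and both function names are assumptions invented for the example.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: int               # HTTP status code
    location: Optional[str]   # redirect target, set only for 303 responses


def resolve_id_doc(path: str) -> Response:
    """id/doc pattern: the thing URI 303-redirects to a separate document URI."""
    if path.startswith("/id/"):
        # The spatial thing itself cannot be sent over the wire, so redirect
        # ("See Other") to the document that describes it.
        return Response(303, "/doc/" + path[len("/id/"):])
    if path.startswith("/doc/"):
        # The document is an information resource and is served directly.
        return Response(200, None)
    return Response(404, None)


def resolve_single_uri(path: str) -> Response:
    """Single-URI alternative: the 'id' URI answers directly, no /doc/, no 303."""
    if path.startswith("/id/"):
        return Response(200, None)
    return Response(404, None)
```

With the first function a browser that follows the redirect ends up showing the `/doc/` URI in its address bar, which is the URI people then copy; with the second there is only one URI to copy.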
>>> >>
>>> >>
>>> >>
>>> >> The place where we find a need to talk about creators and licences and
>>> >> modified dates is in metadata about datasets where a dataset might be a
>>> >> collection of information about a bunch of places - and we treat
>>> >> datasets as an 'information resource'. If someone requests a dataset URI
>>> >> we return a status code of 200 and the dataset metadata as the
>>> >> response.  That metadata includes info on where to get all the contents
>>> >> of the dataset if you want that.
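
A rough sketch of that dataset behaviour, under stated assumptions: the URI layout, the metadata values and the field names (`creator`, `licence`, `modified`, `contents`) are all made up for illustration and are not taken from any particular vocabulary or from Swirrl's system.

```python
from typing import Dict, Tuple


def get_dataset(dataset_uri: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, metadata) for a dataset URI.

    The dataset is treated as an information resource, so it is answered
    directly with 200. Creator, licence and modified date are statements
    about the dataset, not about the places it describes.
    """
    metadata = {
        "id": dataset_uri,
        "creator": "Example Publisher",            # hypothetical value
        "licence": "https://example.org/licence",  # hypothetical value
        "modified": "2016-08-23",                  # hypothetical value
        # Pointer to where the full contents of the dataset can be fetched.
        "contents": dataset_uri + "/data",
    }
    return 200, metadata
```

Keeping these properties on the dataset means nothing here claims that a place has a licence or a last-modified date.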
>>> >>
>>> >>
>>> >>
>>> >> By the way, though it's sensible and consistent, I find that the
>>> >> implied and parallel property stuff makes it more rather than less
>>> >> complicated.
>>> >>
>>> >>
>>> >>
>>> >> Bill
>>> >>
>>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com>
>>> >> wrote:
>>> >>
>>> >> All-
>>> >>
>>> >>
>>> >>
>>> >> Linda has done a great job of consolidating the best practices around
>>> >> use of identifiers. We have just one [1] now.
>>> >>
>>> >>
>>> >>
>>> >> Reading through just now, it occurred to me that there's still an open
>>> >> issue about identifier assignment ...
>>> >>
>>> >>
>>> >>
>>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>>> >> single resource" [2] asserts "Assign distinct URIs to distinct
>>> >> resources" in order to avoid URI collisions [2a] which "often imposes a
>>> >> cost in communication due to the effort required to resolve
>>> >> ambiguities". Discussions from earlier years in UK Gov Linked Data
>>> >> working group (and elsewhere) concluded that the "real world thing" and
>>> >> "information resource that describes the real world thing" are separate
>>> >> resources. I think this is based on a (purist?) view when working with
>>> >> RDF of needing to be totally clear on "what's the subject" of each
>>> >> triple ... the thing or the document. This manifests as URIs with `id`
>>> >> or `doc` included somewhere to distinguish between the resources and
>>> >> some RDF triples to clarify that the doc resource is talking about the
>>> >> thing resource etc.
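
For illustration, the kind of triples meant here might look like the output of this small sketch. It assumes FOAF's `foaf:primaryTopic` and `foaf:isPrimaryTopicOf` properties as one possible vocabulary choice (the thread does not name one); the function name and the example URIs are hypothetical.

```python
def doc_about_thing_turtle(thing_uri: str, doc_uri: str) -> str:
    """Build Turtle stating that the doc resource describes the thing resource."""
    lines = [
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
        # The document's main subject is the real-world thing ...
        f"<{doc_uri}> foaf:primaryTopic <{thing_uri}> .",
        # ... and, as the inverse statement, the thing is described by the document.
        f"<{thing_uri}> foaf:isPrimaryTopicOf <{doc_uri}> .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(doc_about_thing_turtle(
        "http://example.org/id/lake/1",
        "http://example.org/doc/lake/1",
    ))
```

Under "indirect identification" there is a single URI, so these linking triples are no longer needed.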
>>> >>
>>> >>
>>> >>
>>> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that
>>> >> bear trap)
>>> >>
>>> >>
>>> >>
>>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>>> >> practice in §5.3 "Publishing data" [4]:
>>> >>
>>> >>
>>> >>
>>> >> ```
>>> >>
>>> >> Publishers can help enable more accurate merging of data from
>>> >> different sites if they support URLs for each entity
>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites
>>> >> may wish to describe, separate from the landing pages
>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>>> >>
>>> >> ```
>>> >>
>>> >>
>>> >>
>>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect
>>> >> identification" [5] notes that:
>>> >>
>>> >>
>>> >>
>>> >> ```
>>> >>
>>> >> To say that the URI "mailto:nadia@example.com" identifies both an
>>> >> Internet mailbox and Nadia, the person, introduces a URI collision.
>>> >> However, we can use the URI to indirectly identify Nadia. Identifiers
>>> >> are commonly used in this way.
>>> >>
>>> >> ```
>>> >>
>>> >>
>>> >>
>>> >> This is consistent with what I recall TimBL saying at TPAC-2015 in
>>> >> regards to Vcard; come the finish, no one really cares to distinguish
>>> >> between the thing and its associated information resource.
>>> >>
>>> >>
>>> >>
>>> >> ... And in most cases, one can use context to determine whether a
>>> >> statement concerns the thing or the information resource. In those
>>> >> cases where you can't, "URLs in Data Primer" suggests some mechanisms
>>> >> to mitigate such confusion [6][7].
>>> >>
>>> >>
>>> >>
>>> >> I think that in our SDW WG discussion we have concluded that we _are_
>>> >> content to use "indirect identification" - i.e. that we use URIs that
>>> >> conflate the thing and document resource.
>>> >>
>>> >>
>>> >>
>>> >> Please can we confirm this? Assuming that indirect identification is
>>> >> "approved" as best practice, then it seems prudent to add a note to the
>>> >> BP document saying "don't worry about distinguishing between thing and
>>> >> resource; indirect identification is fine" (etc.)
>>> >>
>>> >>
>>> >>
>>> >> Thanks, Jeremy
>>> >>
>>> >>
>>> >>
>>> >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>>> >>
>>> >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>>> >>
>>> >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>>> >>
>>> >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>>> >>
>>> >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>>> >>
>>> >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>>> >>
>>> >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>>> >>
>>> >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>>> >>
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>> --
>>>
>>>
>>> Phil Archer
>>> W3C Data Activity Lead
>>> http://www.w3.org/2013/data/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>>
>>>
Received on Wednesday, 31 August 2016 11:29:43 UTC
