Re: Clarification required: BP6 "use HTTP URIs for spatial things"

On 31 August 2016 at 13:42, Joshua Lieberman <jlieberman@tumblingwalls.com>
wrote:

> If we are asserting that spatial data on the Web is "always" feature data
> that represents a real world entity, then yes, we don't have the general
> Web "is it or isn't it physical" ambiguity and can assume that a feature
> data identifier also and indirectly identifies the feature.
>

I hope we can broaden that assumption, so that the assertion still holds even
when we are not talking about feature data that represents real world entities.

Let's look at a borderline case: suppose I draw a triangle in Inkscape and
save it as an *.svg file. I publish the file on the web, so it has a URI. I
would say the triangle is a spatial thing (I am not sure it counts as a real
world entity, but I hope we can leave the idea of 'real world' out of
definitions anyway). The SVG object in the file is the geometry describing
the spatial thing. I think we only get into trouble if we take the SVG file
itself to be the spatial thing. I might want to state that the file has a
certain size and that the triangle has a certain area. It would be odd to
use the same URI for both statements, so I would need a different URI for
my triangle. Could that be all?
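
To make the triangle case concrete, here is a minimal sketch in Python. The
two URIs and the property names are made up for illustration (they are not
terms from any agreed vocabulary); the only point is that the file and the
triangle are separate subjects.

```python
# Hypothetical URIs: one for the SVG file (a document) and one for the
# triangle (the spatial thing the file describes). Both are invented.
FILE_URI = "http://example.org/doc/triangle.svg"
TRIANGLE_URI = "http://example.org/id/triangle"

# Statements as (subject, predicate, value) triples. The predicate names
# are placeholders, not terms from a real vocabulary.
statements = [
    (FILE_URI, "fileSizeInBytes", 512),
    (TRIANGLE_URI, "areaInSquareMetres", 6.0),
    (FILE_URI, "describes", TRIANGLE_URI),
]


def properties_of(subject, triples):
    """Collect every predicate/value pair asserted about one subject."""
    return {p: v for s, p, v in triples if s == subject}


print(properties_of(FILE_URI, statements))
print(properties_of(TRIANGLE_URI, statements))
```

If both statements used the same URI, the file size and the area would end
up as properties of one and the same subject, which is the odd result
described above.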


> That still leaves a gap in expressing whether two feature data entities
> represent the same real world entity. Perhaps we need a "sameFeatureAs"
> predicate to address this.
>

Yes, that is what the Subject equality
<http://w3c.github.io/sdw/UseCases/SDWUseCasesAndRequirements.html#SubjectEquality>
requirement is about. So the BP document is expected to say something about
that.

Regards,
Frans


> Josh
>
> Joshua Lieberman, Ph.D.
> Principal, Tumbling Walls Consultancy
> Tel/Direct: +1 617-431-6431
> jlieberman@tumblingwalls.com
>
> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
> Hello,
>
> As stated before, I don't think the httpRange-14 problem exists in our
> domain of discourse. I think (and hope) that confusion can only occur when
> the things that are described are digital things, or things that can be
> transmitted over a computer network, like web pages or mail boxes. It seems
> to me that spatial things are never that type of thing. Therefore there is
> no reason to take precautions against possible confusion.
>
> That probably means +1.
>
> Greetings,
> Frans
>
>
>
> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
>> Thanks Rob & Clemens ...
>>
>> On Wed, 31 Aug 2016 at 08:30, Clemens Portele <
>> portele@interactive-instruments.de> wrote:
>>
>>> +1
>>>
>>>
>>> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com)
>>> wrote:
>>>
>>> Hi. It would be good to close this issue out & include our collective
>>> recommendation in the BP doc working draft.
>>>
>>> PROPOSAL: SDW working group recommends use of "indirect identifiers" for
>>> spatial things
>>>
>>> ... I'll start the voting.
>>>
>>> +1
>>>
>>> Jeremy
>>>
>>> (BTW, to make sense of the PROPOSAL you'll need to read the email thread)
>>>
>>> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <
>>> l.vandenbrink@geonovum.nl> wrote:
>>>
>>>> So… do we agree we can recommend indirect identifiers, or do we try to
>>>> fix the issue with getting the correct identifier as Rob describes?
>>>>
>>>>
>>>>
>>>> While waiting for this I’ve updated the issue and the text referring to
>>>> the issue in BP6.
>>>>
>>>>
>>>>
>>>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>>>> *Sent:* Wednesday 24 August 2016 13:56
>>>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>>>> *CC:* SDW WG Public List
>>>> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for
>>>> spatial things"
>>>>
>>>>
>>>>
>>>> Hi
>>>>
>>>>
>>>>
>>>> Agree this is a real concern - people can't be blamed for doing the
>>>> obvious, if dumb, thing.
>>>>
>>>>
>>>>
>>>> I think we should take note of best practice in the HTML world - which
>>>> is often to include a citable link to a resource in the rendered view.  Or
>>>> a "share" or something similar. We can also put fairly explicit annotation
>>>> in machine-readable code - stating that the resource is about the URI - and
>>>> even notes saying when citing this resource use the URI....
>>>>
>>>>
>>>>
>>>> I'd also like to see browsers evolve to offer you the original link or
>>>> the redirected one when cutting and pasting - how hard can it be!
>>>>
>>>>
>>>>
>>>> Maybe we can get Ed to ask around Google Chrome team for suggestions on
>>>> how best to handle this :-)
>>>>
>>>>
>>>>
>>>> Rob
>>>>
>>>>
>>>>
>>>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com>
>>>> wrote:
>>>>
>>>> Yes, I think so ... And we should do so if we are recommending
>>>> "indirect identification".
>>>>
>>>> Jeremy
>>>>
>>>> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>>>>
>>>> Bill's comments also made me think about some of the classic arguments,
>>>> such as that a lake doesn't have a last updated date and isn't 435KB
>>>> big. Both are true; however, that kind of metadata generally comes from
>>>> the server, i.e. the HTTP layer. That's an oversimplification, but the
>>>> point is that it is relatively easy to avoid deliberately creating
>>>> misleading metadata - metadata about the doc rather than the thing it
>>>> describes - and it's also generally easy to avoid looking for that
>>>> metadata.
>>>>
>>>> Is there scope for some BP advice there?
>>>>
>>>> Phil.
>>>>
>>>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>>>> > Thanks Linda. More clear examples where being "correct" (in terms of
>>>> > avoiding uri collisions by using two distinct uris) is making things
>>>> > worse because users take the wrong one!
>>>> >
>>>> > So, as a WG, are we content to recommend this "indirect
>>>> > identification" pattern where thing & info resource identifiers are
>>>> > conflated?
>>>> >
>>>> > Bill has added some good points about how to avoid impacts of uri
>>>> > collision - by using the (dataset) metadata to talk about licenses and
>>>> > creators for the information ...
>>>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink
>>>> > <l.vandenbrink@geonovum.nl> wrote:
>>>> >
>>>> >> Experience from the Netherlands: we have the id/doc pattern in our
>>>> >> URI strategy, based on the Cool URIs note [8] and the ISA study on
>>>> >> persistent identifiers [9].
>>>> >>
>>>> >>
>>>> >>
>>>> >> That being said, same as Bill I also notice data users getting
>>>> >> confused and generally using the /doc/ URI as that is the one they
>>>> >> can copy from their browser address bar. This is not only casual
>>>> >> confusion but also ends up in published information resources.
>>>> >>
>>>> >>
>>>> >>
>>>> >> You see this, for example, all over the CB-NL which is a vocabulary
>>>> >> for the building sector and contains links to other Dutch standards
>>>> >> such as IMGeo, an information model and vocabulary for large scale
>>>> >> topography. E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links
>>>> >> to two IMGeo concepts ‘Pand’ (building part) and ‘Overig Bouwwerk’
>>>> >> (other construction) using their /doc/ URIs. If you click on Pand
>>>> >> (which doesn’t have its own landing page in CB-NL so I can’t include
>>>> >> the link) you will see it includes the /doc/ URI as the identifier
>>>> >> of Pand.
>>>> >>
>>>> >>
>>>> >>
>>>> >> This is an example where it occurs in vocabularies, but I also see it
>>>> >> happen with identifiers for data instances.
>>>> >>
>>>> >>
>>>> >>
>>>> >> [8]: https://www.w3.org/TR/cooluris/
>>>> >>
>>>> >> [9]:
>>>> >> https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>>> >>
>>>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>>> >>
>>>> >>
>>>> >>
>>>> >> Linda
>>>> >>
>>>> >>
>>>> >>
>>>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>>>> >> *Sent:* Tuesday 23 August 2016 20:57
>>>> >> *To:* Bill Roberts
>>>> >> *CC:* SDW WG Public List
>>>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for
>>>> >> spatial things"
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses
>>>> >> such as this based on practical experience. Jeremy
>>>> >>
>>>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>>>> >>
>>>> >> ah Jeremy, you are a brave man to poke the sleeping beast of
>>>> >> httpRange-14.
>>>> >>
>>>> >>
>>>> >>
>>>> >> But I'll get my thoughts in early, then I can tune out of the
>>>> >> ensuing mail avalanche :-)
>>>> >>
>>>> >>
>>>> >>
>>>> >> When publishing Linked Data about places we (at Swirrl) generally do
>>>> >> the id/doc fandango, but to be honest I think data users either don't
>>>> >> notice, or they get confused by it. In the applications we are
>>>> >> working with (and I acknowledge that others may have different
>>>> >> applications and different experiences), it wouldn't cause any
>>>> >> problems to have a single URI, the 'id' URI if you like. We just
>>>> >> don't find a need to say anything about the /doc/ URI. If we were
>>>> >> starting again, I'd probably ditch the /doc/ and the 303 and rely on
>>>> >> context and a little bit of documentation to make it clear what we
>>>> >> mean.
>>>> >>
>>>> >>
>>>> >>
>>>> >> The place where we find a need to talk about creators and licences
>>>> >> and modified dates is in metadata about datasets where a dataset
>>>> >> might be a collection of information about a bunch of places - and
>>>> >> we treat datasets as an 'information resource'. If someone requests
>>>> >> a dataset URI we return a status code of 200 and the dataset
>>>> >> metadata as the response. That metadata includes info on where to
>>>> >> get all the contents of the dataset if you want that.
>>>> >>
>>>> >>
>>>> >>
>>>> >> By the way, though it's sensible and consistent, I find that the
>>>> >> implied and parallel property stuff makes it more rather than less
>>>> >> complicated.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Bill
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>> >>
>>>> >> All-
>>>> >>
>>>> >>
>>>> >>
>>>> >> Linda has done a great job of consolidating the best practices
>>>> >> around use of identifiers. We have just one [1] now.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Reading through just now, it occurred to me that there's still an open
>>>> >> issue about identifier assignment ...
>>>> >>
>>>> >>
>>>> >>
>>>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>>>> >> single resource" [2] asserts "Assign distinct URIs to distinct
>>>> >> resources" in order to avoid URI collisions [2a] which "often imposes
>>>> >> a cost in communication due to the effort required to resolve
>>>> >> ambiguities". Discussions from earlier years in UK Gov Linked Data
>>>> >> working group (and elsewhere) concluded that the "real world thing"
>>>> >> and "information resource that describes the real world thing" are
>>>> >> separate resources. I think this is based on a (purist?) view when
>>>> >> working with RDF of needing to be totally clear on "what's the
>>>> >> subject" of each triple ... the thing or the document. This manifests
>>>> >> as URIs with `id` or `doc` included somewhere to distinguish between
>>>> >> the resources and some RDF triples to clarify that the doc resource
>>>> >> is talking about the thing resource etc.
>>>> >>
>>>> >>
>>>> >>
>>>> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that
>>>> >> bear trap)
>>>> >>
>>>> >>
>>>> >>
>>>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>>>> >> practice in §5.3 "Publishing data" [4]:
>>>> >>
>>>> >>
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> Publishers can help enable more accurate merging of data from
>>>> >> different sites if they support URLs for each entity
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other
>>>> >> sites may wish to describe, separate from the landing pages
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>>>> >>
>>>> >> ```
>>>> >>
>>>> >>
>>>> >>
>>>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect
>>>> >> identification" [5] notes that:
>>>> >>
>>>> >>
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> To say that the URI "mailto:nadia@example.com" identifies both an
>>>> >> Internet mailbox and Nadia, the person, introduces a URI collision.
>>>> >> However, we can use the URI to indirectly identify Nadia.
>>>> >> Identifiers are commonly used in this way.
>>>> >>
>>>> >> ```
>>>> >>
>>>> >>
>>>> >>
>>>> >> This is consistent with what I recall TimBL saying at TPAC-2015 in
>>>> >> regards to Vcard; come the finish, no one really cares to distinguish
>>>> >> between the thing and its associated information resource.
>>>> >>
>>>> >>
>>>> >>
>>>> >> ... And in most cases, one can use context to determine whether a
>>>> >> statement concerns the thing or the information resource. In those
>>>> >> cases where you can't, "URLs in Data Primer" suggests some mechanisms
>>>> >> to mitigate such confusion [6][7].
>>>> >>
>>>> >>
>>>> >>
>>>> >> I think that in our SDW WG discussion we have concluded that we _are_
>>>> >> content to use "indirect identification" - i.e. that we use URIs that
>>>> >> conflate the thing and document resource.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Please can we confirm this? Assuming that indirect identification is
>>>> >> "approved" as best practice, then it seems prudent to add a note to
>>>> >> the BP document saying "don't worry about distinguishing between thing
>>>> >> and resource; indirect identification is fine" (etc.)
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks, Jeremy
>>>> >>
>>>> >>
>>>> >>
>>>> >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>>>> >>
>>>> >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>>>> >>
>>>> >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>>>> >>
>>>> >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>>>> >>
>>>> >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>>>> >>
>>>> >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>>>> >>
>>>> >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>>>> >>
>>>> >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>> --
>>>>
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>>
>

Received on Wednesday, 31 August 2016 14:21:09 UTC