- From: Jeremy Tandy <jeremy.tandy@gmail.com>
- Date: Wed, 31 Aug 2016 07:50:26 +0000
- To: Clemens Portele <portele@interactive-instruments.de>, Rob Atkinson <rob@metalinkage.com.au>, Phil Archer <phila@w3.org>, Linda van den Brink <l.vandenbrink@geonovum.nl>, Bill Roberts <bill@swirrl.com>
- Cc: SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CADtUq_2dT8S-VnMUf5wjDGsQXJuPiOv20Oi5UviwNQFF7RGSqA@mail.gmail.com>
Thanks Rob & Clemens ...

On Wed, 31 Aug 2016 at 08:30, Clemens Portele <portele@interactive-instruments.de> wrote:

> +1
>
> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com) wrote:
>
> Hi. It would be good to close this issue out & include our collective
> recommendation in the BP doc working draft.
>
> PROPOSAL: SDW working group recommends use of "indirect identifiers" for
> spatial things
>
> ... I'll start the voting.
>
> +1
>
> Jeremy
>
> (BTW, to make sense of the PROPOSAL you'll need to read the email thread)
>
> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
>
>> So… do we agree we can recommend indirect identifiers, or do we try to
>> fix the issue with getting the correct identifier as Rob describes?
>>
>> While waiting for this I’ve updated the issue and the text referring to
>> the issue in BP6.
>>
>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>> *Sent:* Wednesday 24 August 2016 13:56
>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>> *CC:* SDW WG Public List
>> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>>
>> Hi
>>
>> Agree this is a real concern - people can't be blamed for doing the
>> obvious, if dumb, thing.
>>
>> I think we should take note of best practice in the HTML world - which is
>> often to include a citable link to a resource in the rendered view. Or a
>> "share" or something similar. We can also put fairly explicit annotation in
>> machine-readable code - stating that the resource is about the URI - and
>> even notes saying "when citing this resource use the URI ...".
>>
>> I'd also like to see browsers evolve to offer you the original link or
>> the redirected one when cutting and pasting - how hard can it be!
>>
>> Maybe we can get Ed to ask around the Google Chrome team for suggestions
>> on how best to handle this :-)
>>
>> Rob
>>
>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> Yes, I think so ... And we should do so if we are recommending "indirect
>> identification".
>>
>> Jeremy
>>
>> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>>
>> Bill's comments also made me think about some of the classic arguments,
>> such as that a lake doesn't have a last updated date and isn't 435KB
>> big. Which are true, however, that kind of metadata generally comes from
>> the server, i.e. the HTTP layer. That's an oversimplification but the
>> point is that it is relatively easy to avoid deliberately creating
>> misleading metadata - metadata about the doc rather than the thing it
>> describes - and it's also generally easy to avoid looking for that
>> metadata.
>>
>> Is there scope for some BP advice there?
>>
>> Phil.
>>
>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>> > Thanks Linda. More clear examples where being "correct" (in terms of
>> > avoiding URI collisions by using two distinct URIs) is making things
>> > worse because users take the wrong one!
>> >
>> > So, as a WG, are we content to recommend this "indirect identification"
>> > pattern where thing & info resource identifiers are conflated?
>> >
>> > Bill has added some good points about how to avoid impacts of URI
>> > collision - by using the (dataset) metadata to talk about licenses and
>> > creators for the information ...
>> >
>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
>> >
>> >> Experience from the Netherlands: we have the id/doc pattern in our URI
>> >> strategy, based on the Cool URIs note [8] and the ISA study on
>> >> persistent identifiers [9].
>> >>
>> >> That being said, same as Bill I also notice data users getting confused
>> >> and generally using the /doc/ URI as that is the one they can copy from
>> >> their browser address bar. This is not only casual confusion but also
>> >> ends up in published information resources.
>> >>
>> >> You see this, for example, all over the CB-NL which is a vocabulary for
>> >> the building sector and contains links to other Dutch standards such as
>> >> IMGeo, an information model and vocabulary for large scale topography.
>> >> E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo
>> >> concepts ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other
>> >> construction) using their /doc/ URIs. If you click on Pand (which
>> >> doesn’t have its own landing page in CB-NL so I can’t include the link)
>> >> you will see it includes the /doc/ URI as the identifier of Pand.
>> >>
>> >> This is an example where it occurs in vocabularies, but I also see it
>> >> happen with identifiers for data instances.
>> >>
>> >> [8]: https://www.w3.org/TR/cooluris/
>> >>
>> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>> >>
>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>> >>
>> >> Linda
>> >>
>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>> >> *Sent:* Tuesday 23 August 2016 20:57
>> >> *To:* Bill Roberts
>> >> *CC:* SDW WG Public List
>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>> >>
>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such
>> >> as this based on practical experience. Jeremy
>> >>
>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>> >>
>> >> ah Jeremy, you are a brave man to poke the sleeping beast of
>> >> httpRange-14.
>> >>
>> >> But I'll get my thoughts in early, then I can tune out of the ensuing
>> >> mail avalanche :-)
>> >>
>> >> When publishing Linked Data about places we (at Swirrl) generally do
>> >> the id/doc fandango, but to be honest I think data users either don't
>> >> notice, or they get confused by it. In the applications we are working
>> >> with (and I acknowledge that others may have different applications and
>> >> different experiences), it wouldn't cause any problems to have a single
>> >> URI, the 'id' URI if you like. We just don't find a need to say
>> >> anything about the /doc/ URI. If we were starting again, I'd probably
>> >> ditch the /doc/ and the 303 and rely on context and a little bit of
>> >> documentation to make it clear what we mean.
>> >>
>> >> The place where we find a need to talk about creators and licences and
>> >> modified dates is in metadata about datasets where a dataset might be a
>> >> collection of information about a bunch of places - and we treat
>> >> datasets as an 'information resource'. If someone requests a dataset
>> >> URI we return a status code of 200 and the dataset metadata as the
>> >> response. That metadata includes info on where to get all the contents
>> >> of the dataset if you want that.
>> >>
>> >> By the way, though it's sensible and consistent, I find that the
>> >> implied and parallel property stuff makes it more rather than less
>> >> complicated.
>> >>
>> >> Bill
>> >>
>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>> >>
>> >> All-
>> >>
>> >> Linda has done a great job of consolidating the best practices around
>> >> use of identifiers. We have just one [1] now.
>> >>
>> >> Reading through just now, it occurred to me that there's still an open
>> >> issue about identifier assignment ...
>> >>
>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>> >> single resource" [2] asserts "Assign distinct URIs to distinct
>> >> resources" in order to avoid URI collisions [2a] which "often imposes a
>> >> cost in communication due to the effort required to resolve
>> >> ambiguities". Discussions from earlier years in UK Gov Linked Data
>> >> working group (and elsewhere) concluded that the "real world thing" and
>> >> "information resource that describes the real world thing" are separate
>> >> resources. I think this is based on a (purist?) view when working with
>> >> RDF of needing to be totally clear on "what's the subject" of each
>> >> triple ... the thing or the document. This manifests as URIs with `id`
>> >> or `doc` included somewhere to distinguish between the resources and
>> >> some RDF triples to clarify that the doc resource is talking about the
>> >> thing resource etc.
>> >>
>> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear
>> >> trap)
>> >>
>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>> >> practice in §5.3 "Publishing data" [4]:
>> >>
>> >> ```
>> >>
>> >> Publishers can help enable more accurate merging of data from different
>> >> sites if they support URLs for each entity
>> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites
>> >> may wish to describe, separate from the landing pages
>> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>> >>
>> >> ```
>> >>
>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification"
>> >> [5] notes that:
>> >>
>> >> ```
>> >>
>> >> To say that the URI "mailto:nadia@example.com" identifies both an
>> >> Internet mailbox and Nadia, the person, introduces a URI collision.
>> >> However, we can use the URI to indirectly identify Nadia. Identifiers
>> >> are commonly used in this way.
>> >>
>> >> ```
>> >>
>> >> This is consistent with what I recall TimBL saying at TPAC-2015 in
>> >> regards to vCard; come the finish, no one really cares to distinguish
>> >> between the thing and its associated information resource.
>> >>
>> >> ... And in most cases, one can use context to determine whether a
>> >> statement concerns the thing or the information resource. In those
>> >> cases where you can't, "URLs in Data Primer" suggests some mechanisms
>> >> to mitigate such confusion [6][7].
>> >>
>> >> I think that in our SDW WG discussion we have concluded that we _are_
>> >> content to use "indirect identification" - i.e. that we use URIs that
>> >> conflate the thing and document resource.
>> >>
>> >> Please can we confirm this? Assuming that indirect identification is
>> >> "approved" as best practice, then it seems prudent to add a note to the
>> >> BP document saying "don't worry about distinguishing between thing and
>> >> resource; indirect identification is fine" (etc.)
>> >>
>> >> Thanks, Jeremy
>> >>
>> >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>> >>
>> >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>> >>
>> >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>> >>
>> >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>> >>
>> >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>> >>
>> >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>> >>
>> >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>> >>
>> >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>> >>
>> >
>>
>> --
>>
>> Phil Archer
>> W3C Data Activity Lead
>> http://www.w3.org/2013/data/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
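
For readers coming to this thread cold, the two URI patterns under discussion can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any participant's server actually works: the function names `resolve_id_doc` and `resolve_indirect`, the `/id/` and `/doc/` path prefixes, the `lake/windermere` example and the `text/turtle` content type are all hypothetical.

```python
from typing import Dict, Tuple

# (HTTP status code, response headers); a simplified stand-in for a real response.
Response = Tuple[int, Dict[str, str]]


def resolve_id_doc(path: str) -> Response:
    """id/doc pattern: the thing URI 303-redirects to a separate document URI."""
    if path.startswith("/id/"):
        # The /id/ URI names the spatial thing, so it is never served directly;
        # the client is sent to the document that describes it.
        return 303, {"Location": "/doc/" + path[len("/id/"):]}
    if path.startswith("/doc/"):
        # The /doc/ URI names the information resource and is served with 200.
        return 200, {"Content-Type": "text/turtle"}
    return 404, {}


def resolve_indirect(path: str) -> Response:
    """Indirect identification: a single URI, answered directly with 200."""
    if path.startswith("/id/"):
        # The same URI is used for the thing and for the description of it.
        return 200, {"Content-Type": "text/turtle"}
    return 404, {}


if __name__ == "__main__":
    print(resolve_id_doc("/id/lake/windermere"))
    print(resolve_indirect("/id/lake/windermere"))
```

Under the first pattern a browser ends up showing the `/doc/` URI after the redirect, which is the one Linda and Bill report users copying; under the second there is only one URI to copy.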
Received on Wednesday, 31 August 2016 07:51:09 UTC