- From: Jeremy Tandy <jeremy.tandy@gmail.com>
- Date: Tue, 23 Aug 2016 18:57:23 +0000
- To: Bill Roberts <bill@swirrl.com>
- Cc: SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CADtUq_1g_ERSf4yrVxsBZ0E2Kun7VJ-itgzZuSrmpXLQaJ9HbQ@mail.gmail.com>
Thanks Bill. Sounds very coherent ... I hoped for some responses such as
this based on practical experience.

Jeremy

On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:

> ah Jeremy, you are a brave man to poke the sleeping beast of httpRange-14.
>
> But I'll get my thoughts in early, then I can tune out of the ensuing mail
> avalanche :-)
>
> When publishing Linked Data about places we (at Swirrl) generally do the
> id/doc fandango, but to be honest I think data users either don't notice,
> or they get confused by it. In the applications we are working with (and I
> acknowledge that others may have different applications and different
> experiences), it wouldn't cause any problems to have a single URI, the 'id'
> URI if you like. We just don't find a need to say anything about the /doc/
> URI. If we were starting again, I'd probably ditch the /doc/ and the 303
> and rely on context and a little bit of documentation to make it clear what
> we mean.
>
> The place where we find a need to talk about creators and licences and
> modified dates is in metadata about datasets where a dataset might be a
> collection of information about a bunch of places - and we treat datasets
> as an 'information resource'. If someone requests a dataset URI we return a
> status code of 200 and the dataset metadata as the response. That metadata
> includes info on where to get all the contents of the dataset if you want
> that.
>
> By the way, though it's sensible and consistent, I find that the implied
> and parallel property stuff makes it more rather than less complicated.
>
> Bill
>
> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
>> All-
>>
>> Linda has done a great job of consolidating the best practices around use of
>> identifiers. We have just one [1] now.
>>
>> Reading through just now, it occurred to me that there's still an open
>> issue about identifier assignment ...
>>
>> W3C's Architecture of the World Wide Web constraint "URIs identify a
>> single resource" [2] asserts "Assign distinct URIs to distinct resources"
>> in order to avoid URI collisions [2a] which "often imposes a cost in
>> communication due to the effort required to resolve ambiguities".
>> Discussions from earlier years in UK Gov Linked Data working group (and
>> elsewhere) concluded that the "real world thing" and "information resource
>> that describes the real world thing" are separate resources. I think this
>> is based on a (purist?) view when working with RDF of needing to be totally
>> clear on "what's the subject" of each triple ... the thing or the document.
>> This manifests as URIs with `id` or `doc` included somewhere to distinguish
>> between the resources and some RDF triples to clarify that the doc resource
>> is talking about the thing resource etc.
>>
>> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear
>> trap)
>>
>> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>> practice in §5.3 "Publishing data" [4]:
>>
>> ```
>> Publishers can help enable more accurate merging of data from different
>> sites if they support URLs for each entity
>> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may
>> wish to describe, separate from the landing pages
>> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>> ```
>>
>> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification"
>> [5] notes that:
>>
>> ```
>> To say that the URI "mailto:nadia@example.com" identifies both an
>> Internet mailbox and Nadia, the person, introduces a URI collision.
>> However, we can use the URI to indirectly identify Nadia. Identifiers are
>> commonly used in this way.
>> ```
>>
>> This is consistent with what I recall TimBL saying at TPAC-2015 in
>> regards to Vcard; come the finish, no one really cares to distinguish
>> between the thing and its associated information resource.
>>
>> ... And in most cases, one can use context to determine whether a
>> statement concerns the thing or the information resource. In those cases
>> where you can't, "URLs in Data Primer" suggests some mechanisms to mitigate
>> such confusion [6][7].
>>
>> I think that in our SDW WG discussion we have concluded that we _are_
>> content to use "indirect identification" - e.g. that we use URIs that
>> conflate the thing and document resource.
>>
>> Please can we confirm this? Assuming that indirect identification is
>> "approved" as best practice, then it seems prudent to add a note to the BP
>> document saying "don't worry about distinguishing between thing and
>> resource; indirect identification is fine" (etc.)
>>
>> Thanks, Jeremy
>>
>> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications

Received on Tuesday, 23 August 2016 18:58:05 UTC