- From: Rob Atkinson <rob@metalinkage.com.au>
- Date: Mon, 05 Sep 2016 00:59:32 +0000
- To: Simon.Cox@csiro.au, rob@metalinkage.com.au, janowicz@ucsb.edu, frans.knibbe@geodan.nl
- Cc: jlieberman@tumblingwalls.com, jeremy.tandy@gmail.com, public-sdw-wg@w3.org
- Message-ID: <CACfF9LxzUYd4dRgFQAarzd8DpgZNAz=Hjh_i=A9RMEUT=479KA@mail.gmail.com>
I agree we shouldn't, but that's the sort of thing people do - and we can provide a BP to avoid it in favour of using indirect URIs, which then provides the option of declaring equivalence without stating nonsense.

Rob

On Mon, 5 Sep 2016 at 10:04 <Simon.Cox@csiro.au> wrote:

> *Rob* wrote:
>
> > consider two resources with URIs R1 and R2
> >
> > R1 ns:dateEdited 12/1/2001
> >
> > R2 ns:dateEdited 6/6/2006
> >
> > R1 owl:sameAs R2 then leads to ambiguity regarding the value of the functional property ns:dateEdited
>
> Where R1 and R2 are representations or descriptions of a (real-world) thing, possibly a graph of RDF triples.
>
> However, in a separate part of the thread, *Jeremy* wrote:
>
> > few people will care to name the representation / graph at all.
>
> In other words, the URIs R1 and R2 are usually not treated with much respect. So it is unlikely that we would be in the business of making sameAs statements about these.
>
> Simon
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Saturday, 3 September 2016 8:05 AM
> *To:* janowicz@ucsb.edu; Frans Knibbe <frans.knibbe@geodan.nl>
> *Cc:* Joshua Lieberman <jlieberman@tumblingwalls.com>; Jeremy Tandy <jeremy.tandy@gmail.com>; SDW WG Public List <public-sdw-wg@w3.org>
> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>
> A few things - this is a rich discussion and we have identified several parts (which is probably why the original issue was hard to pin down)
>
> I'm glad we have coaxed one elephant out - the sameAs semantics issue. For me this is the litmus test whether a URL can be used as a URI for a thing or not.
>
> (and this is where one of the issues about SIRF Jeremy raised comes in - but I don't think we need to worry about specific approach, rather the criteria for whether a URI is a good one for identification purposes.
> I think we simply make a strong statement that you don't use a URL as a URI if it is not stable and it does not make sense to use owl:sameAs.
>
> This pretty much rules out any direct URL to a single representation:
>
> consider two resources with URIs R1 and R2
>
> R1 ns:dateEdited 12/1/2001
>
> R2 ns:dateEdited 6/6/2006
>
> R1 owl:sameAs R2 then leads to ambiguity regarding the value of the functional property ns:dateEdited
>
> however
>
> U1 --303--> R1
> U2 --303--> R2
>
> can (and should be) represented as
>
> U1 ns:hasRepresentation R1
> U2 ns:hasRepresentation R2
>
> U1 owl:sameAs U2
>
> entails
>
> U1 ns:hasRepresentation R1, R2
>
> which doesn't make any stupid statements about the properties. It also allows us to make useful metadata statements about R1, R2 as required.
>
> Whilst this is a general concern, we see issues of identification stability, multiple representations, non-unique naming being significant to spatial data and I think we can and should therefore extend the general DWBP with an example using spatial representations and provide a more concrete best practice.
>
> On Sat, 3 Sep 2016 at 00:40 Krzysztof Janowicz <janowicz@ucsb.edu> wrote:
>
> I am no expert on the matter, but several sources tell me that if <A> <owl:sameAs> <B>, then all statements that can be made about A will also be true for B, and vice versa. It seems that the lighthouse example breaks at that point. For example, in Jeremy's example one of the lighthouse representations has a height of 41 m. It is likely that that statement will be false for the representation of the lighthouse as a ruin.
>
> Can we be sure that if we recommend using owl:sameAs to assert that two resources are really the same thing, everyone and everything is aware of the logical consequences?
>
> This is exactly the key point.
> If A owl:sameAs B then A and B signify the same entity and thus every *statement* about A is a statement about B. It works well with Jeremy's example. The fact that the ruin no longer is 41m tall is an example of the need for spatiotemporal scoping of predicates not a shortcoming of owl:sameAs. Also, keep in mind that RDF statements have nothing to do with facts or truth; they are just sets of statements. This is where the power of RDF comes from.
>
> Best,
> Krzysztof
>
> On 09/02/2016 02:20 AM, Frans Knibbe wrote:
>
> On 1 September 2016 at 23:42, Krzysztof Janowicz <janowicz@ucsb.edu> wrote:
>
> Hi,
>
> So as representations, these are not “owl:sameAs”.
>
> Just for clarification. owl:sameAs is only concerned with the mapping of IRIs to (real world) entities and not 'representations' (leaving aside the fact that everything is a representation in some sense). I.e., it is about 'identity'. To give an extreme example, a URI may refer to the Eddystone Lighthouse which may be classified as /Lighthouse/ in some repository. Another URI established 50 years from now can still refer to this particular (4th) lighthouse and classify it as a /Ruin/. Another 50 years into the future, there may be yet another URI that refers to the fact that at some stage there was a ruin here of the 4th lighthouse called Eddystone while there is nothing physical left of it, and, thus, it is neither classified as /Ruin/ nor /Lighthouse/. In fact, we do not even need to introduce the concept of "real world" here as we can also establish a sameAs relation between two URIs that point to Zeus. Please note that this is different from establishing a sameAs link between a particular statue of Zeus in a particular museum and Zeus as the god of thunder. Finally, the purpose of establishing sameAs links is typically data fusion/conflation (no matter whether this is done ad-hoc, manually, or (offline) computationally).
>
> I am no expert on the matter, but several sources tell me that if <A> <owl:sameAs> <B>, then all statements that can be made about A will also be true for B, and vice versa. It seems that the lighthouse example breaks at that point. For example, in Jeremy's example one of the lighthouse representations has a height of 41 m. It is likely that that statement will be false for the representation of the lighthouse as a ruin.
>
> Can we be sure that if we recommend using owl:sameAs to assert that two resources are really the same thing, everyone and everything is aware of the logical consequences?
>
> Regards,
> Frans
>
> Best,
> Jano
>
> On 08/31/2016 06:38 AM, Joshua Lieberman wrote:
>
> Jeremy,
>
> So as representations, these are not “owl:sameAs”. We assume that as feature data, each refers to a real world entity, but we don’t assert that this VerticalObstruction is the same individual as this MaritimeNavigationAid. We just are suspecting or asserting that the same real world thing is being discerned in two different ways. Someone may define a lighthouse class as subclassing both, otherwise a slightly specialized relation (e.g. sdwgeo:sameRealWorldEntityAs) would be useful here.
>
> Josh
>
> On Aug 31, 2016, at 8:41 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> > That still leaves a gap in expressing whether two feature data entities represent the same real world entity. Perhaps we need a "sameFeatureAs" predicate to address this.
>
> @josh - can we clarify my understanding please?
>
> In the BP doc §4 "Spatial things, features and geometry" [1] I use a lighthouse example, so I'll continue with that ...
>
> We have one real lighthouse (Eddystone Lighthouse) that is discerned as a different Type by different communities: "VerticalObstruction" and "MaritimeNavigationAid". In ISO 19100 parlance, these are two distinct feature types.
> The two "Features" might be encoded in GML as follows (forgive any errors in my illustrative example):
>
> <VerticalObstruction gml:id="a">
>   <gml:name>Eddystone</gml:name>
>   <gml:identifier codeSpace="http://example.com/sar/features/vo/">EDY</gml:identifier>
>   <geometry>
>     <gml:Point gml:id="a-p1" srsDimension="2" srsName="EPSG:4326">
>       <gml:pos>50.184 -4.268</gml:pos>
>     </gml:Point>
>   </geometry>
>   <height uom="m">41</height>
> </VerticalObstruction>
>
> <MaritimeNavigationAid gml:id="b">
>   <gml:name>Eddystone Lighthouse</gml:name>
>   <gml:identifier codeSpace="http://example.org/maritime/navaid/">2650253</gml:identifier>
>   <geo>
>     <gml:Point gml:id="b-p1" srsDimension="2" srsName="EPSG:4326">
>       <gml:pos>50.2 -4.3</gml:pos>
>     </gml:Point>
>   </geo>
>   <lightCharacteristic>
>     ...
>   </lightCharacteristic>
> </MaritimeNavigationAid>
>
> So we have two Features (which we collectively have agreed are "spatial things"), with identifiers <http://example.com/sar/features/vo/EDY> and <http://example.org/maritime/navaid/2650253>. Respectively, the XML elements that describe these features are identified as "a" and "b" using the @gml:id attribute.
>
> If we are using "indirect identification" then _both_ <http://example.com/sar/features/vo/EDY> and <http://example.org/maritime/navaid/2650253> are treated as identifiers for the _real_ Eddystone Lighthouse; we simply don't care to differentiate between the real world thing and the information record. In which case, <owl:sameAs> would seem sufficient? The "height" and "lightCharacteristic" properties are both applicable to the real Eddystone Lighthouse. Some judgement would be required to decide which point geometry ("geo" or "geometry" property) is considered "best".
>
> The way I think about it, @gml:id is more like the identifier for a named graph; a container for a set of properties ...
>
> Am I missing something???
>
> Jeremy
>
> [1]: http://w3c.github.io/sdw/bp/#spatial-things-features-and-geometry
>
> On Wed, 31 Aug 2016 at 12:42 Joshua Lieberman <jlieberman@tumblingwalls.com> wrote:
>
> If we are asserting that spatial data on the Web is "always" feature data that represents a real world entity, then yes, we don't have the general Web "is it or isn't it physical" ambiguity and can assume that a feature data identifier also and indirectly identifies the feature. That still leaves a gap in expressing whether two feature data entities represent the same real world entity. Perhaps we need a "sameFeatureAs" predicate to address this.
>
> Josh
>
> Joshua Lieberman, Ph.D.
> Principal, Tumbling Walls Consultancy
> Tel/Direct: +1 617-431-6431
> jlieberman@tumblingwalls.com
>
> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
> Hello,
>
> As stated before, I don't think the httpRange-14 problem exists in our domain of discourse. I think (and hope) that confusion can only occur when the things that are described are digital things, or things that can be transmitted over a computer network, like web pages or mail boxes. It seems to me that spatial things are never that type of thing. Therefore there is no reason to take precautions against possible confusion.
>
> That probably means +1.
>
> Greetings,
> Frans
>
> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> Thanks Rob & Clemens ...
>
> On Wed, 31 Aug 2016 at 08:30, Clemens Portele <portele@interactive-instruments.de> wrote:
>
> +1
>
> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com) wrote:
>
> Hi. It would be good to close this issue out & include our collective recommendation in the BP doc working draft.
>
> PROPOSAL: SDW working group recommends use of "indirect identifiers" for spatial things
>
> ... I'll start the voting.
>
> +1
>
> Jeremy
>
> (BTW, to make sense of the PROPOSAL you'll need to read the email thread)
>
> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
>
> So… do we agree we can recommend indirect identifiers, or do we try to fix the issue with getting the correct identifier as Rob describes?
>
> While waiting for this I’ve updated the issue and the text referring to the issue in BP6.
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Wednesday, 24 August 2016 13:56
> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
> *CC:* SDW WG Public List
> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>
> Hi
>
> Agree this is a real concern - people can't be blamed for doing the obvious, if dumb, thing..
>
> I think we should take note of best practice in the HTML world - which is often to include a citable link to a resource in the rendered view. Or a "share" or something similar. We can also put fairly explicit annotation in machine-readable code - stating that the resource is about the URI - and even notes saying when citing this resource use the URI....
>
> I'd also like to see browsers evolve to offer you the original link or the redirected when cutting and pasting - how hard can it be!
>
> Maybe we can get Ed to ask around Google Chrome team for suggestions on how best to handle this :-)
>
> Rob
>
> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> Yes, I think so ... And we should do so if we are recommending "indirect identification".
>
> Jeremy
>
> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>
> Bill's comments also made me think about some of the classic arguments, such as that a lake doesn't have a last updated date and isn't 435KB big. Which are true, however, that kind of metadata generally comes from the server, i.e.
> the HTTP layer. That's an oversimplification but the point is that it is relatively easy to avoid deliberately creating misleading metadata - metadata about the doc rather than the thing it describes - and it's also generally easy to avoid looking for that metadata.
>
> Is there scope for some BP advice there?
>
> Phil.
>
> On 24/08/2016 08:25, Jeremy Tandy wrote:
> > Thanks Linda. More clear examples where being "correct" (in terms of avoiding uri collisions by using two distinct uris) is making things worse because users take the wrong one!
> >
> > So, as a WG, are we content to recommend this "indirect identification" pattern where thing & info resource identifiers are conflated?
> >
> > Bill has added some good points about how to avoid impacts of uri collision - by using the (dataset) metadata to talk about licenses and creators for the information ...
> >
> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
> >
> >> Experience from the Netherlands: we have the id/doc pattern in our URI strategy, based on the Cool URIs note [8] and the ISA study on persistent identifiers [9].
> >>
> >> That being said, same as Bill I also notice data users getting confused and generally using the /doc/ URI as that is the one they can copy from their browser address bar. This is not only casual confusion but also ends up in published information resources.
> >>
> >> You see this, for example, all over the CB-NL which is a vocabulary for the building sector and contains links to other Dutch standards such as IMGeo, an information model and vocabulary for large scale topography. E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo concepts ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other construction) using their /doc/ URIs.
> >> If you click on Pand (which doesn’t have its own landing page in CB-NL so I can’t include the link) you will see it includes the /doc/ URI as the identifier of Pand.
> >>
> >> This is an example where it occurs in vocabularies, but I also see it happen with identifiers for data instances.
> >>
> >> [8]: https://www.w3.org/TR/cooluris/
> >>
> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
> >>
> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
> >>
> >> Linda
> >>
> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
> >> *Sent:* Tuesday, 23 August 2016 20:57
> >> *To:* Bill Roberts
> >> *CC:* SDW WG Public List
> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
> >>
> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such as this based on practical experience. Jeremy
> >>
> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
> >>
> >> ah Jeremy, you are a brave man to poke the sleeping beast of httpRange-14.
> >>
> >> But I'll get my thoughts in early, then I can tune out of the ensuing mail avalanche :-)
> >>
> >> When publishing Linked Data about places we (at Swirrl) generally do the id/doc fandango, but to be honest I think data users either don't notice, or they get confused by it. In the applications we are working with (and I acknowledge that others may have different applications and different experiences), it wouldn't cause any problems to have a single URI, the 'id' URI if you like. We just don't find a need to say anything about the /doc/ URI. If we were starting again, I'd probably ditch the /doc/ and the 303 and rely on context and a little bit of documentation to make it clear what we mean.
> >>
> >> The place where we find a need to talk about creators and licences and modified dates is in metadata about datasets where a dataset might be a collection of information about a bunch of places - and we treat datasets as an 'information resource'. If someone requests a dataset URI we return a status code of 200 and the dataset metadata as the response. That metadata includes info on where to get all the contents of the dataset if you want that.
> >>
> >> By the way, though it's sensible and consistent, I find that the implied and parallel property stuff makes it more rather than less complicated.
> >>
> >> Bill
> >>
> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
> >>
> >> All-
> >>
> >> Linda has done a great job of consolidating the best practices around use of identifiers. We have just one [1] now.
> >>
> >> Reading through just now, it occurred to me that there's still an open issue about identifier assignment ...
> >>
> >> W3C's Architecture of the World Wide Web constraint "URIs identify a single resource" [2] asserts "Assign distinct URIs to distinct resources" in order to avoid URI collisions [2a] which "often imposes a cost in communication due to the effort required to resolve ambiguities". Discussions from earlier years in UK Gov Linked Data working group (and elsewhere) concluded that the "real world thing" and "information resource that describes the real world thing" are separate resources. I think this is based on a (purist?) view when working with RDF of needing to be totally clear on "what's the subject" of each triple ... the thing or the document.
> >> This manifests as URIs with `id` or `doc` included somewhere to distinguish between the resources and some RDF triples to clarify that the doc resource is talking about the thing resource etc..
> >>
> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear trap)
> >>
> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this practice in §5.3 "Publishing data" [4]:
> >>
> >> ```
> >> Publishers can help enable more accurate merging of data from different sites if they support URLs for each entity <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may wish to describe, separate from the landing pages <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
> >> ```
> >>
> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification" [5] notes that:
> >>
> >> ```
> >> To say that the URI "mailto:nadia@example.com" identifies both an Internet mailbox and Nadia, the person, introduces a URI collision. However, we can use the URI to indirectly identify Nadia. Identifiers are commonly used in this way.
> >> ```
> >>
> >> This is consistent with what I recall TimBL saying at TPAC-2015 in regards to Vcard; come the finish, no one really cares to distinguish between the thing and its associated information resource.
> >>
> >> ... And in most cases, one can use context to determine whether a statement concerns the thing or the information resource. In those cases where you can't, "URLs in Data Primer" suggests some mechanisms to mitigate such confusion [6][7].
> >>
> >> I think that in our SDW WG discussion we have concluded that we _are_ content to use "indirect identification" - e.g.
> >> that we use URIs that conflate the thing and document resource.
> >>
> >> Please can we confirm this? Assuming that indirect identification is
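
A minimal sketch of the R1/R2 versus U1/U2 argument made earlier in the thread, written in plain Python with no RDF library. The helpers same_as_closure and functional_conflicts are hypothetical names invented for this illustration, and ns:dateEdited and ns:hasRepresentation are the placeholder properties used in the thread. The sketch only substitutes owl:sameAs-equivalent names into existing triples, which is a small part of what a real OWL reasoner would do.

```python
SAME_AS = "owl:sameAs"
FUNCTIONAL = {"ns:dateEdited"}  # assumed functional, as stated in the thread


def same_as_closure(triples):
    """Return the triples obtained by swapping in owl:sameAs-equivalent names."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the two sides of every owl:sameAs statement.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Collect each equivalence class under its representative.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)

    def equivalents(node):
        return groups[find(node)] if node in parent else {node}

    return {
        (s2, p, o2)
        for s, p, o in triples
        for s2 in equivalents(s)
        for o2 in equivalents(o)
    }


def functional_conflicts(triples, functional_props):
    """Map (subject, property) to its values when a functional property has several."""
    values = {}
    for s, p, o in triples:
        if p in functional_props:
            values.setdefault((s, p), set()).add(o)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


# Direct case: owl:sameAs asserted between the two representations.
direct = {
    ("R1", "ns:dateEdited", "12/1/2001"),
    ("R2", "ns:dateEdited", "6/6/2006"),
    ("R1", SAME_AS, "R2"),
}

# Indirect case: owl:sameAs asserted between the thing URIs U1 and U2.
indirect = {
    ("U1", "ns:hasRepresentation", "R1"),
    ("U2", "ns:hasRepresentation", "R2"),
    ("R1", "ns:dateEdited", "12/1/2001"),
    ("R2", "ns:dateEdited", "6/6/2006"),
    ("U1", SAME_AS, "U2"),
}

direct_conflicts = functional_conflicts(same_as_closure(direct), FUNCTIONAL)
indirect_closure = same_as_closure(indirect)
indirect_conflicts = functional_conflicts(indirect_closure, FUNCTIONAL)

print(sorted(direct_conflicts))
print(indirect_conflicts)
```

In the direct case both R1 and R2 end up with two ns:dateEdited values, which is the ambiguity described in the thread. In the indirect case U1 gains both representations and each representation keeps its own single edit date.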
Received on Monday, 5 September 2016 01:00:32 UTC