- From: Krzysztof Janowicz <janowicz@ucsb.edu>
- Date: Sun, 4 Sep 2016 20:31:05 -0700
- To: Simon.Cox@csiro.au, rob@metalinkage.com.au, frans.knibbe@geodan.nl
- Cc: jlieberman@tumblingwalls.com, jeremy.tandy@gmail.com, public-sdw-wg@w3.org
- Message-ID: <b3f644e1-c28f-bab5-1ce0-9986cd81f525@ucsb.edu>
Hi,

> *Rob* wrote:
>
> > consider two resources with URIs R1 and R2
> >
> > R1 ns:dateEdited 12/1/2001
> >
> > R2 ns:dateEdited 6/6/2006
> >
> > R1 owl:sameAs R2 then leads to ambiguity regarding the value of the
> > functional property ns:dateEdited

Note that this is not an owl:sameAs issue. I think it is very important
to distinguish between co-reference resolution (using owl:sameAs,
skos:closeMatch, ...) and data conflation (data fusion). owl:sameAs
handles co-reference resolution. Data fusion is still an open research
issue (despite tons of work in the DB community). The fact that
ns:dateEdited may be defined as a functional property in some ontology
will also have no effect on the RDF triples as such.

Best,
Krzysztof

On 09/04/2016 05:04 PM, Simon.Cox@csiro.au wrote:
>
> *Rob* wrote:
>
> > consider two resources with URIs R1 and R2
> >
> > R1 ns:dateEdited 12/1/2001
> >
> > R2 ns:dateEdited 6/6/2006
> >
> > R1 owl:sameAs R2 then leads to ambiguity regarding the value of the
> > functional property ns:dateEdited
>
> Where R1 and R2 are representations or descriptions of a (real-world)
> thing, possibly a graph of RDF triples.
>
> However, in a separate part of the thread, *Jeremy* wrote:
>
> > few people will care to name the representation / graph at all.
>
> In other words, the URIs R1 and R2 are usually not treated with much
> respect. So it is unlikely that we would be in the business of making
> sameAs statements about these.
>
> Simon
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Saturday, 3 September 2016 8:05 AM
> *To:* janowicz@ucsb.edu; Frans Knibbe <frans.knibbe@geodan.nl>
> *Cc:* Joshua Lieberman <jlieberman@tumblingwalls.com>; Jeremy Tandy
> <jeremy.tandy@gmail.com>; SDW WG Public List <public-sdw-wg@w3.org>
> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
> things"
>
> A few things - this is a rich discussion and we have identified
> several parts (which is probably why the original issue was hard to
> pin down)
>
> I'm glad we have coaxed one elephant out - the sameAs semantics
> issue. For me this is the litmus test whether a URL can be used as a
> URI for a thing or not.
>
> (and this is where one of the issues about SIRF Jeremy raised comes in
> - but I don't think we need to worry about specific approach, rather
> the criteria for whether a URI is a good one for identification
> purposes.) I think we simply make a strong statement that you don't
> use a URL as a URI if it is not stable and it does not make sense to
> use owl:sameAs.
>
> This pretty much rules out any direct URL to a single representation:
>
> consider two resources with URIs R1 and R2
>
> R1 ns:dateEdited 12/1/2001
>
> R2 ns:dateEdited 6/6/2006
>
> R1 owl:sameAs R2 then leads to ambiguity regarding the value of the
> functional property ns:dateEdited
>
> however
>
> U1 --303--> R1
>
> U2 --303--> R2
>
> can (and should be) represented as
>
> U1 ns:hasRepresentation R1
>
> U2 ns:hasRepresentation R2
>
> U1 owl:sameAs U2
>
> entails
>
> U1 ns:hasRepresentation R1, R2
>
> which doesn't make any stupid statements about the properties. It also
> allows us to make useful metadata statements about R1, R2 as required.
>
> Whilst this is a general concern, we see issues of identification
> stability, multiple representations, non-unique naming being
> significant to spatial data and I think we can and should therefore
> extend the general DWBP with an example using spatial representations
> and provide a more concrete best practice.
>
> On Sat, 3 Sep 2016 at 00:40 Krzysztof Janowicz <janowicz@ucsb.edu>
> wrote:
>
> > I am no expert on the matter, but several sources tell me that
> > if <A> <owl:sameAs> <B>, then all statements that can be made
> > about A will also be true for B, and vice versa. It seems that
> > the lighthouse example breaks at that point. For example, in
> > Jeremy's example one of the lighthouse representations has a
> > height of 41 m. It is likely that that statement will be false
> > for the representation of the lighthouse as a ruin.
> >
> > Can we be sure that if we recommend using owl:sameAs to assert
> > that two resources are really the same thing, everyone and
> > everything is aware of the logical consequences?
>
> This is exactly the key point. If A owl:sameAs B then A and B
> signify the same entity and thus every *statement* about A is a
> statement about B. It works well with Jeremy's example. The fact
> that the ruin no longer is 41m tall is an example of the need for
> spatiotemporal scoping of predicates, not a shortcoming of
> owl:sameAs. Also, keep in mind that RDF statements have nothing to
> do with facts or truth; they are just sets of statements. This is
> where the power of RDF comes from.
>
> Best,
> Krzysztof
>
> On 09/02/2016 02:20 AM, Frans Knibbe wrote:
>
> On 1 September 2016 at 23:42, Krzysztof Janowicz <janowicz@ucsb.edu>
> wrote:
>
> > Hi,
> >
> > > So as representations, these are not “owl:sameAs”.
> >
> > Just for clarification.
> > owl:sameAs is only concerned with the mapping of IRIs to (real
> > world) entities and not 'representations' (leaving aside the fact
> > that everything is a representation in some sense). I.e., it is
> > about 'identity'. To give an extreme example, a URI may refer to
> > the Eddystone Lighthouse which may be classified as /Lighthouse/ in
> > some repository. Another URI established 50 years from now can
> > still refer to this particular (4th) lighthouse and classify it as
> > a /Ruin/. Another 50 years into the future, there may be yet
> > another URI that refers to the fact that at some stage there was a
> > ruin here of the 4th lighthouse called Eddystone while there is
> > nothing physical left of it, and, thus, it is neither classified as
> > /Ruin/ nor /Lighthouse/. In fact, we do not even need to introduce
> > the concept of "real world" here as we can also establish a sameAs
> > relation between two URIs that point to Zeus. Please note that this
> > is different from establishing a sameAs link between a particular
> > statue of Zeus in a particular museum and Zeus as the god of
> > thunder. Finally, the purpose of establishing sameAs links is
> > typically data fusion/conflation (no matter whether this is done
> > ad-hoc, manually, or (offline) computationally).
>
> I am no expert on the matter, but several sources tell me that
> if <A> <owl:sameAs> <B>, then all statements that can be made
> about A will also be true for B, and vice versa. It seems that
> the lighthouse example breaks at that point. For example, in
> Jeremy's example one of the lighthouse representations has a
> height of 41 m. It is likely that that statement will be false
> for the representation of the lighthouse as a ruin.
>
> Can we be sure that if we recommend using owl:sameAs to assert
> that two resources are really the same thing, everyone and
> everything is aware of the logical consequences?
>
> Regards,
>
> Frans
>
> > Best,
> > Jano
>
> On 08/31/2016 06:38 AM, Joshua Lieberman wrote:
>
> Jeremy,
>
> So as representations, these are not “owl:sameAs”. We assume that as
> feature data, each refers to a real world entity, but we don’t assert
> that this VerticalObstruction is the same individual as this
> MaritimeNavigationAid. We just are suspecting or asserting that the
> same real world thing is being discerned in two different ways.
> Someone may define a lighthouse class as subclassing both, otherwise a
> slightly specialized relation (e.g. sdwgeo:sameRealWorldEntityAs)
> would be useful here.
>
> Josh
>
> On Aug 31, 2016, at 8:41 AM, Jeremy Tandy <jeremy.tandy@gmail.com>
> wrote:
>
> > That still leaves a gap in expressing whether two feature data
> > entities represent the same real world entity. Perhaps we need a
> > "sameFeatureAs" predicate to address this.
>
> @josh - can we clarify my understanding please?
>
> In the BP doc §4 "Spatial things, features and geometry" [1] I use a
> lighthouse example, so I'll continue with that ...
>
> We have one real lighthouse (Eddystone Lighthouse) that is discerned
> as a different Type by different communities: "VerticalObstruction"
> and "MaritimeNavigationAid". In ISO 19100 parlance, these are two
> distinct feature types.
> The two "Features" might be encoded in GML as follows (forgive any
> errors in my illustrative example):
>
>     <VerticalObstruction gml:id="a">
>       <gml:name>Eddystone</gml:name>
>       <gml:identifier codeSpace="http://example.com/sar/features/vo/">EDY</gml:identifier>
>       <geometry>
>         <gml:Point gml:id="a-p1" srsDimension="2" srsName="EPSG:4326">
>           <gml:pos>50.184 -4.268</gml:pos>
>         </gml:Point>
>       </geometry>
>       <height uom="m">41</height>
>     </VerticalObstruction>
>
>     <MaritimeNavigationAid gml:id="b">
>       <gml:name>Eddystone Lighthouse</gml:name>
>       <gml:identifier codeSpace="http://example.org/maritime/navaid/">2650253</gml:identifier>
>       <geo>
>         <gml:Point gml:id="b-p1" srsDimension="2" srsName="EPSG:4326">
>           <gml:pos>50.2 -4.3</gml:pos>
>         </gml:Point>
>       </geo>
>       <lightCharacteristic>
>         ...
>       </lightCharacteristic>
>     </MaritimeNavigationAid>
>
> So we have two Features (which we collectively have agreed are
> "spatial things"), with identifiers
> <http://example.com/sar/features/vo/EDY> and
> <http://example.org/maritime/navaid/2650253>. Respectively, the XML
> elements that describe these features are identified as "a" and "b"
> using the @gml:id attribute.
>
> If we are using "indirect identification" then _both_
> <http://example.com/sar/features/vo/EDY> and
> <http://example.org/maritime/navaid/2650253> are treated as
> identifiers for the _real_ Eddystone Lighthouse; we simply don't care
> to differentiate between the real world thing and the information
> record. In which case, <owl:sameAs> would seem sufficient? The
> "height" and "lightCharacteristic" properties are both applicable to
> the real Eddystone Lighthouse. Some judgement would be required to
> decide which point geometry ("geo" or "geometry" property) is
> considered "best".
>
> The way I think about it, @gml:id is more like the identifier for a
> named graph; a container for a set of properties ...
>
> Am I missing something???
>
> Jeremy
>
> [1]: http://w3c.github.io/sdw/bp/#spatial-things-features-and-geometry
>
> On Wed, 31 Aug 2016 at 12:42 Joshua Lieberman
> <jlieberman@tumblingwalls.com> wrote:
>
> If we are asserting that spatial data on the Web is "always" feature
> data that represents a real world entity, then yes, we don't have the
> general Web "is it or isn't it physical" ambiguity and can assume that
> a feature data identifier also and indirectly identifies the feature.
> That still leaves a gap in expressing whether two feature data
> entities represent the same real world entity. Perhaps we need a
> "sameFeatureAs" predicate to address this.
>
> Josh
>
> Joshua Lieberman, Ph.D.
> Principal, Tumbling Walls Consultancy
> Tel/Direct: +1 617-431-6431
> jlieberman@tumblingwalls.com
>
> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl>
> wrote:
>
> Hello,
>
> As stated before, I don't think the httpRange-14 problem exists in our
> domain of discourse. I think (and hope) that confusion can only occur
> when the things that are described are digital things, or things that
> can be transmitted over a computer network, like web pages or mail
> boxes. It seems to me that spatial things are never that type of
> thing. Therefore there is no reason to take precautions against
> possible confusion.
>
> That probably means +1.
>
> Greetings,
>
> Frans
>
> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com>
> wrote:
>
> Thanks Rob & Clemens ...
>
> On Wed, 31 Aug 2016 at 08:30, Clemens Portele
> <portele@interactive-instruments.de> wrote:
>
> +1
>
> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com)
> wrote:
>
> Hi. It would be good to close this issue out & include our collective
> recommendation in the BP doc working draft.
>
> PROPOSAL: SDW working group recommends use of "indirect identifiers"
> for spatial things
>
> ... I'll start the voting.
>
> +1
>
> Jeremy
>
> (BTW, to make sense of the PROPOSAL you'll need to read the email
> thread)
>
> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink
> <l.vandenbrink@geonovum.nl> wrote:
>
> So… do we agree we can recommend indirect identifiers, or do we try to
> fix the issue with getting the correct identifier as Rob describes?
>
> While waiting for this I’ve updated the issue and the text referring
> to the issue in BP6.
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Wednesday 24 August 2016 13:56
> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
> *Cc:* SDW WG Public List
> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
> things"
>
> Hi
>
> Agree this is a real concern - people can't be blamed for doing the
> obvious, if dumb, thing.
>
> I think we should take note of best practice in the HTML world - which
> is often to include a citable link to a resource in the rendered view.
> Or a "share" or something similar. We can also put fairly explicit
> annotation in machine-readable code - stating that the resource is
> about the URI - and even notes saying when citing this resource use
> the URI....
>
> I'd also like to see browsers evolve to offer you the original link or
> the redirected when cutting and pasting - how hard can it be!
>
> Maybe we can get Ed to ask around Google Chrome team for suggestions
> on how best to handle this :-)
>
> Rob
>
> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com>
> wrote:
>
> Yes, I think so ... And we should do so if we are recommending
> "indirect identification".
>
> Jeremy
>
> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>
> Bill's comments also made me think about some of the classic
> arguments, such as that a lake doesn't have a last updated date and
> isn't 435KB big. Which are true, however, that kind of metadata
> generally comes from the server, i.e. the HTTP layer. That's an
> oversimplification but the point is that it is relatively easy to
> avoid deliberately creating misleading metadata - metadata about the
> doc rather than the thing it describes - and it's also generally easy
> to avoid looking for that metadata.
>
> Is there scope for some BP advice there?
>
> Phil.
>
> On 24/08/2016 08:25, Jeremy Tandy wrote:
> > Thanks Linda. More clear examples where being "correct" (in terms of
> > avoiding uri collisions by using two distinct uris) is making things
> > worse because users take the wrong one!
> >
> > So, as a WG, are we content to recommend this "indirect
> > identification" pattern where thing & info resource identifiers are
> > conflated?
> >
> > Bill has added some good points about how to avoid impacts of uri
> > collision - by using the (dataset) metadata to talk about licenses
> > and creators for the information ...
> >
> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink
> > <l.vandenbrink@geonovum.nl> wrote:
> >
> >> Experience from the Netherlands: we have the id/doc pattern in our
> >> URI strategy, based on the Cool URIs note [8] and the ISA study on
> >> persistent identifiers [9].
> >>
> >> That being said, same as Bill I also notice data users getting
> >> confused and generally using the /doc/ URI as that is the one they
> >> can copy from their browser address bar. This is not only casual
> >> confusion but also ends up in published information resources.
> >>
> >> You see this, for example, all over the CB-NL which is a vocabulary
> >> for the building sector and contains links to other Dutch standards
> >> such as IMGeo, an information model and vocabulary for large scale
> >> topography. E.g. the CB-NL concept of ‘Gebouw’ (Building) [10]
> >> links to two IMGeo concepts ‘Pand’ (building part) and ‘Overig
> >> Bouwwerk’ (other construction) using their /doc/ URIs. If you click
> >> on Pand (which doesn’t have its own landing page in CB-NL so I
> >> can’t include the link) you will see it includes the /doc/ URI as
> >> the identifier of Pand.
> >>
> >> This is an example where it occurs in vocabularies, but I also see
> >> it happen with identifiers for data instances.
> >>
> >> [8]: https://www.w3.org/TR/cooluris/
> >>
> >> [9]:
> >> https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
> >>
> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
> >>
> >> Linda
> >>
> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
> >> *Sent:* Tuesday 23 August 2016 20:57
> >> *To:* Bill Roberts
> >> *Cc:* SDW WG Public List
> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for
> >> spatial things"
> >>
> >> Thanks Bill. Sounds very coherent ... I hoped for some responses
> >> such as this based on practical experience. Jeremy
> >>
> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
> >>
> >> ah Jeremy, you are a brave man to poke the sleeping beast of
> >> httpRange-14.
> >>
> >> But I'll get my thoughts in early, then I can tune out of the
> >> ensuing mail avalanche :-)
> >>
> >> When publishing Linked Data about places we (at Swirrl) generally
> >> do the id/doc fandango, but to be honest I think data users either
> >> don't notice, or they get confused by it. In the applications we
> >> are working with (and I acknowledge that others may have different
> >> applications and different experiences), it wouldn't cause any
> >> problems to have a single URI, the 'id' URI if you like. We just
> >> don't find a need to say anything about the /doc/ URI. If we were
> >> starting again, I'd probably ditch the /doc/ and the 303 and rely
> >> on context and a little bit of documentation to make it clear what
> >> we mean.
> >>
> >> The place where we find a need to talk about creators and licences
> >> and modified dates is in metadata about datasets where a dataset
> >> might be a collection of information about a bunch of places - and
> >> we treat datasets as an 'information resource'. If someone requests
> >> a dataset URI we return a status code of 200 and the dataset
> >> metadata as the response. That metadata includes info on where to
> >> get all the contents of the dataset if you want that.
> >>
> >> By the way, though it's sensible and consistent, I find that the
> >> implied and parallel property stuff makes it more rather than less
> >> complicated.
> >>
> >> Bill
> >>
> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com>
> >> wrote:
> >>
> >> All-
> >>
> >> Linda has done a great job of consolidating the best practices are
> >> use of identifiers. We have just one [1] now.
> >>
> >> Reading through just now, it occurred to me that there's still an
> >> open issue about identifier assignment ...
> >>
> >> W3C's Architecture of the World Wide Web constraint "URIs identify
> >> a single resource" [2] asserts "Assign distinct URIs to distinct
> >> resources" in order to avoid URI collisions [2a] which "often
> >> imposes a cost in communication due to the effort required to
> >> resolve ambiguities". Discussions from earlier years in UK Gov
> >> Linked Data working group (and elsewhere) concluded that the "real
> >> world thing" and "information resource that describes the real
> >> world thing" are separate resources. I think this is based on a
> >> (purist?)
> >> view when working with RDF of needing to be totally clear on
> >> "what's the subject" of each triple ... the thing or the document.
> >> This manifests as URIs with `id` or `doc` included somewhere to
> >> distinguish between the resources and some RDF triples to clarify
> >> that the doc resource is talking about the thing resource etc..
> >>
> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that
> >> bear trap)
> >>
> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
> >> practice in §5.3 "Publishing data" [4]:
> >>
> >> ```
> >> Publishers can help enable more accurate merging of data from
> >> different sites if they support URLs for each entity
> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other
> >> sites may wish to describe, separate from the landing pages
> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
> >> ```
> >>
> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect
> >> identification" [5] notes that:
> >>
> >> ```
> >> To say that the URI "mailto:nadia@example.com" identifies both an
> >> Internet mailbox and Nadia, the person, introduces a URI collision.
> >> However, we can use the URI to indirectly identify Nadia.
> >> Identifiers are commonly used in this way.
> >> ```
> >>
> >> This is consistent with what I recall TimBL saying at TPAC-2015 in
> >> regards to Vcard; come the finish, no one really cares to
> >> distinguish between the thing and its associated information
> >> resource.
> >>
> >> ... And in most cases, one can use context to determine whether a
> >> statement concerns the thing or the information resource.
> >> In those cases where you can't, "URLs in Data Primer" suggests some
> >> mechanisms to mitigate such confusion [6][7].
> >>
> >> I think that in our SDW WG discussion we have concluded that we
> >> _are_ content to use "indirect identification" - e.g. that we use
> >> URIs that conflate the thing and document resource.
> >>
> >> Please can we confirm this? Assuming that indirect identification is
> >>

-- 
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net
Received on Monday, 5 September 2016 03:31:39 UTC
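
The distinction drawn at the top of this message between co-reference resolution and data fusion can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the triples mirror Rob's R1/R2 example, dates are kept as plain strings, and the helper names (`coreference_map`, `statements_about`, `fuse_by_latest_year`) are hypothetical and not part of any RDF or OWL library. Resolving owl:sameAs only says that R1 and R2 name the same thing, so both ns:dateEdited statements survive. Choosing one value is a separate fusion step with its own, here arbitrary, policy.

```python
from collections import defaultdict

# Hypothetical triples mirroring the R1/R2 example from the thread.
triples = {
    ("R1", "ns:dateEdited", "12/1/2001"),
    ("R2", "ns:dateEdited", "6/6/2006"),
    ("R1", "owl:sameAs", "R2"),
}


def coreference_map(triples):
    """Return a function mapping each name to a canonical co-referent name."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "owl:sameAs":
            a, b = sorted((find(s), find(o)))
            parent[b] = a  # the lexically smaller name becomes canonical
    return find


def statements_about(triples):
    """Co-reference resolution: group statements by canonical subject.

    No statement is dropped or altered; a property declared functional
    in some ontology does not change the triples themselves.
    """
    find = coreference_map(triples)
    grouped = defaultdict(set)
    for s, p, o in triples:
        if p != "owl:sameAs":
            grouped[(find(s), p)].add(o)
    return dict(grouped)


def fuse_by_latest_year(values):
    """Data fusion: one arbitrary policy (assumed here) that keeps the
    value with the latest year. Other policies are equally possible."""
    return max(values, key=lambda v: int(v.split("/")[-1]))


merged = statements_about(triples)
print(merged)
print(fuse_by_latest_year(merged[("R1", "ns:dateEdited")]))
```

After co-reference resolution the canonical subject carries both dates; only the explicit fusion function reduces them to one, and that choice belongs to the consumer of the data, not to owl:sameAs.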