- From: Krzysztof Janowicz <janowicz@ucsb.edu>
- Date: Thu, 1 Sep 2016 14:42:46 -0700
- To: Joshua Lieberman <jlieberman@tumblingwalls.com>, Jeremy Tandy <jeremy.tandy@gmail.com>
- Cc: Frans Knibbe <frans.knibbe@geodan.nl>, SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <1190e136-6e0e-cb97-3c13-93b3e17411cc@ucsb.edu>
Hi,

> So as representations, these are not “owl:sameAs”.

Just for clarification: owl:sameAs is only concerned with the mapping of IRIs to (real world) entities and not 'representations' (leaving aside the fact that everything is a representation in some sense). I.e., it is about 'identity'.

To give an extreme example, a URI may refer to the Eddystone Lighthouse, which may be classified as /Lighthouse/ in some repository. Another URI established 50 years from now can still refer to this particular (4th) lighthouse and classify it as a /Ruin/. Another 50 years into the future, there may be yet another URI that refers to the fact that at some stage there was a ruin here of the 4th lighthouse called Eddystone while there is nothing physical left of it, and, thus, it is neither classified as /Ruin/ nor /Lighthouse/.

In fact, we do not even need to introduce the concept of "real world" here, as we can also establish a sameAs relation between two URIs that point to Zeus. Please note that this is different from establishing a sameAs link between a particular statue of Zeus in a particular museum and Zeus as the god of thunder.

Finally, the purpose of establishing sameAs links is typically data fusion/conflation (no matter whether this is done ad-hoc, manually, or (offline) computationally).

Best,
Jano

On 08/31/2016 06:38 AM, Joshua Lieberman wrote:
> Jeremy,
>
> So as representations, these are not “owl:sameAs”. We assume that as
> feature data, each refers to a real world entity, but we don’t assert
> that this VerticalObstruction is the same individual as this
> MaritimeNavigationAid. We just are suspecting or asserting that the
> same real world thing is being discerned in two different ways.
> Someone may define a lighthouse class as subclassing both, otherwise a
> slightly specialized relation (e.g. sdwgeo:sameRealWorldEntityAs)
> would be useful here.
>
> Josh
>
>> On Aug 31, 2016, at 8:41 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>> > That still leaves a gap in expressing whether two feature data
>> entities represent the same real world entity. Perhaps we need a
>> "sameFeatureAs" predicate to address this.
>>
>> @josh - can we clarify my understanding please?
>>
>> In the BP doc §4 "Spatial things, features and geometry" [1] I use a
>> lighthouse example, so I'll continue with that ...
>>
>> We have one real lighthouse (Eddystone Lighthouse) that is discerned
>> as a different Type by different communities: "VerticalObstruction"
>> and "MaritimeNavigationAid". In ISO 19100 parlance, these are two
>> distinct feature types. The two "Features" might be encoded in GML as
>> follows (forgive any errors in my illustrative example):
>>
>> <VerticalObstruction gml:id="a">
>>   <gml:name>Eddystone</gml:name>
>>   <gml:identifier codeSpace="http://example.com/sar/features/vo/">EDY</gml:identifier>
>>   <geometry>
>>     <gml:Point gml:id="a-p1" srsDimension="2" srsName="EPSG:4326">
>>       <gml:pos>50.184 -4.268</gml:pos>
>>     </gml:Point>
>>   </geometry>
>>   <height uom="m">41</height>
>> </VerticalObstruction>
>>
>> <MaritimeNavigationAid gml:id="b">
>>   <gml:name>Eddystone Lighthouse</gml:name>
>>   <gml:identifier codeSpace="http://example.org/maritime/navaid/">2650253</gml:identifier>
>>   <geo>
>>     <gml:Point gml:id="b-p1" srsDimension="2" srsName="EPSG:4326">
>>       <gml:pos>50.2 -4.3</gml:pos>
>>     </gml:Point>
>>   </geo>
>>   <lightCharacteristic>
>>     ...
>>   </lightCharacteristic>
>> </MaritimeNavigationAid>
>>
>> So we have two Features (which we collectively have agreed are
>> "spatial things"), with identifiers
>> <http://example.com/sar/features/vo/EDY> and
>> <http://example.org/maritime/navaid/2650253>. Respectively, the XML
>> elements that describe these features are identified as "a" and "b"
>> using the @gml:id attribute.
>>
>> If we are using "indirect identification" then _both_
>> <http://example.com/sar/features/vo/EDY> and
>> <http://example.org/maritime/navaid/2650253> are treated as
>> identifiers for the _real_ Eddystone Lighthouse; we simply don't care
>> to differentiate between the real world thing and the information
>> record. In which case, <owl:sameAs> would seem sufficient? The
>> "height" and "lightCharacteristic" properties are both applicable to
>> the real Eddystone Lighthouse. Some judgement would be required to
>> decide which point geometry ("geo" or "geometry" property) is
>> considered "best".
>>
>> The way I think about it, @gml:id is more like the identifier for a
>> named graph; a container for a set of properties ...
>>
>> Am I missing something???
>>
>> Jeremy
>>
>>
>> [1]: http://w3c.github.io/sdw/bp/#spatial-things-features-and-geometry
>>
>> On Wed, 31 Aug 2016 at 12:42 Joshua Lieberman
>> <jlieberman@tumblingwalls.com> wrote:
>>
>> If we are asserting that spatial data on the Web is "always"
>> feature data that represents a real world entity, then yes, we
>> don't have the general Web "is it or isn't it physical" ambiguity
>> and can assume that a feature data identifier also and indirectly
>> identifies the feature. That still leaves a gap in expressing
>> whether two feature data entities represent the same real world
>> entity. Perhaps we need a "sameFeatureAs" predicate to address this.
>>
>> Josh
>>
>> Joshua Lieberman, Ph.D.
>> Principal, Tumbling Walls Consultancy
>> Tel/Direct: +1 617-431-6431
>> jlieberman@tumblingwalls.com
>>
>> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>>
>>> Hello,
>>>
>>> As stated before, I don't think the httpRange-14 problem exists
>>> in our domain of discourse. I think (and hope) that confusion
>>> can only occur when the things that are described are digital
>>> things, or things that can be transmitted over a computer
>>> network, like web pages or mail boxes. It seems to me that
>>> spatial things are never that type of thing. Therefore there is
>>> no reason to take precautions against possible confusion.
>>>
>>> That probably means +1.
>>>
>>> Greetings,
>>> Frans
>>>
>>>
>>> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>
>>> Thanks Rob & Clemens ...
>>>
>>> On Wed, 31 Aug 2016 at 08:30, Clemens Portele
>>> <portele@interactive-instruments.de> wrote:
>>>
>>> +1
>>>
>>>
>>> On 30 August 2016 at 10:10:26, Jeremy Tandy
>>> (jeremy.tandy@gmail.com) wrote:
>>>
>>>> Hi. It would be good to close this issue out & include
>>>> our collective recommendation in the BP doc working draft.
>>>>
>>>> PROPOSAL: SDW working group recommends use of "indirect
>>>> identifiers" for spatial things
>>>>
>>>> ... I'll start the voting.
>>>>
>>>> +1
>>>>
>>>> Jeremy
>>>>
>>>> (BTW, to make sense of the PROPOSAL you'll need to read
>>>> the email thread)
>>>>
>>>> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink
>>>> <l.vandenbrink@geonovum.nl> wrote:
>>>>
>>>> So… do we agree we can recommend indirect
>>>> identifiers, or do we try to fix the issue with
>>>> getting the correct identifier as Rob describes?
>>>>
>>>> While waiting for this I’ve updated the issue and
>>>> the text referring to the issue in BP6.
>>>>
>>>>
>>>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>>>> *Sent:* Wednesday 24 August 2016 13:56
>>>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>>>> *CC:* SDW WG Public List
>>>> *Subject:* Re: Clarification required: BP6 "use
>>>> HTTP URIs for spatial things"
>>>>
>>>>
>>>> Hi
>>>>
>>>> Agree this is a real concern - people can't be
>>>> blamed for doing the obvious, if dumb, thing..
>>>>
>>>> I think we should take note of best practice in the
>>>> HTML world - which is often to include a citable
>>>> link to a resource in the rendered view. Or a
>>>> "share" or something similar. We can also put
>>>> fairly explicit annotation in machine-readable code
>>>> - stating that the resource is about the URI - and
>>>> even notes saying when citing this resource use the
>>>> URI....
>>>>
>>>> I'd also like to see browsers evolve to offer you
>>>> the original link or the redirected when cutting
>>>> and pasting - how hard can it be!
>>>>
>>>> Maybe we can get Ed to ask around Google Chrome
>>>> team for suggestions on how best to handle this :-)
>>>>
>>>> Rob
>>>>
>>>>
>>>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy
>>>> <jeremy.tandy@gmail.com> wrote:
>>>>
>>>> Yes, I think so ... And we should do so if we
>>>> are recommending "indirect identification".
>>>>
>>>> Jeremy
>>>>
>>>> On Wed, 24 Aug 2016 at 09:24, Phil Archer
>>>> <phila@w3.org> wrote:
>>>>
>>>> Bill's comments also made me think about some of the classic arguments,
>>>> such as that a lake doesn't have a last updated date and isn't 435KB
>>>> big. Which are true, however, that kind of metadata generally comes from
>>>> the server, i.e. the HTTP layer. That's an over simplification but the
>>>> point is that it is relatively easy to avoid deliberately creating
>>>> misleading metadata - metadata about the doc rather than the thing it
>>>> describes - and it's also generally easy to avoid looking for that metadata.
>>>>
>>>> Is there scope for some BP advice there?
>>>>
>>>> Phil.
>>>>
>>>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>>>> > Thanks Linda. More clear examples where being "correct" (in terms of
>>>> > avoiding uri collisions by using two distinct uris) is making things worse
>>>> > because users take the wrong one!
>>>> >
>>>> > So, as a WG, are we content to recommend this "indirect identification"
>>>> > pattern where thing & info resource identifiers are conflated?
>>>> >
>>>> > Bill has added some good points about how to avoid impacts of uri
>>>> > collision- by using the (dataset) metadata to talk about licenses and
>>>> > creators for the information ...
>>>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink <l.vandenbrink@geonovum.nl>
>>>> > wrote:
>>>> >
>>>> >> Experience from the Netherlands: we have the id/doc pattern in our URI
>>>> >> strategy, based on the Cool URIs note [8] and the ISA study on persistent
>>>> >> identifiers [9].
>>>> >>
>>>> >> That being said, same as Bill I also notice data users getting confused
>>>> >> and generally using the /doc/ URI as that is the one they can copy from
>>>> >> their browser address bar. This is not only casual confusion but also ends
>>>> >> up in published information resources.
>>>> >>
>>>> >> You see this, for example, all over the CB-NL which is a vocabulary for
>>>> >> the building sector and contains links to other Dutch standards such as
>>>> >> IMGeo, an information model and vocabulary for large scale topography. E.g.
>>>> >> the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo concepts
>>>> >> ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other construction) using
>>>> >> their /doc/ URIs. If you click on Pand (which doesn’t have its own landing
>>>> >> page in CB-NL so I can’t include the link) you will see it includes the
>>>> >> /doc/ URI as the identifier of Pand.
>>>> >>
>>>> >> This is an example where it occurs in vocabularies, but I also see it
>>>> >> happen with identifiers for data instances.
>>>> >>
>>>> >> [8]: https://www.w3.org/TR/cooluris/
>>>> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>>> >>
>>>> >> Linda
>>>> >>
>>>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>>>> >> *Sent:* Tuesday 23 August 2016 20:57
>>>> >> *To:* Bill Roberts
>>>> >> *CC:* SDW WG Public List
>>>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial
>>>> >> things"
>>>> >>
>>>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such as
>>>> >> this based on practical experience. Jeremy
>>>> >>
>>>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>>>> >>
>>>> >> ah Jeremy, you are a brave man to poke the sleeping beast of httpRange-14.
>>>> >>
>>>> >> But I'll get my thoughts in early, then I can tune out of the ensuing mail
>>>> >> avalanche :-)
>>>> >>
>>>> >> When publishing Linked Data about places we (at Swirrl) generally do the
>>>> >> id/doc fandango, but to be honest I think data users either don't notice,
>>>> >> or they get confused by it. In the applications we are working with (and I
>>>> >> acknowledge that others may have different applications and different
>>>> >> experiences), it wouldn't cause any problems to have a single URI, the 'id'
>>>> >> URI if you like. We just don't find a need to say anything about the /doc/
>>>> >> URI. If we were starting again, I'd probably ditch the /doc/ and the 303
>>>> >> and rely on context and a little bit of documentation to make it clear what
>>>> >> we mean.
>>>> >>
>>>> >> The place where we find a need to talk about creators and licences and
>>>> >> modified dates is in metadata about datasets where a dataset might be a
>>>> >> collection of information about a bunch of places - and we treat datasets
>>>> >> as an 'information resource'. If someone requests a dataset URI we return a
>>>> >> status code of 200 and the dataset metadata as the response. That metadata
>>>> >> includes info on where to get all the contents of the dataset if you want
>>>> >> that.
>>>> >>
>>>> >> By the way, though it's sensible and consistent, I find that the implied
>>>> >> and parallel property stuff makes it more rather than less complicated.
>>>> >>
>>>> >> Bill
>>>> >>
>>>> >>
>>>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>> >>
>>>> >> All-
>>>> >>
>>>> >> Linda has done a great job of consolidating the best practices are use of
>>>> >> identifiers. We have just one [1] now.
>>>> >>
>>>> >> Reading through just now, it occurred to me that there's still an open
>>>> >> issue about identifier assignment ...
>>>> >>
>>>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a
>>>> >> single resource" [2] asserts "Assign distinct URIs to distinct resources"
>>>> >> in order to avoid URI collisions [2a] which "often imposes a cost in
>>>> >> communication due to the effort required to resolve ambiguities".
>>>> >> Discussions from earlier years in UK Gov Linked Data working group (and
>>>> >> elsewhere) concluded that the "real world thing" and "information resource
>>>> >> that describes the real world thing" are separate resources. I think this
>>>> >> is based on a (purist?) view when working with RDF of needing to be totally
>>>> >> clear on "what's the subject" of each triple ... the thing or the document.
>>>> >> This manifests as URIs with `id` or `doc` included somewhere to distinguish
>>>> >> between the resources and some RDF triples to clarify that the doc resource
>>>> >> is talking about the thing resource etc..
>>>> >>
>>>> >> (dangerously close to "httpRange-14" [3] here ... let's avoid that bear
>>>> >> trap)
>>>> >>
>>>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this
>>>> >> practice in §5.3 "Publishing data" [4]:
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> Publishers can help enable more accurate merging of data from different
>>>> >> sites if they support URLs for each entity
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may
>>>> >> wish to describe, separate from the landing pages
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records
>>>> >> <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification"
>>>> >> [5] notes that:
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> To say that the URI "mailto:nadia@example.com" identifies both an
>>>> >> Internet mailbox and Nadia, the person, introduces a URI collision.
>>>> >> However, we can use the URI to indirectly identify Nadia. Identifiers are
>>>> >> commonly used in this way.
>>>> >>
>>>> >> ```
>>>> >>
>>>> >> This is consistent with what I recall TimBL saying at TPAC-2015 in regards
>>>> >> to Vcard; come the finish, no one really cares to distinguish between the
>>>> >> thing and its associated information resource.
>>>> >>
>>>> >> ... And in most cases, one can use context to determine whether a
>>>> >> statement concerns the thing or the information resource. In those cases
>>>> >> where you can't, "URLs in Data Primer" suggests some mechanisms to mitigate
>>>> >> such confusion [6][7].
>>>> >>
>>>> >> I think that in our SDW WG discussion we have concluded that we _are_
>>>> >> content to use "indirect identification" - e.g. that we use URIs that
>>>> >> conflate the thing and document resource.
>>>> >>
>>>> >> Please can we confirm this? Assuming that indirect identification is
>>>> >> "approved" as best practice, then it seems prudent to add a note to the BP
>>>> >> document saying "don't worry about distinguishing between thing and
>>>> >> resource; indirect identification is fine" (etc.)
>>>> >>
>>>> >> Thanks, Jeremy
>>>> >>
>>>> >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>>>> >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>>>> >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>>>> >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>>>> >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>>>> >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>>>> >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>>>> >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>>>> >>
>>>> >
>>>>
>>>> --
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>
>

--
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net
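
P.S. To make the data fusion/conflation point above concrete, here is a minimal sketch in Python of what a consumer might do once a sameAs link between the two Eddystone identifiers from the thread has been asserted. It is only an illustration under stated assumptions: the triples are plain tuples rather than real RDF, the property names ("name", "height_m", "pos") and the fuse() helper are hypothetical, and no particular RDF library or reasoner is implied. The elided lightCharacteristic content is left out rather than guessed.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# The two identifiers used in the GML example earlier in the thread.
VO = "http://example.com/sar/features/vo/EDY"
NAVAID = "http://example.org/maritime/navaid/2650253"

# (subject, predicate, object) tuples. Predicate names other than
# owl:sameAs are hypothetical shorthand, not a real vocabulary.
triples = [
    (VO, "name", "Eddystone"),
    (VO, "height_m", 41),
    (VO, "pos", (50.184, -4.268)),
    (NAVAID, "name", "Eddystone Lighthouse"),
    (NAVAID, "pos", (50.2, -4.3)),
    (VO, OWL_SAME_AS, NAVAID),  # the asserted identity link
]


def fuse(triples):
    """Merge the properties of all IRIs connected by owl:sameAs links."""
    parent = {}

    def find(iri):
        # Union-find lookup with path halving.
        parent.setdefault(iri, iri)
        while parent[iri] != iri:
            parent[iri] = parent[parent[iri]]
            iri = parent[iri]
        return iri

    # First pass: group IRIs into identity classes.
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Deterministic choice: the smaller IRI string becomes the root.
                parent[max(root_s, root_o)] = min(root_s, root_o)

    # Second pass: collect every property value under the class root.
    merged = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            merged[find(s)][p].add(o)
    return {root: {p: set(v) for p, v in props.items()}
            for root, props in merged.items()}


fused = fuse(triples)
```

After fusion there is a single individual carrying both names, the height, and both point geometries. Note that the sketch keeps both positions: as Jeremy says, some judgement is still required to decide which geometry is considered "best", and sameAs itself does not make that choice.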
Received on Thursday, 1 September 2016 21:43:19 UTC