- From: Frans Knibbe <frans.knibbe@geodan.nl>
- Date: Fri, 2 Sep 2016 11:20:56 +0200
- To: Krzysztof Janowicz <janowicz@ucsb.edu>
- Cc: Joshua Lieberman <jlieberman@tumblingwalls.com>, Jeremy Tandy <jeremy.tandy@gmail.com>, SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CAFVDz428kocfW6DqxU+AeuisPvrm92QyCZox4WkfgvtaLNJnfg@mail.gmail.com>
On 1 September 2016 at 23:42, Krzysztof Janowicz <janowicz@ucsb.edu> wrote:

> Hi,
>
> So as representations, these are not “owl:sameAs”.
>
> Just for clarification. owl:sameAs is only concerned with the mapping of IRIs to (real world) entities and not 'representations' (leaving aside the fact that everything is a representation in some sense). I.e., it is about 'identity'. To give an extreme example, a URI may refer to the Eddystone Lighthouse which may be classified as /Lighthouse/ in some repository. Another URI established 50 years from now can still refer to this particular (4th) lighthouse and classify it as a /Ruin/. Another 50 years into the future, there may be yet another URI that refers to the fact that at some stage there was a ruin here of the 4th lighthouse called Eddystone while there is nothing physical left of it, and, thus, it is neither classified as /Ruin/ nor /Lighthouse/. In fact, we do not even need to introduce the concept of "real world" here as we can also establish a sameAs relation between two URIs that point to Zeus. Please note that this is different from establishing a sameAs link between a particular statue of Zeus in a particular museum and Zeus as the god of thunder. Finally, the purpose of establishing sameAs links is typically data fusion/conflation (no matter whether this is done ad-hoc, manually, or (offline) computationally).

I am no expert on the matter, but several sources tell me that if <A> <owl:sameAs> <B>, then all statements that can be made about A will also be true for B, and vice versa. It seems that the lighthouse example breaks at that point. For example, in Jeremy's example one of the lighthouse representations has a height of 41 m. It is likely that that statement will be false for the representation of the lighthouse as a ruin.
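
To make that concern concrete, here is a minimal Python sketch of what owl:sameAs substitution implies. It is an illustration under simplifying assumptions, not a full OWL reasoner: triples are plain tuples, and every term that is declared sameAs another is treated as interchangeable with it. The two URIs come from Jeremy's example; the property names `ex:height_m` and `ex:classifiedAs` and the function `same_as_closure` are hypothetical names introduced only for this sketch.

```python
SAME_AS = "owl:sameAs"


def same_as_closure(triples):
    """Return the triples plus those entailed by substituting sameAs-equal terms.

    Simplified illustration: only subject/object substitution is modelled.
    """
    triples = set(triples)
    parent = {}  # union-find forest over terms that occur in sameAs statements

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Merge the two ends of every sameAs statement into one identity class.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Collect the members of each identity class.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)

    def aliases(term):
        # Terms never mentioned in a sameAs statement only stand for themselves.
        return groups[find(term)] if term in parent else {term}

    # Every statement holds for every alias of its subject and object.
    return {
        (s2, p, o2)
        for s, p, o in triples
        for s2 in aliases(s)
        for o2 in aliases(o)
    }


VO = "http://example.com/sar/features/vo/EDY"
NAV = "http://example.org/maritime/navaid/2650253"

data = {
    (VO, "ex:height_m", 41),  # hypothetical property name
    (NAV, "ex:classifiedAs", "ex:MaritimeNavigationAid"),  # hypothetical
    (VO, SAME_AS, NAV),
}

closure = same_as_closure(data)
print((NAV, "ex:height_m", 41) in closure)  # True
```

Under these assumptions the height asserted for the VerticalObstruction identifier is also entailed for the navigation aid identifier. It would equally be entailed for any later URI declared sameAs either of them, such as one that describes the lighthouse as a ruin.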
Can we be sure that if we recommend using owl:sameAs to assert that two resources are really the same thing, everyone and everything is aware of the logical consequences?

Regards,
Frans

> Best,
> Jano
>
> On 08/31/2016 06:38 AM, Joshua Lieberman wrote:
>
> Jeremy,
>
> So as representations, these are not “owl:sameAs”. We assume that as feature data, each refers to a real world entity, but we don’t assert that this VerticalObstruction is the same individual as this MaritimeNavigationAid. We just are suspecting or asserting that the same real world thing is being discerned in two different ways. Someone may define a lighthouse class as subclassing both, otherwise a slightly specialized relation (e.g. sdwgeo:sameRealWorldEntityAs) would be useful here.
>
> Josh
>
> On Aug 31, 2016, at 8:41 AM, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>
> That still leaves a gap in expressing whether two feature data entities represent the same real world entity. Perhaps we need a "sameFeatureAs" predicate to address this.
>
> @josh - can we clarify my understanding please?
>
> In the BP doc §4 "Spatial things, features and geometry" [1] I use a lighthouse example, so I'll continue with that ...
>
> We have one real lighthouse (Eddystone Lighthouse) that is discerned as a different Type by different communities: "VerticalObstruction" and "MaritimeNavigationAid". In ISO 19100 parlance, these are two distinct feature types.
> The two "Features" might be encoded in GML as follows (forgive any errors in my illustrative example):
>
> <VerticalObstruction gml:id="a">
>   <gml:name>Eddystone</gml:name>
>   <gml:identifier codeSpace="http://example.com/sar/features/vo/">EDY</gml:identifier>
>   <geometry>
>     <gml:Point gml:id="a-p1" srsDimension="2" srsName="EPSG:4326">
>       <gml:pos>50.184 -4.268</gml:pos>
>     </gml:Point>
>   </geometry>
>   <height uom="m">41</height>
> </VerticalObstruction>
>
> <MaritimeNavigationAid gml:id="b">
>   <gml:name>Eddystone Lighthouse</gml:name>
>   <gml:identifier codeSpace="http://example.org/maritime/navaid/">2650253</gml:identifier>
>   <geo>
>     <gml:Point gml:id="b-p1" srsDimension="2" srsName="EPSG:4326">
>       <gml:pos>50.2 -4.3</gml:pos>
>     </gml:Point>
>   </geo>
>   <lightCharacteristic>
>     ...
>   </lightCharacteristic>
> </MaritimeNavigationAid>
>
> So we have two Features (which we collectively have agreed are "spatial things"), with identifiers <http://example.com/sar/features/vo/EDY> and <http://example.org/maritime/navaid/2650253>. Respectively, the XML elements that describe these features are identified as "a" and "b" using the @gml:id attribute.
>
> If we are using "indirect identification" then _both_ <http://example.com/sar/features/vo/EDY> and <http://example.org/maritime/navaid/2650253> are treated as identifiers for the _real_ Eddystone Lighthouse; we simply don't care to differentiate between the real world thing and the information record. In which case, <owl:sameAs> would seem sufficient? The "height" and "lightCharacteristic" properties are both applicable to the real Eddystone Lighthouse. Some judgement would be required to decide which point geometry ("geo" or "geometry" property) is considered "best".
>
> The way I think about it, @gml:id is more like the identifier for a named graph; a container for a set of properties ...
>
> Am I missing something???
>
> Jeremy
>
> [1]: http://w3c.github.io/sdw/bp/#spatial-things-features-and-geometry
>
> On Wed, 31 Aug 2016 at 12:42 Joshua Lieberman <jlieberman@tumblingwalls.com> wrote:
>
>> If we are asserting that spatial data on the Web is "always" feature data that represents a real world entity, then yes, we don't have the general Web "is it or isn't it physical" ambiguity and can assume that a feature data identifier also and indirectly identifies the feature. That still leaves a gap in expressing whether two feature data entities represent the same real world entity. Perhaps we need a "sameFeatureAs" predicate to address this.
>>
>> Josh
>>
>> Joshua Lieberman, Ph.D.
>> Principal, Tumbling Walls Consultancy
>> Tel/Direct: +1 617-431-6431
>> jlieberman@tumblingwalls.com
>>
>> On Aug 31, 2016, at 07:29, Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>>
>> Hello,
>>
>> As stated before, I don't think the httpRange-14 problem exists in our domain of discourse. I think (and hope) that confusion can only occur when the things that are described are digital things, or things that can be transmitted over a computer network, like web pages or mail boxes. It seems to me that spatial things are never that type of thing. Therefore there is no reason to take precautions against possible confusion.
>>
>> That probably means +1.
>>
>> Greetings,
>> Frans
>>
>> On 31 August 2016 at 09:50, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>
>>> Thanks Rob & Clemens ...
>>>
>>> On Wed, 31 Aug 2016 at 08:30, Clemens Portele <portele@interactive-instruments.de> wrote:
>>>
>>>> +1
>>>>
>>>> On 30 August 2016 at 10:10:26, Jeremy Tandy (jeremy.tandy@gmail.com) wrote:
>>>>
>>>> Hi. It would be good to close this issue out & include our collective recommendation in the BP doc working draft.
>>>>
>>>> PROPOSAL: SDW working group recommends use of "indirect identifiers" for spatial things
>>>>
>>>> ... I'll start the voting.
>>>>
>>>> +1
>>>>
>>>> Jeremy
>>>>
>>>> (BTW, to make sense of the PROPOSAL you'll need to read the email thread)
>>>>
>>>> On Fri, 26 Aug 2016 at 10:12 Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
>>>>
>>>>> So… do we agree we can recommend indirect identifiers, or do we try to fix the issue with getting the correct identifier as Rob describes?
>>>>>
>>>>> While waiting for this I’ve updated the issue and the text referring to the issue in BP6.
>>>>>
>>>>> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
>>>>> *Sent:* Wednesday 24 August 2016 13:56
>>>>> *To:* Jeremy Tandy; Phil Archer; Linda van den Brink; Bill Roberts
>>>>> *CC:* SDW WG Public List
>>>>> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>>>>>
>>>>> Hi
>>>>>
>>>>> Agree this is a real concern - people can't be blamed for doing the obvious, if dumb, thing..
>>>>>
>>>>> I think we should take note of best practice in the HTML world - which is often to include a citable link to a resource in the rendered view. Or a "share" or something similar. We can also put fairly explicit annotation in machine-readable code - stating that the resource is about the URI - and even notes saying when citing this resource use the URI....
>>>>>
>>>>> I'd also like to see browsers evolve to offer you the original link or the redirected when cutting and pasting - how hard can it be!
>>>>>
>>>>> Maybe we can get Ed to ask around Google Chrome team for suggestions on how best to handle this :-)
>>>>>
>>>>> Rob
>>>>>
>>>>> On Wed, 24 Aug 2016 at 18:27 Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>>>
>>>>> Yes, I think so ... And we should do so if we are recommending "indirect identification".
>>>>>
>>>>> Jeremy
>>>>>
>>>>> On Wed, 24 Aug 2016 at 09:24, Phil Archer <phila@w3.org> wrote:
>>>>>
>>>>> Bill's comments also made me think about some of the classic arguments, such as that a lake doesn't have a last updated date and isn't 435KB big. Which are true, however, that kind of metadata generally comes from the server, i.e. the HTTP layer. That's an over simplification but the point is that it is relatively easy to avoid deliberately creating misleading metadata - metadata about the doc rather than the thing it describes - and it's also generally easy to avoid looking for that metadata.
>>>>>
>>>>> Is there scope for some BP advice there?
>>>>>
>>>>> Phil.
>>>>>
>>>>> On 24/08/2016 08:25, Jeremy Tandy wrote:
>>>>> > Thanks Linda. More clear examples where being "correct" (in terms of avoiding uri collisions by using two distinct uris) is making things worse because users take the wrong one!
>>>>> >
>>>>> > So, as a WG, are we content to recommend this "indirect identification" pattern where thing & info resource identifiers are conflated?
>>>>> >
>>>>> > Bill has added some good points about how to avoid impacts of uri collision- by using the (dataset) metadata to talk about licenses and creators for the information ...
>>>>> >
>>>>> > On Wed, 24 Aug 2016 at 07:52, Linda van den Brink <l.vandenbrink@geonovum.nl> wrote:
>>>>> >
>>>>> >> Experience from the Netherlands: we have the id/doc pattern in our URI strategy, based on the Cool URIs note [8] and the ISA study on persistent identifiers [9].
>>>>> >>
>>>>> >> That being said, same as Bill I also notice data users getting confused and generally using the /doc/ URI as that is the one they can copy from their browser address bar.
>>>>> >> This is not only casual confusion but also ends up in published information resources.
>>>>> >>
>>>>> >> You see this, for example, all over the CB-NL which is a vocabulary for the building sector and contains links to other Dutch standards such as IMGeo, an information model and vocabulary for large scale topography. E.g. the CB-NL concept of ‘Gebouw’ (Building) [10] links to two IMGeo concepts ‘Pand’ (building part) and ‘Overig Bouwwerk’ (other construction) using their /doc/ URIs. If you click on Pand (which doesn’t have its own landing page in CB-NL so I can’t include the link) you will see it includes the /doc/ URI as the identifier of Pand.
>>>>> >>
>>>>> >> This is an example where it occurs in vocabularies, but I also see it happen with identifiers for data instances.
>>>>> >>
>>>>> >> [8]: https://www.w3.org/TR/cooluris/
>>>>> >> [9]: https://joinup.ec.europa.eu/sites/default/files/D7.1.3%20-%20Study%20on%20persistent%20URIs_0.pdf
>>>>> >> [10]: http://ont.cbnl.org/cb/def/Gebouw
>>>>> >>
>>>>> >> Linda
>>>>> >>
>>>>> >> *From:* Jeremy Tandy [mailto:jeremy.tandy@gmail.com]
>>>>> >> *Sent:* Tuesday 23 August 2016 20:57
>>>>> >> *To:* Bill Roberts
>>>>> >> *CC:* SDW WG Public List
>>>>> >> *Subject:* Re: Clarification required: BP6 "use HTTP URIs for spatial things"
>>>>> >>
>>>>> >> Thanks Bill. Sounds very coherent ... I hoped for some responses such as this based on practical experience. Jeremy
>>>>> >>
>>>>> >> On Tue, 23 Aug 2016 at 19:41, Bill Roberts <bill@swirrl.com> wrote:
>>>>> >>
>>>>> >> ah Jeremy, you are a brave man to poke the sleeping beast of httpRange-14.
>>>>> >>
>>>>> >> But I'll get my thoughts in early, then I can tune out of the ensuing mail avalanche :-)
>>>>> >>
>>>>> >> When publishing Linked Data about places we (at Swirrl) generally do the id/doc fandango, but to be honest I think data users either don't notice, or they get confused by it. In the applications we are working with (and I acknowledge that others may have different applications and different experiences), it wouldn't cause any problems to have a single URI, the 'id' URI if you like. We just don't find a need to say anything about the /doc/ URI. If we were starting again, I'd probably ditch the /doc/ and the 303 and rely on context and a little bit of documentation to make it clear what we mean.
>>>>> >>
>>>>> >> The place where we find a need to talk about creators and licences and modified dates is in metadata about datasets where a dataset might be a collection of information about a bunch of places - and we treat datasets as an 'information resource'. If someone requests a dataset URI we return a status code of 200 and the dataset metadata as the response. That metadata includes info on where to get all the contents of the dataset if you want that.
>>>>> >>
>>>>> >> By the way, though it's sensible and consistent, I find that the implied and parallel property stuff makes it more rather than less complicated.
>>>>> >>
>>>>> >> Bill
>>>>> >>
>>>>> >> On 23 August 2016 at 17:37, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:
>>>>> >>
>>>>> >> All-
>>>>> >>
>>>>> >> Linda has done a great job of consolidating the best practices around use of identifiers. We have just one [1] now.
>>>>> >>
>>>>> >> Reading through just now, it occurred to me that there's still an open issue about identifier assignment ...
>>>>> >>
>>>>> >> W3C's Architecture of the World Wide Web constraint "URIs identify a single resource" [2] asserts "Assign distinct URIs to distinct resources" in order to avoid URI collisions [2a] which "often imposes a cost in communication due to the effort required to resolve ambiguities". Discussions from earlier years in UK Gov Linked Data working group (and elsewhere) concluded that the "real world thing" and "information resource that describes the real world thing" are separate resources. I think this is based on a (purist?) view when working with RDF of needing to be totally clear on "what's the subject" of each triple ... the thing or the document. This manifests as URIs with `id` or `doc` included somewhere to distinguish between the resources and some RDF triples to clarify that the doc resource is talking about the thing resource etc..
>>>>> >>
>>>>> >> (dangerously close to "httpRange-14" [3] here ...
>>>>> >> let's avoid that bear trap)
>>>>> >>
>>>>> >> Jeni Tennison's "URLs in Data Primer" draft TAG note captures this practice in §5.3 "Publishing data" [4]:
>>>>> >>
>>>>> >> ```
>>>>> >> Publishers can help enable more accurate merging of data from different sites if they support URLs for each entity <https://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may wish to describe, separate from the landing pages <https://www.w3.org/TR/urls-in-data/#dfn-landing-page> or records <https://www.w3.org/TR/urls-in-data/#dfn-record> that they publish.
>>>>> >> ```
>>>>> >>
>>>>> >> Yet Architecture of the World Wide Web §2.2.3 "Indirect identification" [5] notes that:
>>>>> >>
>>>>> >> ```
>>>>> >> To say that the URI "mailto:nadia@example.com" identifies both an Internet mailbox and Nadia, the person, introduces a URI collision. However, we can use the URI to indirectly identify Nadia. Identifiers are commonly used in this way.
>>>>> >> ```
>>>>> >>
>>>>> >> This is consistent with what I recall TimBL saying at TPAC-2015 in regards to Vcard; come the finish, no one really cares to distinguish between the thing and its associated information resource.
>>>>> >>
>>>>> >> ... And in most cases, one can use context to determine whether a statement concerns the thing or the information resource. In those cases where you can't, "URLs in Data Primer" suggests some mechanisms to mitigate such confusion [6][7].
>>>>> >>
>>>>> >> I think that in our SDW WG discussion we have concluded that we _are_ content to use "indirect identification" - e.g.
>>>>> >> that we use URIs that conflate the thing and document resource.
>>>>> >>
>>>>> >> Please can we confirm this? Assuming that indirect identification is "approved" as best practice, then it seems prudent to add a note to the BP document saying "don't worry about distinguishing between thing and resource; indirect identification is fine" (etc.)
>>>>> >>
>>>>> >> Thanks, Jeremy
>>>>> >>
>>>>> >> [1]: http://w3c.github.io/sdw/bp/#globally-unique-ids
>>>>> >> [2]: https://www.w3.org/TR/webarch/#pr-uri-collision
>>>>> >> [2a]: https://www.w3.org/TR/webarch/#URI-collision
>>>>> >> [3]: https://www.w3.org/2001/tag/group/track/issues/14
>>>>> >> [4]: https://www.w3.org/TR/urls-in-data/#publishing-data
>>>>> >> [5]: https://www.w3.org/TR/webarch/#indirect-identification
>>>>> >> [6]: https://www.w3.org/TR/urls-in-data/#documenting-properties
>>>>> >> [7]: https://www.w3.org/TR/urls-in-data/#authoring-specifications
>>>>>
>>>>> --
>>>>>
>>>>> Phil Archer
>>>>> W3C Data Activity Lead
>>>>> http://www.w3.org/2013/data/
>>>>>
>>>>> http://philarcher.org
>>>>> +44 (0)7887 767755
>>>>> @philarcher1
>
> --
> Krzysztof Janowicz
>
> Geography Department, University of California, Santa Barbara
> 4830 Ellison Hall, Santa Barbara, CA 93106-4060
>
> Email: jano@geog.ucsb.edu
> Webpage: http://geog.ucsb.edu/~jano/
> Semantic Web Journal: http://www.semantic-web-journal.net
Received on Friday, 2 September 2016 09:21:36 UTC