- From: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
- Date: Wed, 18 Jan 2017 10:21:27 +0100
- To: Linda van den Brink <l.vandenbrink@geonovum.nl>, W3C SDW WG - Public <public-sdw-wg@w3.org>
Dear Linda, all,

In view of the relevant agenda item of today's call [1], I provide below a summary of the discussions we have had so far on how to publish geometries on the Web. Apologies in advance for the long email.

@All, I kindly ask you to check whether what is reported here is correct. Any comments & revisions are more than welcome.

1. Preferred geometry format(s)

BP8 already includes some guidelines, but based on some discussion during the last f2f [2,3], it seems that more explicit recommendations would be desirable. What follows is a tentative contribution based on what I recall from our discussions.

The reference BP here is the general DWBP BP14 principle of providing data in multiple formats:

https://www.w3.org/TR/dwbp/#MultipleFormats

Applied to geometries, this should ideally imply providing geometries in the most used serialisations. However, this may not always be feasible, so it is important to identify one or more preferred geometry serialisations. One of the requirements here is that such serialisations should preferably be Web-friendly.

More importantly, we don't want to be prescriptive and prevent people from publishing geometries in their preferred serialisations. So, the recommendation should sound like:

"Publish geometries in any serialisation you like, but for re-usability it is important that you make them available in format X [, Y, Z, ...]."

Most of the discussions we had on "preferred format(s)" were about two possible candidates: WKT and GeoJSON. Both are widely used and supported. GeoJSON is the most webby one, but WKT is also supported by popular Web libraries (such as OpenLayers and Leaflet). Moreover, WKT is also supported by most triple stores - even those not supporting GeoSPARQL.
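As a rough illustration of the two candidate serialisations, the sketch below writes the same geometry once as GeoJSON and once as WKT. The helper name `geojson_to_wkt` and the sample coordinates are made up for this example and do not come from any library or from our discussions. It handles only Point and Polygon, and it copies the coordinates in the order given, without any CRS handling.

```python
import json


def _position(position):
    # Write the coordinates of one position in the order given,
    # separated by spaces as WKT expects.
    return " ".join(str(c) for c in position)


def geojson_to_wkt(geometry):
    """Serialise a GeoJSON Point or Polygon geometry (a dict) as WKT.

    Hypothetical helper for illustration only: no CRS handling,
    no support for other geometry types.
    """
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return "POINT (" + _position(coords) + ")"
    if kind == "Polygon":
        # A polygon is a list of linear rings, each a list of positions.
        rings = ", ".join(
            "(" + ", ".join(_position(p) for p in ring) + ")"
            for ring in coords
        )
        return "POLYGON (" + rings + ")"
    raise ValueError("unsupported geometry type: " + kind)


# Illustrative point, lon/lat order as in GeoJSON.
point = {"type": "Point", "coordinates": [8.6167, 45.8]}

print(json.dumps(point))      # the GeoJSON serialisation
print(geojson_to_wkt(point))  # POINT (8.6167 45.8)
```

The WKT string on its own says nothing about the CRS or the axis order, so a consumer has to know both from context.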
The main drawbacks of GeoJSON seem to be related to the fact that in its current version it supports only one CRS - namely, CRS84 (i.e., WGS84, with lon/lat axis order) - see:

https://tools.ietf.org/html/rfc7946#section-4

WKT doesn't have this problem, and has other advantages - it's a very compact literal form compared to GeoJSON (as well as other geometry serialisations), it's case insensitive, and it has a corresponding binary encoding (WKB). There are however a number of issues:

- WKT is available in a number of flavours - e.g., the original WKT format, the extended variant supported in PostGIS (EWKT) [4], the GeoSPARQL variant.
- The axis order is implemented inconsistently. For instance, in PostGIS, by default it's lon/lat, irrespective of the CRS, whereas GeoSPARQL requires the use of the axis order specified in the CRS.

It has been pointed out that GML does not have the issues above, since both CRS and axis order can be explicitly specified. However, my understanding of the relevant discussion is that GML is not considered webby enough, and it has limited support in Web / LD applications, tools and platforms.

Trying to come to a conclusion, this is my personal understanding:

(a) We cannot avoid recommending GeoJSON as (one of) the preferred geometry serialisation(s), because of its widespread use and support on the Web, but with the caveat that it may not be suitable for all use cases, due to the CRS issue.

(b) We also need a geometry serialisation that does not have the GeoJSON issues. Between WKT and GML, the former seems to be definitely more suitable for Web and LD applications. But in this case we need to decide which variant should be recommended, and the rule about the axis order.

2. How to publish geometries on the Web

This point is of course related to the principle of "publishing geometries for Web use", but also to the idea of making geometries a "first-class citizen" on the Web.
I think the issue here boils down to whether geometries should be published along with the relevant spatial things, or independently.

There was some discussion in the last f2f [2,3] about the two options of denoting geometries with blank nodes or URIs. Linda provided an example from Ireland, where the rationale for using blank nodes is that the data provider would like people to link to their spatial things, and not to the geometries.

From this perspective, using URIs for geometries should be based on use cases where people would like to link instead to the geometry itself. This, I think, is basically related to the question of whether, and in which cases, a geometry is "re-usable".

A possible use case is when you need to link to some "authoritative" geometry - e.g., an administrative boundary maintained by an institutional agency. Using the relevant URI would ensure not only that I'm always referring to the official and up-to-date version of the geometry, but also that I implicitly provide provenance information. This is not different from linking to and re-using data maintained by external organisations - e.g., as in the work illustrated by Bart in Lisbon, where fire depts. re-use cadastral data not by copying them locally, but by linking to them.

So, IMO, in BP8 we should mention both approaches, clarifying the different use cases they are addressing. And we can also mention that, depending on the solution used, the way geometries are published in multiple serialisations differs - e.g., for geometry URIs, HTTP conneg can be used.

3. Geometries in RDF

The main points of discussion seem to have focused on the following topics:

(a) Recommended vocabularies and/or best practices for using them

(b) Which information should be included in the RDF representation of a geometry

In general, both relate to Josh's work on the revision of GeoSPARQL.
However, as far as existing vocabularies are concerned, my understanding is that the only consolidated agreement we have is the use of Basic Geo for point geometries. For other geometry types, bboxes, centroids, etc. we suggest a number of options. The question is whether this is enough, or whether we should instead provide some more specific recommendations. I can try to collect a number of examples from the reference vocabularies, if this may be helpful.

About point (b), there was some discussion during the last f2f [2,3], but I'm not sure an agreement was reached. One point that has been quite controversial from the very beginning is whether the RDF representation should include the CRS separately from the geometry specification (this can be very much dependent on how a geometry is modelled). Another issue was about linking a geometry to related geometries (which seems to imply the use of URIs for geometries).

I think here it would be crucial to have real-world examples as a starting point, and possibly suggest how they can be improved.

Thanks, and sorry again for the long mail.

Meet you later

Andrea

----
[1] https://lists.w3.org/Archives/Public/public-sdw-wg/2017Jan/0067.html
[2] https://www.w3.org/2016/12/15-sdw-minutes
[3] https://www.w3.org/2016/12/16-sdw-minutes
[4] http://postgis.net/docs/ST_GeomFromEWKT.html

--
Andrea Perego, Ph.D.
Scientific / Technical Project Officer
European Commission DG JRC
Directorate B - Growth and Innovation
Unit B6 - Digital Economy
Via E. Fermi, 2749 - TP 262
21027 Ispra VA, Italy

https://ec.europa.eu/jrc/

----
The views expressed are purely those of the writer and may not in any circumstances be regarded as stating an official position of the European Commission.
Received on Wednesday, 18 January 2017 09:21:33 UTC