
A universal model for spatial data

From: Knibbe, Frans <Frans.Knibbe@kadaster.nl>
Date: Wed, 24 Oct 2018 08:11:04 +0000
To: "public-sdwig@w3.org" <public-sdwig@w3.org>
CC: "Folmer, Erwin" <Erwin.Folmer@kadaster.nl>, "Stoter, Jantien" <Jantien.Stoter@kadaster.nl>
Message-ID: <51cb8cd1-07bc-4b63-907d-632ea804b194@KAD-DC3-MBX-001.kadaster.local>
Hello all,

Warning: long text ahead. In short, I try to argue that it would be a good idea to have a domain-independent web ontology for spatial data.

After having to cut back my participation in the Spatial Data on the Web Working Group (SDWWG) I have been (partially) working for the Dutch Cadastre, who are doing a great job at publishing geographic data on the web as Linked Data. Examples of such datasets are buildings, addresses, cadastral parcels, governmental spatial plans and large-scale (i.e. detailed) topography, all with national coverage. In working on these efforts, and trying to make those data work for society, a familiar and very basic problem keeps turning up: the various domain models for geometry have poor interoperability. Consequently, the same can be said for the data formats derived from them. I believe this is very harmful for getting spatial data on the web to really work. And by extension, it is harmful for the web of data itself.

It was not possible to address this issue fully in the SDWWG, but I am glad to see that the charter of the Spatial Data on the Web Interest Group fully supports what I would like to suggest in this message: that a universal basic web ontology for spatial data should be developed.

The current problem
We live in an age where data from many different sources can live together in one information system: the world wide web. Data are interlinked and self-describing, making it possible to decouple publication of data from fixed ways of putting data to use. People as well as machines are free to mix and process data as they please. Many types of usage will involve space, in one way or another, because space is a fundamental aspect of our reality. Consequently, many data have spatial aspects. But spatial data are modelled in many different ways, and can appear in many different formats. This diversity is a result of historical developments. Before the foundations for a global web of data were in place, there was a need to digitize spatial data in many information domains, which led to the development of different ways of digitizing what is essentially the same thing, but looked at from different perspectives. For example, there is the domain of geography, from which the current set of OGC standards springs. The domain of building construction also deals with things related to the Earth's surface, but its standards have roots in CAD, leading to very different ways of specifying geometries. Then there is the domain of 2D and 3D graphics, which also deals with spatial objects, but relates them to different reference systems (a sheet of paper, a computer screen or a virtual space) and is heavily focused on appearance. The transport domain has yet another focus: it is primarily concerned with network connectivity, leading to a graph-based view of spatial information. These are just a few domains that I am familiar with; there are probably more domains in IT and science that have developed their own ways of coding space.

Different domain models and data formats may function well within their respective domains, but real-life problems requiring sound solutions are unlikely to be limited to a single domain. The restrictiveness of domain standards has always existed, but comes to light more clearly now that we have the means to work with data irrespective of their origin.

Can Time set the pace for Space?
As a data type, time has much in common with space. It is a universal reality that is always present in everyday life, and has therefore been described in many domain models. In a recent effort coordinated in the SDWWG, a universal model for time was made available: the Time Ontology in OWL <https://www.w3.org/TR/owl-time/>. It can be used to unify the many different ways in which people have historically coded time instants and intervals. I think the Time Ontology could be an inspiration for a Space Ontology. Apparently, it was possible to unify different ways of expressing information about time by going to the mathematical roots of the phenomenon. Mathematics is a truly domain-independent science and it is great for reducing everyday phenomena to their most basic and simple forms, using a language that all people on earth are able to speak, irrespective of their place of birth. I think going to the mathematical roots of space is what is required to arrive at a universal model for spatial information on the web.

Doing the maths
When looking at different ways spatial data are encoded, it seems to me that there are three basic ingredients needed. All of them can be expressed mathematically:

1) The notion of a spatial reference system. For geographers that will be some kind of model of the Earth's surface. For astronomers some kind of model of the solar system, the galaxy or something even bigger. Physicists and chemists studying other phenomena might have a need for much smaller reference systems. For graphic designers an arbitrary 2D or 3D space needs to be agreed upon. For architects and engineers it could be a building plot or a building. The common ground seems to be that in order to define a spatial thing, first a frame of reference, a coordinate space, needs to be defined. Being able to do so using universal semantics would do a lot of good for data interoperability and transformation of spatial data between different reference systems.
2) The notion of coding the shape or spatial distribution of a thing in numbers. Related to this is the concept of spatial resolution, or the idea that when spatial data represent a real-world phenomenon, the numbers used will always be an approximation.
3) The notion of functions that work on numerical definitions of shapes or spatial distributions. One group of such functions would be topological relations between geometries. Other functions could define how to extrude a 2D shape to a 3D shape. Still other functions could be used to add or subtract shapes. And much more is needed and possible.

These three ingredients depend on each other: a spatial reference system is needed to define shapes of things, and a way of defining shapes by numbers is needed to define functions working on those shapes. I hope it is possible to combine the three ingredients in a single model that is mathematical at its core, giving it the ability to be used at varying levels of complexity, with basic usage (e.g. defining a point location in a 2D space) being very simple.
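
To make the dependency between the three ingredients a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a proposal for the actual model: the names ReferenceSystem, Shape, bounding_boxes_intersect and extrude are hypothetical, the reference system is reduced to a name, a list of axes and a unit, and the "topological" function is only a coarse bounding-box test.

```python
from dataclasses import dataclass
from typing import Tuple

Coordinate = Tuple[float, ...]


@dataclass(frozen=True)
class ReferenceSystem:
    """Ingredient 1 (hypothetical): a named coordinate space."""
    name: str
    axes: Tuple[str, ...]
    unit: str

    @property
    def dimension(self) -> int:
        return len(self.axes)


@dataclass(frozen=True)
class Shape:
    """Ingredient 2 (hypothetical): a shape coded as numbers in a reference system."""
    reference_system: ReferenceSystem
    vertices: Tuple[Coordinate, ...]
    resolution: float  # smallest meaningful distance, in the system's unit

    def __post_init__(self) -> None:
        # Every vertex needs one number per axis of the reference system.
        for vertex in self.vertices:
            if len(vertex) != self.reference_system.dimension:
                raise ValueError("vertex does not match the reference system")


def bounding_boxes_intersect(a: Shape, b: Shape) -> bool:
    """Ingredient 3 (hypothetical): a coarse spatial relation between shapes."""
    if a.reference_system != b.reference_system:
        raise ValueError("shapes must share a reference system")
    for axis in range(a.reference_system.dimension):
        a_values = [vertex[axis] for vertex in a.vertices]
        b_values = [vertex[axis] for vertex in b.vertices]
        # A gap along any single axis means the boxes cannot overlap.
        if max(a_values) < min(b_values) or max(b_values) < min(a_values):
            return False
    return True


def extrude(shape: Shape, target: ReferenceSystem, height: float) -> Shape:
    """Ingredient 3 (hypothetical): lift a 2D outline to the vertices of a 3D prism."""
    if shape.reference_system.dimension != 2 or target.dimension != 3:
        raise ValueError("extrusion goes from a 2D system to a 3D system")
    base = tuple((x, y, 0.0) for x, y in shape.vertices)
    top = tuple((x, y, height) for x, y in shape.vertices)
    return Shape(target, base + top, shape.resolution)


# Basic usage: a building plot as an arbitrary local 2D space.
plot = ReferenceSystem("building-plot", ("x", "y"), "metre")
plot_3d = ReferenceSystem("building-plot-3d", ("x", "y", "z"), "metre")

point = Shape(plot, ((3.0, 4.0),), 0.01)
footprint = Shape(plot, ((0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)), 0.01)
shed = Shape(plot, ((9.0, 7.0), (12.0, 7.0), (12.0, 10.0), (9.0, 10.0)), 0.01)

overlap = bounding_boxes_intersect(footprint, shed)
volume = extrude(footprint, plot_3d, 3.0)
```

The point of the sketch is the layering: a Shape cannot be created without a ReferenceSystem, and the functions refuse to work on shapes whose reference systems do not match. A real model would of course also need transformations between reference systems, which are left out here.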

I can imagine that when such a shared model is in place it will be much easier to derive data types and data formats that are truly interoperable, because they all have the same mathematical foundations. And we could probably do with far fewer data types and data formats too. That should be a great boost for developing software that can work with spatial data, on the web and elsewhere.

Final words
OK, that is the idea I wanted to float. I hope it makes some kind of sense, but it would also be interesting to know if there are flaws in the reasoning. Of course, should people see the merit, a next question could be how to make such a thing happen. Without going into detail about that issue, I would just like to note that a lot of what is needed already exists, and that the OGC-W3C combo seems to be just the right environment to make it happen.



Received on Wednesday, 24 October 2018 13:24:13 UTC
