- From: Kostis Kyzirakos <Kostis.Kyzirakos@cwi.nl>
- Date: Fri, 3 Jan 2014 16:34:50 +0100
- To: "Frans Knibbe | Geodan" <frans.knibbe@geodan.nl>
- Cc: LocAdd W3C CG Public Mailing list <public-locadd@w3.org>
- Message-ID: <CAJUi=VGAwvzqP6VupgA3UxOHWVft=paGrCB8bbmh=0yBvd4eYw@mail.gmail.com>
Hi,

Please find some answers inline.

Cheers,
Kostis

===================================================
Kostis E. Kyzirakos, Ph.D.
Centrum voor Wiskunde en Informatica
DB Architectures (DA)
Office L320
Science Park 123
1098 XG Amsterdam (NL)
tel: +31 (20) 592-4039
mobile: +31 (0) 6422-95345
e-mail: kostis@cwi.nl
===================================================

On Fri, Jan 3, 2014 at 11:48 AM, Frans Knibbe | Geodan <frans.knibbe@geodan.nl> wrote:

> Hello,
>
> I agree that a sequence of coordinates should be associated with a CRS. In my opinion, that is exactly what happens in the example I gave:
>
> ex1:myGeometry
>     a ex2:geometry ;
>     ex2:asWKT "POLYGON((97372 487152,97372 580407,149636 580407,149636 487152,97372 487152))"^^ex2:wktLiteral ;
>     ex2:CRS <http://www.opengis.net/def/crs/EPSG/0/28992> ;
>
> This is based on the viewpoint of a geometry consisting of a sequence of coordinates and a CRS. They are both properties of a geometry.

What happens, though, if you merge the following graphs?

ex1:myGeometry
    ex2:asWKT "POLYGON((97372 487152,97372 580407,149636 580407,149636 487152,97372 487152))"^^ex2:wktLiteral ;
    ex2:CRS <http://www.opengis.net/def/crs/EPSG/0/28992> .

ex1:myGeometry
    ex2:asWKT "POLYGON((4.54103559631648 52.369221013436,4.52469206625503 53.2071950151372,5.30692041653441 53.210266927542,5.30843905756704 52.3722183594399,4.54103559631648 52.369221013436))"^^ex2:wktLiteral ;
    ex2:CRS <http://www.opengis.net/def/crs/EPSG/0/4326> .

One could argue that we can avoid such problems by using different URIs for different serializations. However, combining different datasets then becomes problematic, since constraints are introduced...

> This is somewhat similar to the decision made in NeoGeo to keep the coordinates separate objects.

This representation is excellent for specific application domains, e.g., when computing shortest paths.
IMHO the problem with this approach has to do with querying such data across a broader domain, not only within a single domain. How can you express queries like: find all archaeological sites (polygon) that are within a municipality (polygon) that neighbors the municipality of Athens (polygon) and are near a beach (linestring)?

> In some scenarios, it may be more convenient to model a text as an ordered sequence of character elements. In most cases, the text will be used as a whole, without any need to process the individual characters. So the first example is more convenient than the second. Now let us try to put the word "Αθήνα" in perspective by including its script and language:
>
> ex1:myFeature
>     a ex2:name ;
>     ex3:spelling "Αθήνα"^^xsd:string ;
>     ex3:language "Greek" ;
>     ex3:script "Greek" .
>
> The extra properties provide data that are vital for the correct interpretation of the string in some cases. So why not put them all in the same literal?
>
> ex1:myFeature ex2:name "script='Greek';language='Greek';'Αθήνα'";

This is not a correct analogy, because "POINT(0 0)" has no meaning on its own, while "Αθήνα" has some meaning on its own. You can add as much information as you want, but each RDF term has to be self-contained. You cannot rely on a set of triples to interpret an RDF term. At least this is what all formal treatments of RDF assume!

> It is easy to see that this is not the most convenient way of expressing the text. For example, it needs some processing before it can be used to form a human readable text. Similarly, I don't think it is convenient to put the specification of the CRS together with a coordinate sequence in the same literal. Here is a list of reasons why I think it is inconvenient:
>
> 1. Most (all?) current GIS software takes coordinate strings and CRS specifications separately.

This is not (exactly) correct. GIS software implements standards.
As Clemens also pointed out, different approaches are used in existing standards. ESRI shapefiles are essentially a collection of files, one of which describes the CRS (you cannot mix CRSs in the same file). On the other hand, GML documents, GeoSPARQL documents, and KML files have this information either hard-coded or defined at different granularities within their content.

> 1. It should be possible to specify the CRS at the level of a data set or a collection of geometries.

We should be careful here. One could argue that we should be able to define the CRS at the level of a triple, at the level of an rdf:set, at the level of a named graph, and so on (see, for example, the relevant research on provenance). I think that the simplest way to go is to define it at the finest level of granularity, which is a triple. This allows all other cases to be covered as well. After all, RDF is not known for being laconic :D Anyhow, WKT and GML provide geometry collections like MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, etc., so you can specify a single CRS for a geometry collection.

> 1. It should be made easy for storage media to index the CRS.

Since I have been heavily involved in the implementation of a geospatial RDF store (http://strabon.di.uoa.gr), I fully agree with this. Making storage easy, however, is only possible when RDF terms are self-contained. That said, when designing a vocabulary we should not be driven by implementations.

> 1. It should be possible to easily select data based on CRS in SPARQL queries.

GeoSPARQL and stSPARQL already provide a function for doing so.

> 1. Having multiple specifications of the same CRS for a single geometry should be possible.
> 2. Having multiple specifications of the same coordinate sequence for a single geometry should be possible.

I understand that it is desirable to use multiple serializations that use different CRSs, but I do not understand what exactly you mean here.
Can you elaborate on this?

> 1. There should not be a single authority for specifying CRS's (especially if we want specifications to last until the sun goes nova).

+1. This is why using just a URI is a good choice!

> 1. Next to CRS there are other geometry properties that could be important, like level of detail. Do they need to be put in the same literal too? That would make things even messier.

Do you need this information to interpret the coordinates?

> 1. It should be possible to select only the CRS or only the coordinates (for specific use cases).

GeoSPARQL and stSPARQL offer a function for selecting the CRS of a geometry, and you can use a simple regex to get just the coordinates. As you say, this is for specific use cases, so why reinvent the wheel instead of using existing standards?

> 1. Processing the coordinates requires removing the CRS specification from the string, which is undesirable extra processing.

On the contrary! When storing a spatial literal, you start by reading the CRS, create the appropriate precision model, and then use, for example, a WKT parser for the rest. If you have to keep triples in memory or on secondary storage until all required parts are gathered, things can get extremely costly very easily (for example, performance will vary according to the ordering of the triples within a file!).

> 1. It should be possible to make statements about the CRS.
> 2. It should be possible to dereference a CRS.

I fully agree with you, but I think that this is not required in the context of this group. This should be the work of another group, since many vocabularies for representing CRS have to be unified, which is definitely not an easy task (and definitely out of scope for this group).

> And I am quite sure this list is not exhaustive. Is it possible to have an overview of advantages of concatenating the CRS and the coordinates?
>

The most important aspect of having the CRS inside a spatial literal is that it relies on RDF and nothing more, which makes it a good choice for the linked open data world. What you propose has to be encoded as an OWL 2 ontology with the appropriate cardinality restrictions, and then users are forced to adopt this ontology. The linked data story so far (even though I do not entirely agree with this :) ) has shown that the only way to achieve adoption is by enforcing as few restrictions as possible.

> Put in a more general perspective: If geographical data are going to exist in the web of Linked Data, it is good to depart from historical constructs if better solutions are on offer. Using a URI to identify a CRS is a very good step in the rights direction. But why make it impossible to use that same URI in an RDF triple?

Linked geospatial data are already here (one way or another) :) For example, a few months ago we published more than 100GB of linked geospatial data just by publishing two datasets (http://datahub.io/organization/teleios). I fully agree with you that we should not reinvent the wheel. That's why I am arguing in favor of using existing standards for this. So far, I have not been convinced that, functionality-wise, anything is missing from existing standards like GeoSPARQL. We have pointed out some small issues here and there (spatial aggregates, a transformation function) and some more important issues (the temporal dimension of spatial data), but I am not convinced that there is something fundamentally wrong with it. If we follow the evolution of historical constructs, we can see that OGC started by separating serializations from CRS (while precisely defining a mechanism for associating CRSs and geometries), and then moved on and combined them (GML and GeoSPARQL).
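As a rough illustration of the combined form, here is a minimal Python sketch of reading a spatial literal in one pass: take the CRS URI first, then hand the rest to a WKT parser. It assumes the GeoSPARQL convention of an optional "<uri>" prefix in front of the WKT text, with CRS84 as the default when no URI is given. The function names split_wkt_literal and polygon_ring are hypothetical, and the toy parser only handles a single-ring POLYGON; it is not how Strabon or any other store actually does it.

```python
import re

# Assumption: the default CRS when the literal carries no URI (GeoSPARQL convention).
DEFAULT_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def split_wkt_literal(lexical_form):
    """Return (crs_uri, wkt) for a literal such as '<uri> POINT(0 0)'."""
    text = lexical_form.strip()
    if text.startswith("<"):
        # The CRS comes first, so it is known before any coordinate is read.
        uri, closing, rest = text[1:].partition(">")
        if not closing:
            raise ValueError("unterminated CRS URI")
        return uri.strip(), rest.strip()
    return DEFAULT_CRS, text


def polygon_ring(wkt):
    """Toy parser: outer ring of a single-ring 'POLYGON((x y, ...))' string."""
    match = re.fullmatch(r"POLYGON\s*\(\(([^()]*)\)\)", wkt.strip())
    if match is None:
        raise ValueError("only single-ring POLYGON is handled in this sketch")
    return [tuple(float(v) for v in pair.split()) for pair in match.group(1).split(",")]


literal = (
    "<http://www.opengis.net/def/crs/EPSG/0/28992> "
    "POLYGON((97372 487152,97372 580407,149636 580407,149636 487152,97372 487152))"
)
crs, wkt = split_wkt_literal(literal)
ring = polygon_ring(wkt)
print(crs)
print(ring[0], len(ring))
```

Everything needed to interpret the coordinates arrives in a single RDF term, so nothing has to be buffered while waiting for a separate CRS triple.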
I agree that it would be nice to be able to define the CRS for a set of geographic features, but this would mean that we would have to define an OWL 2 ontology (similar to the schemas defined in GML) for this purpose, thus shooting ourselves in the foot regarding the adoption of the proposed vocabulary.
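Coming back to the graph merge question near the top of this message, the ambiguity can be sketched in a few lines of Python. This is only a toy model: a graph is a plain set of (subject, predicate, object) tuples, merging is set union, and the helper name objects is made up for the example. The second half assumes GeoSPARQL-style literals that carry the CRS URI in front of the WKT text.

```python
RD = "http://www.opengis.net/def/crs/EPSG/0/28992"
WGS84 = "http://www.opengis.net/def/crs/EPSG/0/4326"
WKT_RD = "POLYGON((97372 487152,97372 580407,149636 580407,149636 487152,97372 487152))"
WKT_WGS84 = (
    "POLYGON((4.54103559631648 52.369221013436,4.52469206625503 53.2071950151372,"
    "5.30692041653441 53.210266927542,5.30843905756704 52.3722183594399,"
    "4.54103559631648 52.369221013436))"
)

# CRS and coordinates as separate properties of the same geometry node.
graph_a = {("ex1:myGeometry", "ex2:asWKT", WKT_RD), ("ex1:myGeometry", "ex2:CRS", RD)}
graph_b = {("ex1:myGeometry", "ex2:asWKT", WKT_WGS84), ("ex1:myGeometry", "ex2:CRS", WGS84)}


def objects(graph, subject, predicate):
    """All objects of the triples matching the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


merged = graph_a | graph_b
wkts = objects(merged, "ex1:myGeometry", "ex2:asWKT")
crss = objects(merged, "ex1:myGeometry", "ex2:CRS")
# Nothing in the merged graph says which CRS belongs to which coordinate sequence.
candidate_pairings = [(w, c) for w in wkts for c in crss]
print(len(wkts), len(crss), len(candidate_pairings))  # 2 2 4

# With the CRS inside the literal, each term stays interpretable after the merge.
literal_a = {("ex1:myGeometry", "ex2:asWKT", f"<{RD}> {WKT_RD}")}
literal_b = {("ex1:myGeometry", "ex2:asWKT", f"<{WGS84}> {WKT_WGS84}")}
self_contained = objects(literal_a | literal_b, "ex1:myGeometry", "ex2:asWKT")
print(len(self_contained))  # 2
```

In the first case the merged graph admits four (WKT, CRS) pairings, only two of which are right; in the second, each of the two literals still names its own CRS.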
Received on Friday, 3 January 2014 15:35:45 UTC