Choosing the right format & vocabulary for spatial data

All.

We Best Practice editors are trying to figure out how we should write the
Best Practice(s) relating to data format and vocabulary ... currently
that's SDWBP 7 [1].

As agreed at our last BP sub-group call [2], we're restructuring the SDW BP doc
based on the DWBP document [3].

Data formats are described in DWBP 13 [4] and data vocabularies in DWBP 15
[5]. Both of these best practices provide generic advice. We want to
provide some actionable advice for how to choose the right data format
and/or vocabulary for spatial data.

We also need to make this guidance prescriptive ... so we need to tell
people what their choices are based around today's options.

We wonder if it's possible to make a single 'decision tree' that readers
could use to help them make the choice?

The kind of questions that might be used in such a decision tree include:
* does the data format support Web linking (see RFC5988 [6]) ... this is a
_MUST_
* what dimensionality (of SpatialThings) is needed?
* what types of geometry are needed (e.g. polygons with holes, relative
positioning)?
* do you need a CRS other than WGS84?
* what technical environments / tool chains does your target community
predominantly use?

... etc.
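
To make the idea concrete, here is a rough sketch of how those questions
could be applied as a filter over candidate formats. Everything in it is an
assumption for illustration: the FormatProfile and Requirements classes, the
shortlist function, and the "format-a/b/c" entries are all made up, and the
capability values are placeholders rather than statements about any real
format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatProfile:
    """Capabilities of one candidate format (all values below are placeholders)."""
    name: str
    supports_web_linking: bool
    max_dimensions: int
    geometry_types: frozenset
    supports_other_crs: bool
    tool_chains: frozenset


@dataclass(frozen=True)
class Requirements:
    """A publisher's answers to the questions in the list above."""
    dimensions: int
    geometry_types: frozenset
    needs_other_crs: bool
    tool_chains: frozenset


def shortlist(reqs, candidates):
    """Drop formats that fail a hard requirement, then rank by tool-chain overlap."""
    suitable = []
    for fmt in candidates:
        if not fmt.supports_web_linking:  # Web linking is the _MUST_
            continue
        if fmt.max_dimensions < reqs.dimensions:
            continue
        if not reqs.geometry_types <= fmt.geometry_types:
            continue
        if reqs.needs_other_crs and not fmt.supports_other_crs:
            continue
        suitable.append(fmt)
    # Prefer formats that the target community's tools already handle.
    suitable.sort(key=lambda f: len(f.tool_chains & reqs.tool_chains), reverse=True)
    return [fmt.name for fmt in suitable]


# Hypothetical candidates: names and capabilities are invented for the example.
ALL_GEOMETRIES = frozenset({"point", "polygon", "polygon-with-holes"})
CANDIDATES = [
    FormatProfile("format-a", True, 2, frozenset({"point", "polygon"}), False,
                  frozenset({"web"})),
    FormatProfile("format-b", True, 3, ALL_GEOMETRIES, True, frozenset({"gis"})),
    FormatProfile("format-c", False, 3, ALL_GEOMETRIES, True,
                  frozenset({"gis", "web"})),
]

simple_points = Requirements(2, frozenset({"point"}), False, frozenset({"web"}))
print(shortlist(simple_points, CANDIDATES))
```

A filter-then-rank shape like this treats the first four questions as hard
constraints and the tool-chain question as a preference; whether that split
is right is exactly the kind of thing we'd need to agree on.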

Choosing the vocabulary (e.g. RDF vocab / OWL ontology) only makes sense if
an RDF serialisation (RDF/XML, N-Triples, JSON-LD, TTL etc.) is used.

... but JSON-LD gives the opportunity to blur the boundaries with other
formats and _still_ use RDF.
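
As a small illustration of that blurring, the sketch below reads one document
both as plain JSON and as something whose keys map to vocabulary terms. The
http://example.org/ IRIs, the property names and the expand_keys helper are
all hypothetical, and the helper is a deliberately naive stand-in that only
handles string-valued context entries; it is not the JSON-LD expansion
algorithm.

```python
import json

# A JSON document with a JSON-LD @context. All IRIs are made-up examples.
DOC_TEXT = """
{
  "@context": {
    "name": "http://example.org/vocab#name",
    "lat": "http://example.org/vocab#lat",
    "long": "http://example.org/vocab#long"
  },
  "@id": "http://example.org/places/1",
  "name": "Example place",
  "lat": 53.8,
  "long": -1.5
}
"""

doc = json.loads(DOC_TEXT)

# A consumer that only knows JSON can ignore @context entirely.
plain_view = (doc["name"], doc["lat"], doc["long"])


def expand_keys(document):
    """Map each short key to the IRI given for it in @context (naive sketch)."""
    context = document["@context"]
    return {context[key]: value
            for key, value in document.items()
            if key in context}


# An RDF-aware consumer would see properties identified by IRIs instead.
rdf_like_view = expand_keys(doc)
print(plain_view)
print(rdf_like_view)
```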

Similarly, CSV as "tabular data" can be converted to JSON [7] or RDF [8]
...
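
For a feel of what "tabular data to JSON" means in the simplest case, here is
a minimal sketch using only the Python standard library. It is not the
conversion procedure defined in [7]; the column names, the sample rows and
the rows_to_objects helper are assumptions made up for the example.

```python
import csv
import io
import json

# Hypothetical tabular data: one row per place, coordinates in two columns.
CSV_TEXT = (
    "id,name,lat,long\n"
    "1,Example place,53.8,-1.5\n"
    "2,Another place,51.5,-0.1\n"
)


def rows_to_objects(text):
    """Turn each CSV row into a JSON-ready dict, parsing coordinates as numbers."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "id": row["id"],
            "name": row["name"],
            "lat": float(row["lat"]),
            "long": float(row["long"]),
        }
        for row in reader
    ]


print(json.dumps(rows_to_objects(CSV_TEXT), indent=2))
```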

I'm not sure what the 'decision tree' should look like, or even if this is
the correct approach.

Furthermore, should we be referencing the emerging GeoSPARQL 1.1 ontology
being developed by Josh [9]? It aims to add more [geometry] serialisations
that make the vocabulary more widely usable.

I'm also mindful that people often use a hybrid approach, mixing
best-of-breed technologies for RDF and spatial ...

***Your thoughts please***

As an implementer, what questions are important to you when choosing the
format? How would you structure this best practice? Is this even achievable?

Many thanks, Jeremy


[1]: http://w3c.github.io/sdw/bp/#describe-geometry
[2]: http://www.w3.org/2016/07/13-sdwbp-minutes
[3]: http://w3c.github.io/dwbp/publishing-snapshots/CR-dwbp-20160706/
[4]: http://w3c.github.io/dwbp/publishing-snapshots/CR-dwbp-20160706/#MachineReadableStandardizedFormat
[5]: http://w3c.github.io/dwbp/publishing-snapshots/CR-dwbp-20160706/#ReuseVocabularies
[6]: https://tools.ietf.org/html/rfc5988
[7]: http://www.w3.org/TR/csv2json/
[8]: http://www.w3.org/TR/csv2rdf/
[9]: https://www.w3.org/2015/spatial/wiki/Further_development_of_GeoSPARQL

Received on Monday, 25 July 2016 11:15:18 UTC