- From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
- Date: Fri, 9 May 2014 17:57:18 +0200
- To: Frans Knibbe | Geodan <frans.knibbe@geodan.nl>
- Cc: public-locadd@w3.org
- Message-ID: <OF22D19B54.381A4508-ONC1257CD3.0057815C-C1257CD3.0057A51E@netage.nl>
Hi Frans,
Nice work. One remark: some of the vocabularies are served as plain text
rather than XML.
As for inference rules, I think Virtuoso should be able to help you with
that.
Met Vriendelijke Groet / With Kind Regards
Bart van Leeuwen
##############################################################
# twitter: @semanticfire
# netage.nl
# http://netage.nl
# Enschedepad 76
# 1324 GJ Almere
# The Netherlands
# tel. +31(0)36-5347479
##############################################################
From: Frans Knibbe | Geodan <frans.knibbe@geodan.nl>
To: "public-locadd@w3.org Mailing list" <public-locadd@w3.org>
Date: 09-05-2014 17:41
Subject: A real world example: Dutch registry of buildings and
addresses
Hello list,
I have just finished (I think) a renewed publication of a dataset that
could serve as a nice real world example of application of the core
location vocabulary.
The dataset is the Dutch registry of buildings and addresses. It consists
of about 573 million triples. The URI of the dataset is
http://lod.geodan.nl/basisreg/bag/. This URI should be enough to enable
usage of the dataset as it should provide the data necessary for further
exploration. The dataset is bilingual: all terms in the main vocabulary
have explanations in Dutch and English.
I would be happy with any comments from this group on this data set, or
the associated vocabulary. I hope I have done some things right, but
probably there is some room for improvement.
Anyway, I would like to list some of the issues that I have encountered
that have something to do with the core location vocabulary. I would love
to know what you think about these!
About metadata: The dataset URI (http://lod.geodan.nl/basisreg/bag/)
resolves to dataset metadata. Because this dataset contains location data
(locations, addresses, geometries) I think some special metadata are
called for.
Issue 1: I feel that it is important to let it be known that a dataset is
of a geographical nature, i.e., that a consumer can expect data about
locations in it. As far as I know, there is no well-established way
of making such a statement. For this dataset, I specified
<http://www.w3.org/ns/locn> as one of the main vocabularies used (using
void:vocabulary) and I specified the spatial extent of the data (using
dcterms:spatial). What do you think?
Issue 2: Spatial Extent: The spatial extent of the dataset is specified by
both a geometry and a dbpedia reference to the Netherlands. I think that
is sufficient.
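The two metadata statements from Issues 1 and 2 can be written down as
plain triples. The following is a minimal Python sketch, not the actual
published metadata: the DBpedia URI is an assumption (the message only says
"a dbpedia reference to the Netherlands"), and looks_geographic() is a
hypothetical consumer-side heuristic, not an established convention.

```python
# URIs are taken from the message, except NETHERLANDS, which is assumed.
DATASET = "http://lod.geodan.nl/basisreg/bag/"
LOCN = "http://www.w3.org/ns/locn"
VOID_VOCABULARY = "http://rdfs.org/ns/void#vocabulary"
DCT_SPATIAL = "http://purl.org/dc/terms/spatial"
NETHERLANDS = "http://dbpedia.org/resource/Netherlands"  # assumed URI

metadata = [
    (DATASET, VOID_VOCABULARY, LOCN),
    (DATASET, DCT_SPATIAL, NETHERLANDS),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def looks_geographic(triples, dataset):
    """Hypothetical heuristic: the dataset declares LOCN as a vocabulary
    or states a spatial extent with dcterms:spatial."""
    for s, p, o in triples:
        if s != dataset:
            continue
        if p == DCT_SPATIAL or (p == VOID_VOCABULARY and o == LOCN):
            return True
    return False


print(to_ntriples(metadata))
print(looks_geographic(metadata, DATASET))
```

A consumer that relies on such a heuristic would miss datasets that use
other geographic vocabularies, which is the gap Issue 1 points at.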
Issue 3: CRS: I can think of no way to specify the CRS used in the data.
An extension of LOCN to enable this would be welcome, I think.
Issue 4: Level of Detail / Spatial resolution: This would be applicable to
the subsets (which are named graphs) within the dataset. I think that
information could be useful to consumers, but I cannot think of a way to
express it.
About geometry:
Issue 5: The geometries in the source data use the Dutch national CRS. I
have transformed them to WGS84 lon/lat for several reasons:
a) The triple store used (Virtuoso) does not support other CRSs yet
b) I really do not like WKT literals with prefixed CRS URIs, as mandated
by GeoSPARQL
c) WGS84 is the more common CRS and will be more useful, especially
internationally.
The only drawback I can think of is that this transformation would not be
accurate enough for very detailed geometries. Because these data are
European, it would be better to use ETRS89. The current standard is far
more useful for American data than for data from other continents!
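For readers unfamiliar with the literal form mentioned under (b): in
GeoSPARQL 1.0 a geo:wktLiteral may start with a CRS URI in angle brackets,
and a literal without one is interpreted as CRS84 (lon/lat on WGS84). Below
is a minimal Python sketch of splitting such a literal. The function name is
hypothetical, and the use of EPSG:28992 for the Dutch national CRS is an
assumption for illustration; the message does not name the CRS identifier.

```python
import re

# Default CRS of a geo:wktLiteral that carries no URI prefix (GeoSPARQL 1.0).
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<crs-uri>" followed by whitespace, then the WKT text itself.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]+)>\s+)?(.+)$", re.DOTALL)


def split_wkt_literal(literal):
    """Return (crs_uri, wkt) for the lexical form of a geo:wktLiteral."""
    match = _WKT_LITERAL.match(literal)
    if match is None:
        raise ValueError("empty WKT literal")
    crs, wkt = match.groups()
    return (crs or CRS84), wkt.strip()


# Without a prefix the coordinates are lon/lat in CRS84.
print(split_wkt_literal("POINT(6.56 53.21)"))

# With a prefix (assumed here: EPSG:28992, made-up coordinates).
print(split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT(233000 582000)"
))
```

Publishing in lon/lat, as described above, keeps the lexical form free of
the prefix at the cost of the transformation.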
Issue 6: The publication is powered by Virtuoso 7.1. This means there are
capabilities for using topological functions in SPARQL. The following
example asks the name of the town in which a point (which could be your
current location) is located, using the function st_within(). The SPARQL
endpoint is http://lod.geodan.nl/sparql, as specified in the metadata.
prefix bag: <http://lod.geodan.nl/vocab/bag#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?name
from <http://lod.geodan.nl/basisreg/bag/woonplaats/>
where {
  ?wpmut a bag:Woonplaatsmutatie .
  ?wpmut bag:lastKnown "true"^^xsd:boolean .
  ?wpmut bag:geometrie ?geom .
  ?wpmut bag:naam ?name .
  # bif: is Virtuoso's prefix for built-in functions; the point is lon, lat
  filter (bif:st_within(?geom, bif:st_point(6.56, 53.21)))
}
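The query above can also be sent from a program using the standard SPARQL
protocol (a GET request with a "query" parameter). The Python sketch below
only builds the request and does not send it; the function name is
hypothetical, and whether the endpoint answers is not something the sketch
checks.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://lod.geodan.nl/sparql"  # endpoint named in the message

# Same query as above; braces are doubled because of str.format().
QUERY_TEMPLATE = """\
prefix bag: <http://lod.geodan.nl/vocab/bag#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?name
from <http://lod.geodan.nl/basisreg/bag/woonplaats/>
where {{
  ?wpmut a bag:Woonplaatsmutatie .
  ?wpmut bag:lastKnown "true"^^xsd:boolean .
  ?wpmut bag:geometrie ?geom .
  ?wpmut bag:naam ?name .
  filter (bif:st_within(?geom, bif:st_point({lon}, {lat})))
}}
"""


def build_request(lon, lat):
    """Build (but do not send) a SPARQL protocol GET request."""
    query = QUERY_TEMPLATE.format(lon=lon, lat=lat)
    url = ENDPOINT + "?" + urlencode({"query": query})
    # Ask for the standard JSON results format via content negotiation.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(6.56, 53.21)
print(request.full_url)
```

Passing the request to urllib.request.urlopen() would execute the query.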
The topological support is not perfect yet: the functions operate on the
bounding boxes of geometries, not on the geometries themselves. Also, it is
not yet possible to use GeoSPARQL expressions. According to people at
OpenLink, these issues will be resolved soon, in a future version of
Virtuoso.
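To illustrate why a bounding-box test can give false positives, here is a
minimal Python sketch that compares a bounding-box check with an exact
point-in-polygon check (ray casting). It says nothing about how Virtuoso
implements st_within(); the triangle and the points are made-up values.

```python
def bbox_contains(polygon, point):
    """True if the point lies inside the polygon's bounding box."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x, y = point
    return min(xs) <= x <= max(xs) and min(ys) <= y <= max(ys)


def polygon_contains(polygon, point):
    """Ray casting (even-odd rule). Points exactly on the boundary are
    not treated specially in this sketch."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A made-up triangular "town" outline in lon/lat.
triangle = [(6.0, 53.0), (7.0, 53.0), (6.0, 54.0)]
outside_point = (6.8, 53.8)  # inside the bounding box, outside the triangle

print(bbox_contains(triangle, outside_point))
print(polygon_contains(triangle, outside_point))
```

With bounding-box semantics, a point near a town border can therefore be
reported as lying in more than one town.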
About application of LOCN:
Issue 7: If you take a look at the vocabulary I made for this dataset (
http://lod.geodan.nl/vocab/bag or http://lod.geodan.nl/vocab/bag.ttl), you
can see that I tried to apply LOCN. Mostly, classes are defined as being
subclasses of LOCN classes and properties are defined as being
subproperties of LOCN properties. But without special measures, one
cannot use LOCN terms in SPARQL queries. The following example returns
nothing because I have neither created explicit triples for the LOCN
classes nor defined inference rules. So I wonder whether it is really
worthwhile to use LOCN, or to use it in the way that I have.
prefix locn: <http://www.w3.org/ns/locn#>
select *
from <http://lod.geodan.nl/basisreg/bag/ligplaats/>
where {
  ?s a locn:Location .
}
Or, to put it in different words: what is the added value of LOCN in this
case? And how could that added value be increased?
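One of the "special measures" mentioned above is to materialize the
rdf:type triples that follow from rdfs:subClassOf, so that a pattern such as
"?s a locn:Location" matches without inference at query time. The following
is a minimal Python sketch of that idea on in-memory triples. The class name
bag:Ligplaats and the instance ex:lp1 are hypothetical; the message does not
list the actual class names in this graph.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def materialize_types(triples):
    """Add the rdf:type triples entailed by rdfs:subClassOf (transitively)."""
    parents = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents.setdefault(s, set()).add(o)

    def ancestors(cls):
        # Walk up the class hierarchy; 'seen' also guards against cycles.
        seen, stack = set(), [cls]
        while stack:
            for parent in parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE:
            for superclass in ancestors(o):
                inferred.add((s, RDF_TYPE, superclass))
    return inferred


# Hypothetical data: one instance and one subclass axiom.
triples = [
    ("ex:lp1", RDF_TYPE, "bag:Ligplaats"),
    ("bag:Ligplaats", SUBCLASS_OF, "locn:Location"),
]

result = materialize_types(triples)
print(("ex:lp1", RDF_TYPE, "locn:Location") in result)
```

The trade-off is a larger store versus queries that work with plain LOCN
terms on any endpoint; inference rules in the triple store avoid the extra
triples but depend on the store.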
Regards,
Frans
Frans Knibbe
Geodan
President Kennedylaan 1
1079 MB Amsterdam (NL)
T +31 (0)20 - 5711 347
E frans.knibbe@geodan.nl
www.geodan.nl
Received on Friday, 9 May 2014 15:57:51 UTC