Re: A real world example: Dutch registry of buildings and addresses

On 2014-05-09 17:57, Bart van Leeuwen wrote:
> Hi Frans,
>
> Nice work, some of the vocabs are not served as XML, but plain text.
At the moment, there is a version in RDF/XML 
(http://lod.geodan.nl/vocab/bag) and a version in Turtle 
(http://lod.geodan.nl/vocab/bag.ttl). I want to add an HTML version (I am 
looking for the best way to automatically generate a nice HTML file from 
a vocabulary) and I should also get content negotiation working. 
Thanks for reminding me!
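
As a rough illustration of what content negotiation for the vocabulary URI could involve, here is a minimal Python sketch that picks a media type from an HTTP Accept header. The list of offered types (including text/html, which does not exist yet), the function names, and the fallback to RDF/XML are assumptions for illustration, not a description of the actual server setup.

```python
# Minimal sketch of server-side content negotiation for a vocabulary URI.
# OFFERED is an assumption: RDF/XML and Turtle exist today, HTML is planned.
OFFERED = ["text/turtle", "application/rdf+xml", "text/html"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # q defaults to 1 when not given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((fields[0].lower(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_media_type(header, offered=OFFERED, default="application/rdf+xml"):
    """Pick the offered media type that best matches the Accept header."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in offered:
            return media_type
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in offered:
                if candidate.startswith(prefix):
                    return candidate
    # Simplification: a strict server could answer 406 Not Acceptable here.
    return default


print(choose_media_type("text/turtle"))
print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

A browser-style Accept header would then be steered to the HTML version, while a client asking for text/turtle would get Turtle; anything unrecognised falls back to RDF/XML in this sketch.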
>
> As for inference rules I think virtuoso should be able to help you 
> with that.
I tried the Virtuoso inference rules on another dataset. They work, but 
what I don't get is that a rule is not applied in a SPARQL query 
unless it is explicitly requested with an input:inference pragma. How is a 
passer-by supposed to know that there is an inference rule that can be 
applied, what that rule is called, and how it should be 
invoked? But perhaps I am missing something...
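
To make the discoverability problem concrete, here is a small Python sketch of what a client would have to do to get inference applied: prepend Virtuoso's "define input:inference" pragma to the query before sending it to the endpoint. The rule set name below is hypothetical (no rule set has been created for this dataset), and the helper names are made up for illustration; the sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://lod.geodan.nl/sparql"  # endpoint named in the dataset metadata
RULE_SET = "urn:example:bag-rules"        # hypothetical rule set name (assumption)

# The LOCN query from the original message, which returns nothing without inference.
QUERY = """prefix locn: <http://www.w3.org/ns/locn#>
select *
from <http://lod.geodan.nl/basisreg/bag/ligplaats/>
where {
   ?s a locn:Location .
}"""


def with_inference(query, rule_set):
    """Prepend the Virtuoso pragma that switches on a named rule set."""
    return 'define input:inference "%s"\n%s' % (rule_set, query)


def request_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = request_url(ENDPOINT, with_inference(QUERY, RULE_SET))
print(url)
```

Note that both the pragma syntax and the rule set name have to be known to the client in advance; nothing in the query results or the endpoint URL reveals them.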
>
> Met Vriendelijke Groet / With Kind Regards
> Bart van Leeuwen
>
> ##############################################################
> # twitter: @semanticfire
> # netage.nl
> # http://netage.nl
> # Enschedepad 76
> # 1324 GJ Almere
> # The Netherlands
> # tel. +31(0)36-5347479
> ##############################################################
>
>
>
> From: Frans Knibbe | Geodan <frans.knibbe@geodan.nl>
> To: "public-locadd@w3.org Mailing list" <public-locadd@w3.org>
> Date: 09-05-2014 17:41
> Subject: A real world example: Dutch registry of buildings and addresses
> ------------------------------------------------------------------------
>
>
>
> Hello list,
>
> I have just finished (I think) a renewed publication of a dataset that 
> could serve as a nice real world example of application of the core 
> location vocabulary.
> The dataset is the Dutch registry of buildings and addresses. It 
> consists of about 573 million triples. The URI of the dataset is 
> <http://lod.geodan.nl/basisreg/bag/>. This URI should be enough to 
> enable usage of the dataset as it should provide the data necessary 
> for further exploration. The dataset is bilingual: all terms in the 
> main vocabulary have explanations in Dutch and English.
>
> I would be happy with any comments from this group on this data set, 
> or the associated vocabulary. I hope I have done some things right, 
> but probably there is some room for improvement.
>
> Anyway, I would like to list some of the issues that I have 
> encountered that have something to do with the core location 
> vocabulary. I would love to know what you think about these!
>
> About *metadata*: The dataset URI 
> (<http://lod.geodan.nl/basisreg/bag/>) resolves to dataset metadata. 
> Because this dataset contains location data (locations, addresses, 
> geometries) I think some special metadata are called for.
>
> Issue 1: I feel that it is important to let it be known that a 
> dataset is of a geographical nature, i.e., that a consumer can expect 
> data about locations in it. As far as I know, there is no 
> well-established way of making such a statement. For this dataset, I 
> specified <http://www.w3.org/ns/locn> as one of the main vocabularies 
> used (using void:vocabulary) and I specified the spatial extent of the 
> data (using dcterms:spatial). What do you think?
>
> Issue 2: Spatial extent: The spatial extent of the dataset is 
> specified by both a geometry and a DBpedia reference to the 
> Netherlands. I think that is sufficient.
>
> Issue 3: CRS: I can think of no way to specify the CRS used in the 
> data. An extension of LOCN to enable this would be welcome, I think.
>
> Issue 4: Level of detail / spatial resolution: This would be 
> applicable to the subsets (which are named graphs) within the dataset. 
> I think that information could be useful to consumers, but I cannot 
> think of a way to express it.
>
> About *geometry*:
>
> Issue 5: The geometries in the source data use the Dutch national 
> CRS. I have transformed them to WGS84 lon/lat for several reasons:
> a) The triple store used (Virtuoso) does not support other CRSs yet.
> b) I really do not like WKT literals with prefixed CRS URIs, as 
> mandated by GeoSPARQL.
> c) WGS84 is more common and will be more useful, especially 
> internationally.
>
> The only drawback I can think of is that this transformation would not 
> be adequate for very detailed geometries. Because these data are 
> European, it would be better to use ETRS89. The current standard is far 
> more useful for American data than for data from other continents!
>
> Issue 6: The publication is powered by Virtuoso 7.1. This means there 
> are capabilities for using topological functions in SPARQL. The 
> following example asks for the name of the town in which a point (which 
> could be your current location) is located, using the function 
> st_within(). The SPARQL endpoint is <http://lod.geodan.nl/sparql>, as 
> specified in the metadata.
>
> prefix bag: <http://lod.geodan.nl/vocab/bag#>
> prefix xsd: <http://www.w3.org/2001/XMLSchema#>
> select ?name
> from <http://lod.geodan.nl/basisreg/bag/woonplaats/>
> where {
>    ?wpmut a bag:Woonplaatsmutatie .
>    ?wpmut bag:lastKnown "true"^^xsd:boolean .
>    ?wpmut bag:geometrie ?geom .
>    ?wpmut bag:naam ?name
>    filter (bif:st_within(?geom, bif:st_point (6.56,53.21)))
> }
>
> It is not perfect yet: topological functions operate on the bounding 
> boxes of geometries, not on the geometries themselves. Also, it is not 
> yet possible to use GeoSPARQL expressions. According to people at 
> OpenLink, these issues will be resolved soon, in an upcoming version of 
> Virtuoso.
>
> About application of *LOCN*:
>
> Issue 7: If you take a look at the vocabulary I made for this dataset 
> (<http://lod.geodan.nl/vocab/bag> or 
> <http://lod.geodan.nl/vocab/bag.ttl>), you can see that I tried to 
> apply LOCN. Mostly, classes are defined as subclasses of LOCN classes 
> and properties are defined as subproperties of LOCN properties. But 
> without special measures, one cannot use LOCN terms in SPARQL queries. 
> The following example returns nothing because I have not created 
> explicit triples for LOCN classes, and neither have I made inference 
> rules (<http://docs.openlinksw.com/virtuoso/rdfsparqlrule.html>). So I 
> wonder if it is really worthwhile to use LOCN, or to use it in the way 
> that I have.
>
> prefix locn: <http://www.w3.org/ns/locn#>
> select *
> from <http://lod.geodan.nl/basisreg/bag/ligplaats/>
> where {
>    ?s a locn:Location .
> }
>
> Or, to put it in different words: what is the added value of LOCN in this 
> case? And how could that added value be increased?
>
>
> Regards,
> Frans
>
>
> ------------------------------------------------------------------------
> Frans Knibbe
> Geodan
> President Kennedylaan 1
> 1079 MB Amsterdam (NL)
>
> T +31 (0)20 - 5711 347
> E frans.knibbe@geodan.nl
> www.geodan.nl <http://www.geodan.nl/> | disclaimer 
> <http://www.geodan.nl/disclaimer>
> ------------------------------------------------------------------------
>


------------------------------------------------------------------------
Frans Knibbe
Geodan
President Kennedylaan 1
1079 MB Amsterdam (NL)

T +31 (0)20 - 5711 347
E frans.knibbe@geodan.nl
www.geodan.nl <http://www.geodan.nl> | disclaimer 
<http://www.geodan.nl/disclaimer>
------------------------------------------------------------------------

Received on Tuesday, 13 May 2014 08:04:28 UTC