
Addresses have no easy identity was Re: Blank Nodes Re: Toward easier RDF: a proposal

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Tue, 4 Dec 2018 11:24:55 +0000
To: semantic-web@w3.org
Message-ID: <b8b7e97b-e275-ddd7-3765-3ee483f8a427@gmail.com>

I don't want to get embroiled in the main thread(s) but, just in case 
anyone is *really* dealing with UK addresses rather than using them as 
rhetorical examples, then ...

On 03/12/2018 23:37, Anthony Moretti wrote:
> I see your point Hugh, especially in your case because for UK addresses 
> consisting of only house number and postcode structural equality is 
> sufficient for address equality. Decentralized will work very well in 
> that case.

Sadly that's a long way from being true. UK addresses within a postcode 
may be identified by house name, house name + number, business name (with 
no house name or number at all), any of those plus a secondary address, 
and so on. Even when there's a house "number", sometimes it's actually a 
number range rather than a single number, and there's considerable 
ambiguity about how those ranges are expressed and what the "definitive" 
range for a given property really is.

Identity of UK addresses is simply not something you can express in OWL 
or any logic close to it. You need an address reconciliation algorithm 
to map your address to a maintained identifier set such as a UPRN or 
UDPRN. The reconciliation process will have error rates that you will 
need to manage and recover from; there's no closed, guaranteed algorithm.
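
To give a rough feel for the shape of such a reconciliation step, here is 
a minimal sketch. Everything in it is an assumption made for illustration: 
the REFERENCE table, the "ID-000x" identifiers (they are not real UPRNs or 
UDPRNs), the made-up addresses, the normalise() and reconcile() helpers, 
and the 0.85 threshold. Plain string similarity stands in for what would 
really be a far more elaborate matcher.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Hypothetical reference set: identifier -> canonical address string.
# The identifiers and addresses are invented for illustration only.
REFERENCE = {
    "ID-0001": "10-12 HIGH STREET ZZ9 9ZZ",
    "ID-0002": "ROSE COTTAGE HIGH STREET ZZ9 9ZZ",
}


def normalise(text: str) -> str:
    """Upper-case, turn commas into spaces, and collapse whitespace."""
    return " ".join(text.upper().replace(",", " ").split())


def reconcile(raw: str, threshold: float = 0.85) -> Tuple[Optional[str], float]:
    """Return (identifier, score) for the best match, or (None, score)
    when the best score falls below the threshold."""
    query = normalise(raw)
    best_id: Optional[str] = None
    best_score = 0.0
    for ident, canonical in REFERENCE.items():
        score = SequenceMatcher(None, query, canonical).ratio()
        if score > best_score:
            best_id, best_score = ident, score
    if best_score < threshold:
        # Below threshold: do not guess, leave the record for review.
        return None, best_score
    return best_id, best_score


print(reconcile("Rose Cottage, High St, ZZ9 9ZZ"))
print(reconcile("Flat 3, The Old Mill"))
```

The point of returning the score and allowing a None result is that the 
caller has to decide what to do with weak or missing matches, which is 
where the error management comes in.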

Once you have the UPRN or UDPRN or whatever, you can create URIs or some 
inverse functional property as you wish. Except that even then, official 
identifier schemes like those aren't perfect and have ... oddities ... 
in them that can still mess you up.

Generating unique keys for resources based on hashing a few properties 
is all very well in simple cases but, at least in my experience, real 
world problems are nothing like that simple or clean. You need serious 
effort to create and maintain identifier schemes and to reconcile source 
data against those schemes. Details like URIs or bNodes seem to me 
rather down in the noise.
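
As a small illustration of where the hashing approach runs out, here is a 
sketch of a key built from house designation plus postcode. The 
address_key() helper, its naive normalisation, and the made-up postcode 
"ZZ9 9ZZ" are all assumptions for the example, not anything proposed in 
the thread.

```python
import hashlib


def address_key(house: str, postcode: str) -> str:
    """Hash a naively normalised (house, postcode) pair into a hex key."""
    # Normalisation is deliberately simple: upper-case and drop whitespace.
    parts = ("".join(p.upper().split()) for p in (house, postcode))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Trivial formatting differences are absorbed by the normalisation ...
same = address_key("10", "zz9 9zz") == address_key(" 10 ", "ZZ99ZZ")
print(same)

# ... but different ways of designating what may be one property are not.
variants = ["10-12", "10 to 12", "Rose Cottage"]
keys = {address_key(v, "ZZ9 9ZZ") for v in variants}
print(len(keys))
```

Each variant hashes to its own key, so nothing in the keys themselves 
tells you that they might refer to the same property.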

Dave

> On Mon, Dec 3, 2018 at 3:07 PM Nathan Rixham <nathan@webr3.org> wrote:
> 
>     Hugh, do you mean something like
>     bnode.id = sha256(serialise(bnode))
> 
>     On Mon, 3 Dec 2018, 22:58 Hugh Glaser <hugh@glasers.org> wrote:
> 
>         This is not directly about blank nodes, but is a reply to a
>         message in the thread.
> 
>         I’m certainly agreeing that we should work towards common
>         understanding of Thing equality.
>         And addresses are a great place to start.
>         In order for equality to be defined, I think that means you
>         first need an idea of what an unambiguous address looks like.
> 
>         Having an oracle that defines what an unambiguous Thing looks
>         like is one organisational structure, and it would be great if
>         schema.org could lead the way.
>         It particularly helps people who just want an off the shelf
>         solution, especially if they have no knowledge of the Thing domain.
> 
>         However I (and perhaps David Booth) am after something more
>         anarchic, that can function in a decentralised way (if I dare to
>         use that term! :-) )
>         For example, I might decide that I think that House Number and
>         PostCode is enough.
>         (UK people will know that this is a commonly-used way of
>         choosing an address, although it may well not be satisfactory
>         for some purposes, I’m sure.)
>         That may well be sufficient for me to interwork with datasets
>         from Companies House, the Land Registry and a bunch of other
>         UK-based organisations, plus many other datasets.
> 
>         Having a simple standard way to create keys for such things
>         facilitates that, without any standardisation process and all
>         that entails in weaknesses and strengths of trying to get
>         agreement on what an unambiguous address might look like on a
>         world scale for all purposes.
> 
>         Just generating a URI, without needing to make any service calls
>         (having found where they are and chosen the one you want and
>         compromised on it, etc.) or anything seems to me a way of making
>         all the interlinking so much more accessible for us all.
>         It is even future proof:- using such a URI means that if it is
>         about something new (UK postcodes change all the time :-(, and
>         there are more dead ones than live ones), the oracle doesn’t
>         tell me anything it didn’t have until I ask again.
>         In a key-generating world, my new shiny key will slowly align
>         with all the other key URIs as they get created.
> 
>         So yeah, all strength to anyone who wants to take on the central
>         roles, but not at the expense of killing the anarchic solution,
>         please.
> 
>         Cheers
> 
>          > On 3 Dec 2018, at 22:10, Anthony Moretti
>          > <anthony.moretti@gmail.com> wrote:
>          >
>          > Cheers for agreeing William. On the topic of incomplete blank
>          > nodes Henry I'd give them another type, the partial address
>          > example you give I'd give the type AddressComponent, or
>          > something to that effect. I could be wrong, but it's not a valid
>          > Address if it's a blank node and no other information in the
>          > graph completes it.
>          >
>          > Anthony
>          >
>          > On Mon, Dec 3, 2018 at 1:56 PM William Waites
>          > <wwaites@tardis.ed.ac.uk> wrote:
>          > > standards like schema:PostalAddress should possibly define
>          > > relevant operations like equality checking too.
>          >
>          > Exactly.
>          >
>          >
> 
Received on Tuesday, 4 December 2018 11:25:21 UTC
