Re: Addresses have no easy identity was Re: Blank Nodes Re: Toward easier RDF: a proposal

Everybody who is interested in representing addresses should read at least
this:

https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/

TomP

On 12/4/2018 5:22 PM, Anthony Moretti wrote:
> You've got to admit that this conversation about addresses has been 
> valuable though, blank nodes were the original topic of the thread after 
> all. A lot of developers before Hugh have tried to do simple address 
> matching, myself being one when I was trying to create a food delivery 
> startup. It's been good to talk this through in the context of RDF, I 
> appreciate everybody's comments.
> 
> Anthony
> 
> On Tue, Dec 4, 2018 at 1:31 PM Dan Brickley <danbri@danbri.org> wrote:
> 
> 
>     ("Details like URIs or bNodes seem to me rather down in the noise.")
> 
>     Thanks, Dave. This chimes with a lot of our experience at Google
>     using Schema.org data (roughly RDF triples) from the Web, fwiw.
> 
>     Dan
> 
> 
>     On Tue, 4 Dec 2018, 03:30 Dave Reynolds <dave.e.reynolds@gmail.com> wrote:
> 
>         I don't want to get embroiled in the main thread(s) but, just in case
>         anyone is *really* dealing with UK addresses rather than using them as
>         rhetorical examples, then ...
> 
>         On 03/12/2018 23:37, Anthony Moretti wrote:
>          > I see your point Hugh, especially in your case because for UK addresses
>          > consisting of only house number and postcode structural equality is
>          > sufficient for address equality. Decentralized will work very well in
>          > that case.
> 
>         Sadly that's a long way from being true. UK addresses within a postcode
>         may be identified by house name, house name + number, business name (with
>         no house name or number at all), any of those plus a secondary address
>         etc etc. Even when there's a house "number" sometimes it's actually a
>         number range not a single number and there's considerable ambiguity on
>         how those ranges are expressed and what the "definitive" range for a
>         given property really is.
> 
>         Identity of UK addresses is simply not something you can express in OWL
>         or any logic close to it. You need an address reconciliation algorithm
>         to map your address to a maintained identifier set such as a UPRN or
>         UDPRN. The reconciliation process will have error rates that you will
>         need to manage and recover from, there's no closed, guaranteed
>         algorithm.
> 
>         Once you have the UPRN or UDPRN or whatever you can create URIs or some
>         inverse functional property as you wish. Except that even then the
>         official identifier schemes like that aren't perfect and have ...
>         oddities ... in them that can still mess you up.
> 
>         Generating unique keys for resources based on hashing a few properties
>         is all very well in simple cases but, at least in my experience, real
>         world problems are nothing like that simple and clean. You need serious
>         effort to create and maintain identifier schemes and to reconcile source
>         data against those schemes. Details like URIs or bNodes seem to me
>         rather down in the noise.
> 
>         Dave
> 
>          > On Mon, Dec 3, 2018 at 3:07 PM Nathan Rixham <nathan@webr3.org> wrote:
>          >
>          >     Hugh, do you mean something like bnode.id =
>          >     sha256(serialise(bnode))
>          >
>          >     On Mon, 3 Dec 2018, 22:58 Hugh Glaser <hugh@glasers.org> wrote:
>          >
>          >         This is not directly about blank nodes, but is a reply to a
>          >         message in the thread.
>          >
>          >         I’m certainly agreeing that we should work towards common
>          >         understanding of Thing equality.
>          >         And addresses are a great place to start.
>          >         In order for equality to be defined, I think that means you
>          >         first need an idea of what an unambiguous address looks like.
>          >
>          >         Having an oracle that defines what an unambiguous Thing looks
>          >         like is one organisational structure, and it would be great if
>          >         schema.org could lead the way.
>          >         It particularly helps people who just want an off the shelf
>          >         solution, especially if they have no knowledge of the Thing domain.
>          >
>          >         However I (and perhaps David Booth) am after something more
>          >         anarchic, that can function in a decentralised way (if I dare to
>          >         use that term! :-) )
>          >         For example, I might decide that I think that House Number and
>          >         PostCode is enough.
>          >         (UK people will know that this is a commonly-used way of
>          >         choosing an address, although it may well not be satisfactory
>          >         for some purposes, I’m sure.)
>          >         That may well be sufficient for me to interwork with datasets
>          >         from Companies House, the Land Registry and a bunch of other
>          >         UK-based organisations, plus many other datasets.
>          >
>          >         Having a simple standard way to create keys for such things
>          >         facilitates that, without any standardisation process and all
>          >         that entails in weaknesses and strengths of trying to get
>          >         agreement on what an unambiguous address might look like on a
>          >         world scale for all purposes.
>          >
>          >         Just generating a URI, without needing to make any service calls
>          >         (having found where they are and chosen the one you want and
>          >         compromised on it, etc.) or anything seems to me a way of making
>          >         all the interlinking so much more accessible for us all.
>          >         It is even future proof:- using such a URI means that if it is
>          >         about something new (UK postcodes change all the time :-(, and
>          >         there are more dead ones than live ones), the oracle doesn’t
>          >         tell me anything it didn’t have until I ask again.
>          >         In a key-generating world, my new shiny key will slowly align
>          >         with all the other key URIs as they get created.
>          >
>          >         So yeah, all strength to anyone who wants to take on the central
>          >         roles, but not at the expense of killing the anarchic solution,
>          >         please.
>          >
>          >         Cheers
>          >
>          >          > On 3 Dec 2018, at 22:10, Anthony Moretti <anthony.moretti@gmail.com> wrote:
>          >          >
>          >          > Cheers for agreeing William. On the topic of incomplete blank
>          >          > nodes Henry I'd give them another type, the partial address
>          >          > example you give I'd give the type AddressComponent, or
>          >          > something to that effect. I could be wrong, but it's not a valid
>          >          > Address if it's a blank node and no other information in the
>          >          > graph completes it.
>          >          >
>          >          > Anthony
>          >          >
>          >          > On Mon, Dec 3, 2018 at 1:56 PM William Waites <wwaites@tardis.ed.ac.uk> wrote:
>          >          > > standards like schema:PostalAddress should possibly define relevant
>          >          > > operations like equality checking too.
>          >          >
>          >          > Exactly.
>          >          >
>          >          >
>          >
> 
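
For readers following the thread, the key-generation idea quoted above (hash a
few properties, e.g. house number plus postcode, and mint a URI from the digest)
might look roughly like the sketch below. The function names, the "|" separator,
the normalisation rules and the example.org namespace are all assumptions made
for illustration; nobody in the thread specified them.

```python
import hashlib


def address_key(house_number: str, postcode: str) -> str:
    """Derive a deterministic key from house number and postcode (illustrative)."""
    # Assumed normalisation: trim and upper-case the number, and drop all
    # whitespace from the postcode so "sw1a 2aa" and "SW1A2AA" agree.
    number = house_number.strip().upper()
    code = "".join(postcode.split()).upper()
    serialised = f"{number}|{code}"
    return hashlib.sha256(serialised.encode("utf-8")).hexdigest()


def address_uri(house_number: str, postcode: str) -> str:
    # The namespace is a placeholder, not a real identifier scheme.
    return "https://example.org/address/" + address_key(house_number, postcode)


# Two independently produced records agree without any service call ...
same = address_key("10", "SW1A 2AA") == address_key(" 10 ", "sw1a2aa")
# ... but two spellings of the same number range do not.
differ = address_key("10-12", "SW1A 2AA") != address_key("10 to 12", "SW1A 2AA")
```

The second comparison is the problem Dave describes: structurally different
spellings of one real-world address get different keys, so a scheme like this
only lines up where the source data is already regular.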

Received on Tuesday, 4 December 2018 23:19:55 UTC