Re: Addresses have no easy identity was Re: Blank Nodes Re: Toward easier RDF: a proposal

From: Anthony Moretti <anthony.moretti@gmail.com>
Date: Tue, 4 Dec 2018 15:41:09 -0800
Message-ID: <CACusdfTSmHaHKOEpWCaKjfEHJwhVzHtrFOVMek_5i5cuw54gfA@mail.gmail.com>
To: tpassin@tompassin.net
Cc: Semantic Web <semantic-web@w3.org>
A good read Thomas. And here we are all trying to roll our own algorithms.

Anthony

On Tue, Dec 4, 2018 at 3:24 PM Thomas Passin <tpassin@tompassin.net> wrote:

> Everybody who is interested in representing addresses should read at least
> this -
>
> https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
>
> TomP
>
> On 12/4/2018 5:22 PM, Anthony Moretti wrote:
> > You've got to admit that this conversation about addresses has been
> > valuable though, blank nodes were the original topic of the thread after
> > all. A lot of developers before Hugh have tried to do simple address
> > matching, myself being one when I was trying to create a food delivery
> > startup. It's been good to talk this through in the context of RDF, I
> > appreciate everybody's comments.
> >
> > Anthony
> >
> > On Tue, Dec 4, 2018 at 1:31 PM Dan Brickley <danbri@danbri.org> wrote:
> >
> >
> >     ("Details like URIs or bNodes seem to me rather down in the noise.")
> >
> >     Thanks, Dave. This chimes with a lot of our experience at Google
> >     using Schema.org data (roughly RDF triples) from the Web, fwiw.
> >
> >     Dan
> >
> >
> >     On Tue, 4 Dec 2018, 03:30 Dave Reynolds <dave.e.reynolds@gmail.com>
> >     wrote:
> >
> >         I don't want to get embroiled in the main thread(s) but, just in
> >         case anyone is *really* dealing with UK addresses rather than using
> >         them as rhetorical examples, then ...
> >
> >         On 03/12/2018 23:37, Anthony Moretti wrote:
> >          > I see your point Hugh, especially in your case because for UK
> >          > addresses consisting of only house number and postcode structural
> >          > equality is sufficient for address equality. Decentralized will
> >          > work very well in that case.
> >
> >         Sadly that's a long way from being true. UK addresses within a
> >         postcode may be identified by house name, house name + number,
> >         business name (with no house name or number at all), any of those
> >         plus a secondary address, etc. Even when there's a house "number",
> >         sometimes it's actually a number range, not a single number, and
> >         there's considerable ambiguity about how those ranges are expressed
> >         and what the "definitive" range for a given property really is.
> >
> >         Identity of UK addresses is simply not something you can express in
> >         OWL or any logic close to it. You need an address reconciliation
> >         algorithm to map your address to a maintained identifier set such
> >         as UPRN or UDPRN. The reconciliation process will have error rates
> >         that you will need to manage and recover from; there's no closed,
> >         guaranteed algorithm.
> >
> >         Once you have the UPRN or UDPRN or whatever, you can create URIs or
> >         some inverse functional property as you wish. Except that even then
> >         the official identifier schemes like that aren't perfect and have
> >         ... oddities ... in them that can still mess you up.
> >
> >         Generating unique keys for resources based on hashing a few
> >         properties is all very well in simple cases but, at least in my
> >         experience, real world problems are nothing like that simple and
> >         clean. You need serious effort to create and maintain identifier
> >         schemes and to reconcile source data against those schemes. Details
> >         like URIs or bNodes seem to me rather down in the noise.
> >
> >         Dave
> >
> >          > On Mon, Dec 3, 2018 at 3:07 PM Nathan Rixham <nathan@webr3.org>
> >          > wrote:
> >          >
> >          >     Hugh, do you mean something like bnode.id =
> >          >     sha256(serialise(bnode))
> >          >
> >          >     On Mon, 3 Dec 2018, 22:58 Hugh Glaser <hugh@glasers.org>
> >          >     wrote:
> >          >
> >          >         This is not directly about blank nodes, but is a reply to a
> >          >         message in the thread.
> >          >
> >          >         I’m certainly agreeing that we should work towards common
> >          >         understanding of Thing equality.
> >          >         And addresses are a great place to start.
> >          >         In order for equality to be defined, I think that means you
> >          >         first need an idea of what an unambiguous address looks like.
> >          >
> >          >         Having an oracle that defines what an unambiguous Thing looks
> >          >         like is one organisational structure, and it would be great
> >          >         if schema.org could lead the way.
> >          >         It particularly helps people who just want an off the shelf
> >          >         solution, especially if they have no knowledge of the Thing
> >          >         domain.
> >          >
> >          >         However I (and perhaps David Booth) am after something more
> >          >         anarchic, that can function in a decentralised way (if I dare
> >          >         to use that term! :-) )
> >          >         For example, I might decide that I think that House Number
> >          >         and PostCode is enough.
> >          >         (UK people will know that this is a commonly-used way of
> >          >         choosing an address, although it may well not be satisfactory
> >          >         for some purposes, I’m sure.)
> >          >         That may well be sufficient for me to interwork with datasets
> >          >         from Companies House, the Land Registry and a bunch of other
> >          >         UK-based organisations, plus many other datasets.
> >          >
> >          >         Having a simple standard way to create keys for such things
> >          >         facilitates that, without any standardisation process and all
> >          >         that entails in weaknesses and strengths of trying to get
> >          >         agreement on what an unambiguous address might look like on a
> >          >         world scale for all purposes.
> >          >
> >          >         Just generating a URI, without needing to make any service
> >          >         calls (having found where they are and chosen the one you
> >          >         want and compromised on it, etc.) or anything seems to me a
> >          >         way of making all the interlinking so much more accessible
> >          >         for us all.
> >          >         It is even future proof:- using such a URI means that if it
> >          >         is about something new (UK postcodes change all the time :-(,
> >          >         and there are more dead ones than live ones), the oracle
> >          >         doesn’t tell me anything it didn’t have until I ask again.
> >          >         In a key-generating world, my new shiny key will slowly align
> >          >         with all the other key URIs as they get created.
> >          >
> >          >         So yeah, all strength to anyone who wants to take on the
> >          >         central roles, but not at the expense of killing the anarchic
> >          >         solution, please.
> >          >
> >          >         Cheers
> >          >
> >          >          > On 3 Dec 2018, at 22:10, Anthony Moretti
> >          >          > <anthony.moretti@gmail.com> wrote:
> >          >          >
> >          >          > Cheers for agreeing William. On the topic of incomplete
> >          >          > blank nodes, Henry, I'd give them another type; the partial
> >          >          > address example you give I'd give the type AddressComponent,
> >          >          > or something to that effect. I could be wrong, but it's not
> >          >          > a valid Address if it's a blank node and no other
> >          >          > information in the graph completes it.
> >          >          >
> >          >          > Anthony
> >          >          >
> >          >          > On Mon, Dec 3, 2018 at 1:56 PM William Waites
> >          >          > <wwaites@tardis.ed.ac.uk> wrote:
> >          >          > > standards like schema:PostalAddress should possibly define
> >          >          > > relevant operations like equality checking too.
> >          >          >
> >          >          > Exactly.
> >          >          >
> >          >          >
> >          >
> >
>
>
>
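
The hash-based key idea raised in the thread (Nathan's
bnode.id = sha256(serialise(bnode)) and Hugh's house number plus postcode)
could look roughly like the sketch below. It is only an illustration under
stated assumptions: the property names, the serialisation format, the postcode
normalisation and the urn:example: URI scheme are all hypothetical and were not
proposed by anyone in the thread. The last comparison shows the limitation Dave
describes: the same property described by house name instead of house number
produces a different key, so structural hashing alone cannot establish address
identity.

```python
import hashlib
import re


def normalise_postcode(postcode: str) -> str:
    """Upper-case a postcode and strip all whitespace (assumed normalisation)."""
    return re.sub(r"\s+", "", postcode).upper()


def serialise(properties: dict[str, str]) -> str:
    """Serialise properties deterministically: sorted keys, one per line."""
    return "\n".join(f"{key}={properties[key]}" for key in sorted(properties))


def address_key(properties: dict[str, str]) -> str:
    """Build a key URI from the SHA-256 digest of the serialised properties."""
    digest = hashlib.sha256(serialise(properties).encode("utf-8")).hexdigest()
    return f"urn:example:address:{digest}"  # hypothetical URI scheme


# Two publishers describing the same address with the same chosen properties.
a = address_key({"houseNumber": "10", "postcode": normalise_postcode("sw1a 2aa")})
b = address_key({"houseNumber": "10", "postcode": normalise_postcode("SW1A 2AA")})

# A third publisher who identifies the property by a (made-up) house name.
c = address_key({"houseName": "Rose Cottage", "postcode": normalise_postcode("SW1A 2AA")})

print(a == b)  # True: identical normalised properties give identical keys
print(a == c)  # False: a different description gives a different key
```

Keys generated this way align only among publishers who choose the same
properties and the same normalisation; anything beyond that still needs
reconciliation against a maintained identifier set.
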
Received on Tuesday, 4 December 2018 23:41:44 UTC
