Re: Addresses have no easy identity was Re: Blank Nodes Re: Toward easier RDF: a proposal

You've got to admit that this conversation about addresses has been
valuable, though; blank nodes were the original topic of the thread, after
all. A lot of developers before Hugh have tried to do simple address
matching, myself included, back when I was trying to create a food delivery
startup. It's been good to talk this through in the context of RDF, and I
appreciate everybody's comments.

Anthony

On Tue, Dec 4, 2018 at 1:31 PM Dan Brickley <danbri@danbri.org> wrote:

>
> ("Details like URIs or bNodes seem to me rather down in the noise.")
>
> Thanks, Dave. This chimes with a lot of our experience at Google using
> Schema.org data (roughly RDF triples) from the Web, fwiw.
>
> Dan
>
>
> On Tue, 4 Dec 2018, 03:30 Dave Reynolds <dave.e.reynolds@gmail.com> wrote:
>
>> I don't want to get embroiled in the main thread(s) but, just in case
>> anyone is *really* dealing with UK addresses rather than using them as
>> rhetorical examples, then ...
>>
>> On 03/12/2018 23:37, Anthony Moretti wrote:
>> > I see your point Hugh, especially in your case because for UK addresses
>> > consisting of only house number and postcode structural equality is
>> > sufficient for address equality. Decentralized will work very well in
>> > that case.
>>
>> Sadly that's a long way from being true. UK addresses within a postcode
>> may be identified by house name, house name + number, business name (with
>> no house name or number at all), any of those plus a secondary address
>> etc etc. Even when there's a house "number" sometimes it's actually a
>> number range, not a single number, and there's considerable ambiguity on
>> how those ranges are expressed and what the "definitive" range for a
>> given property really is.
>>
>> Identity of UK addresses is simply not something you can express in OWL
>> or any logic close to it. You need an address reconciliation algorithm
>> to map your address to a maintained identifier set such as a UPRN or
>> UDPRN. The reconciliation process will have error rates that you will
>> need to manage and recover from; there's no closed, guaranteed algorithm.
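
To make the shape of that reconciliation step concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the reference set, the "ID-000x" identifiers (these are not real UPRNs or UDPRNs), the example addresses, the normalise/reconcile function names and the 0.85 threshold are all made up, and plain string similarity from the standard library stands in for what a real address matcher would do. The point it shows is that the result is a best guess with a score, and that "no confident match" has to be a possible outcome.

```python
from difflib import SequenceMatcher

# Hypothetical reference set: made-up identifiers mapped to made-up addresses.
REFERENCE = {
    "ID-0001": "ROSE COTTAGE, 1 HIGH STREET, AB1 2CD",
    "ID-0002": "1-3 HIGH STREET, AB1 2CD",
    "ID-0003": "ACME LTD, STATION ROAD, AB1 2CE",
}

def normalise(address):
    """Upper-case, drop commas and collapse runs of whitespace."""
    return " ".join(address.upper().replace(",", " ").split())

def reconcile(address, reference=REFERENCE, threshold=0.85):
    """Return (identifier, score) for the best match, or (None, score)
    when the best score is below the threshold (an assumed cut-off)."""
    target = normalise(address)
    best_id, best_score = None, 0.0
    for ident, ref in reference.items():
        score = SequenceMatcher(None, target, normalise(ref)).ratio()
        if score > best_score:
            best_id, best_score = ident, score
    if best_score < threshold:
        return None, best_score  # unmatched: needs manual review or another source
    return best_id, best_score
```

Calling reconcile("rose cottage 1 high street ab1 2cd") picks ID-0001, while an address that resembles nothing in the reference set comes back with None. Anything near the threshold is where the error rates Dave mentions show up, so the score is returned rather than hidden.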
>>
>> Once you have the UPRN or UDPRN or whatever you can create URIs or some
>> inverse functional property as you wish. Except that even then the
>> official identifier schemes like that aren't perfect and have ...
>> oddities ... in them that can still mess you up.
>>
>> Generating unique keys for resources based on hashing a few properties
>> is all very well in simple cases but, at least in my experience, real
>> world problems are nothing like that simple and clean. You need serious
>> effort to create and maintain identifier schemes and to reconcile source
>> data against those schemes. Details like URIs or bNodes seem to me
>> rather down in the noise.
>>
>> Dave
>>
>> > On Mon, Dec 3, 2018 at 3:07 PM Nathan Rixham <nathan@webr3.org
>> > <mailto:nathan@webr3.org>> wrote:
>> >
>> >     Hugh, do you mean something like bnode.id =
>> >     sha256(serialise(bnode))?
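
For anyone who wants to try Nathan's one-liner, a minimal sketch in Python follows. It assumes the blank node is just a dict of property/value pairs (Hugh's house number and postcode), that normalising means stripping whitespace and upper-casing, and that the key is minted under a urn:example: prefix. The serialise and bnode_key names, the property names and the postcode are hypothetical and do not come from any RDF library.

```python
import hashlib
import json

def serialise(bnode):
    """Deterministically serialise a blank node's properties.

    Assumption: values are compared after removing all whitespace and
    upper-casing, so "ab1 2cd" and "AB1 2CD" produce the same text.
    """
    normalised = {k: "".join(str(v).split()).upper() for k, v in bnode.items()}
    # sort_keys makes the output independent of property order.
    return json.dumps(normalised, sort_keys=True, separators=(",", ":"))

def bnode_key(bnode):
    """Mint a key URI from the SHA-256 of the serialised node (hypothetical scheme)."""
    digest = hashlib.sha256(serialise(bnode).encode("utf-8")).hexdigest()
    return f"urn:example:address:{digest}"
```

Two parties who agree on the same properties and the same normalisation get the same URI without calling any service, e.g. bnode_key({"houseNumber": "12", "postCode": "AB1 2CD"}) equals bnode_key({"postCode": "ab1 2cd", "houseNumber": 12}). It says nothing about whether those properties really identify one address, which is the problem Dave describes above.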
>> >
>> >     On Mon, 3 Dec 2018, 22:58 Hugh Glaser <hugh@glasers.org
>> >     <mailto:hugh@glasers.org>> wrote:
>> >
>> >         This is not directly about blank nodes, but is a reply to a
>> >         message in the thread.
>> >
>> >         I’m certainly agreeing that we should work towards common
>> >         understanding of Thing equality.
>> >         And addresses are a great place to start.
>> >         In order for equality to be defined, I think that means you
>> >         first need an idea of what an unambiguous address looks like.
>> >
>> >         Having an oracle that defines what an unambiguous Thing looks
>> >         like is one organisational structure, and it would be great if
>> >         schema.org <http://schema.org> could lead the way.
>> >         It particularly helps people who just want an off-the-shelf
>> >         solution, especially if they have no knowledge of the Thing
>> >         domain.
>> >
>> >         However I (and perhaps David Booth) am after something more
>> >         anarchic, that can function in a decentralised way (if I dare to
>> >         use that term! :-) )
>> >         For example, I might decide that I think that House Number and
>> >         PostCode is enough.
>> >         (UK people will know that this is a commonly-used way of
>> >         choosing an address, although it may well not be satisfactory
>> >         for some purposes, I’m sure.)
>> >         That may well be sufficient for me to interwork with datasets
>> >         from Companies House, the Land Registry and a bunch of other
>> >         UK-based organisations, plus many other datasets.
>> >
>> >         Having a simple standard way to create keys for such things
>> >         facilitates that, without any standardisation process and all
>> >         that entails in weaknesses and strengths of trying to get
>> >         agreement on what an unambiguous address might look like on a
>> >         world scale for all purposes.
>> >
>> >         Just generating a URI, without needing to make any service calls
>> >         (having found where they are and chosen the one you want and
>> >         compromised on it, etc.) or anything seems to me a way of making
>> >         all the interlinking so much more accessible for us all.
>> >         It is even future proof:- using such a URI means that if it is
>> >         about something new (UK postcodes change all the time :-(, and
>> >         there are more dead ones than live ones), the oracle doesn’t
>> >         tell me anything it didn’t have until I ask again.
>> >         In a key-generating world, my new shiny key will slowly align
>> >         with all the other key URIs as they get created.
>> >
>> >         So yeah, all strength to anyone who wants to take on the central
>> >         roles, but not at the expense of killing the anarchic solution,
>> >         please.
>> >
>> >         Cheers
>> >
>> >          > On 3 Dec 2018, at 22:10, Anthony Moretti
>> >         <anthony.moretti@gmail.com <mailto:anthony.moretti@gmail.com>>
>> >         wrote:
>> >          >
>> >          > Cheers for agreeing, William. On the topic of incomplete blank
>> >         nodes, Henry, I'd give them another type: the partial address
>> >         example you give I'd type as AddressComponent, or
>> >         something to that effect. I could be wrong, but it's not a valid
>> >         Address if it's a blank node and no other information in the
>> >         graph completes it.
>> >          >
>> >          > Anthony
>> >          >
>> >          > On Mon, Dec 3, 2018 at 1:56 PM William Waites
>> >         <wwaites@tardis.ed.ac.uk <mailto:wwaites@tardis.ed.ac.uk>>
>> >         wrote:
>> >          > > standards like schema:PostalAddress should possibly define
>> >         relevant
>> >          > > operations like equality checking too.
>> >          >
>> >          > Exactly.
>> >          >
>> >          >
>> >
>>
>>

Received on Tuesday, 4 December 2018 22:22:52 UTC