- From: Frans Knibbe <frans.knibbe@geodan.nl>
- Date: Thu, 6 Dec 2018 15:01:30 +0100
- To: semantic-web@w3.org
- Message-ID: <CAFVDz43CMVZ1+LAz=Nuz8_xppbNoxuU5AfOJqpUC-Wdrg_kbUg@mail.gmail.com>
There has been some work done on a general way to express address data on
the web. As a part of the set of EU core vocabularies
<https://ec.europa.eu/isa2/solutions/core-vocabularies_en>, the Location
Core Vocabulary <https://www.w3.org/ns/locn> (locn for short) was developed
and published. It has a related community group, the Locations and
addresses community group <https://www.w3.org/community/locadd/> (locadd
for short). Modelling addresses is hard, so further discussions and
contributions are welcome.
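
To give an idea of what locn data looks like, here is a minimal sketch in Python that writes a locn:Address as Turtle. The property names (locatorDesignator, thoroughfare, postName, postCode) come from the locn namespace; the example.org URI, the address values and the two helper functions are invented for illustration only.

```python
# Sketch only: the subject URI and the address values below are invented.
LOCN = "http://www.w3.org/ns/locn#"


def turtle_literal(value: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def locn_address_turtle(subject_uri: str, components: dict) -> str:
    """Render a locn:Address as Turtle, one locn property per component."""
    lines = [f"@prefix locn: <{LOCN}> .", "", f"<{subject_uri}> a locn:Address ;"]
    props = [f"    locn:{name} {turtle_literal(value)}" for name, value in components.items()]
    lines.append(" ;\n".join(props) + " .")
    return "\n".join(lines)


example = locn_address_turtle(
    "http://example.org/address/1",  # hypothetical URI
    {
        "locatorDesignator": "10",
        "thoroughfare": "Example Street",
        "postName": "Exampleton",
        "postCode": "AB1 2CD",
    },
)
print(example)
```

This only covers the representation side; it says nothing about whether two such descriptions denote the same address.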
Regards,
Frans
On Wed, 5 Dec 2018 at 03:02, Thomas Passin <tpassin@tompassin.net> wrote:
> When I think of modeling addresses, after reading some of these posts
> and links (and not having had to do this for a living), I would say the
> simplest model would be this, which seems pretty close to what Joshua said:
>
> An address
>     can be represented by one or more representations;
>     denotes a (physical) location  # maybe one or more?
>     may have one or more textual aliases.
>
> A representation  # has one specific syntactical form
>     may have a grammar specification  # e.g. B-N.
>
> A "representation" of an address is one of the many textual forms that
> one finds in the wild. You need some non-rdf processing to relate each
> type of representation you need to handle to its location. It might
> turn out that each type of address could be expressed as a grammar (in
> B-N or some other notation) or at least by some syntax rules. If so,
> that notation type could be included as a property of the address instance.
>
> For some other time: fuzzy addresses like "on 5th Ave. between 72nd and
> 73rd streets".
>
> TomP
>
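
The model Thomas outlines above could be sketched as plain Python dataclasses. This is only one possible reading: the class and field names are invented, the location is reduced to a lat/lon pair for illustration, and the open question "maybe one or more?" is answered here with exactly one location.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Representation:
    """One textual form of an address, in one specific syntactical form."""
    text: str
    grammar: Optional[str] = None  # optional name of a grammar specification (hypothetical field)


@dataclass
class Location:
    """The physical place an address denotes; lat/lon is an assumption of this sketch."""
    latitude: float
    longitude: float


@dataclass
class Address:
    representations: List[Representation]
    location: Location  # assumed to be exactly one here
    aliases: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # "can be represented by one or more representations"
        if not self.representations:
            raise ValueError("an address needs at least one representation")


address = Address(
    representations=[Representation("10 Example Street, Exampleton")],
    location=Location(52.0, 5.0),  # invented coordinates
    aliases=["the example house"],
)
```

Relating each representation to its location still needs the non-RDF processing mentioned above; the sketch only records the result.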
> On 12/4/2018 8:08 PM, Joshua Shinavier wrote:
> > Just to add another data point to the "addresses are hard" thread, at
> > Uber we have also invested quite some time into standardizing vocabulary
> > around addresses. Prior to standardization, there were many dozens of
> > address types in use within the company (and still are), most of which
> > are of the basic street/city/state/country/zip kind, similar to
> > schema.org's PostalAddress. After a great deal of
> > discussion, we opted not to support such a format as a standard. Most of
> > the reasons for this boil down to items on the page Thomas linked.
> > Instead, we distinguish between structured addresses (a bag of
> > components which validate against any of a number of black-box address
> > schemas) and addresses for display. Google makes a similar distinction
> > in its Places API. Address validation, formatting, normalization, etc.
> > are API concerns that go well beyond the vocabulary itself, requiring
> > significant background knowledge. I would not be optimistic about
> > finding canonical identifiers for addresses, though geocoded lat/lon is
> > probably the next best thing.
> >
> > Josh
> >
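
The distinction Joshua describes, structured addresses versus addresses for display, might look roughly like this in Python. It is a sketch under assumptions, not Uber's or Google's actual design: all names are hypothetical, and the "black-box" schemas are reduced to toy functions that only check that certain components are present.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Components = Dict[str, str]
# A schema is treated as a black box: it only answers "do these components validate?"
Schema = Callable[[Components], bool]


def requires(*names: str) -> Schema:
    """Build a toy schema that only checks that the named components are non-empty."""
    def check(components: Components) -> bool:
        return all(components.get(name) for name in names)
    return check


@dataclass
class StructuredAddress:
    """A bag of components, valid if any one of the known schemas accepts it."""
    components: Components

    def validates_against_any(self, schemas: List[Schema]) -> bool:
        return any(schema(self.components) for schema in schemas)


@dataclass
class DisplayAddress:
    """An address meant for human eyes only; no structure is implied."""
    lines: List[str]

    def render(self) -> str:
        return "\n".join(self.lines)


schemas = [
    requires("street", "city", "state", "zip"),
    requires("building", "district", "city"),
]
structured = StructuredAddress(
    {"street": "1 Example St", "city": "Exampleville", "state": "EX", "zip": "00000"}
)
display = DisplayAddress(["1 Example St", "Exampleville, EX 00000"])
```

Validation, formatting and normalization would live behind the schema functions, which is where the background knowledge mentioned above comes in.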
> >
> > On Tue, Dec 4, 2018 at 4:16 PM Dave Reynolds <dave.e.reynolds@gmail.com> wrote:
> >
> > Hi Hugh,
> >
> > On 04/12/2018 22:48, Hugh Glaser wrote:
> > > Thanks Dave.
> > > Yes, I agree with all the detail.
> > >
> > > My interpretation is that you are confirming what I was saying -
> > > that the general case is a nightmare.
> >
> > On that we are agreed :)
> >
> > > This is a problem of trying for a standard for the addresses -
> > > not only is it fiendishly complicated, but no standard will ever
> > > satisfy all the reasons you might want to identify something, such
> > > as an address.
> > > I agree, which is why I was negative about trying to capture it
> > > centrally.
> > > On the other hand, SW people *are* representing addresses all the
> > > time, using sufficient specificity for their purposes.
> > > And others will be doing the same thing to the same level.
> >
> > Sure, *representing* addresses is just fine. It's *identifying*
> > addresses that's hard.
> >
> > > And businesses in the UK find that the number/postcode pair is
> > > pretty much all they need to deliver almost all online purchases.
> >
> > If you are only dealing with consumers, not other businesses, and
> > mostly focus on houses in urban areas, and don't care about secondary
> > addresses (saons - like flat number, unit number, floor etc), and if
> > you only care about delivery (so there's a human at the other end
> > interpreting the address) and if we can agree to differ on the
> > semantics of "almost all" then that's possibly true.
> >
> > However, many businesses, even under those constraints, solve it by
> > getting a human (the one placing an order) to do the matching. You use
> > number/postcode to constrain and order the search on your (very
> > expensive) master address list and get the user to pick the right one
> > from the result list. *Then* you have an identifier.
> >
> > > It seems to me that you are concerned with the "global" solution -
> >
> > No, simply pointing out that matching real world entities is hard for
> > domain specific reasons and no amount of RDF/OWL makes much difference
> > to that.
> >
> > Actually, all I was really doing was sharing painfully gathered
> > experience that in the UK, postcode + number is far from a nearly
> > unique key for all addresses. Trust me on this. I've sacrificed a
> > large part of the last three months to learning this lesson in great
> > detail :(
> >
> > > I want to worry about a more local problem, and what small steps
> > > can be taken to help people in common cases, so that SW & LD are
> > > more useful for developers.
> >
> > I've lost track of how this thread about thing equality relates to the
> > goal of making SW/LD/RDF easier. Which is why I opened with "I don't
> > want to get embroiled in the main thread(s)" and just commented on the
> > nature of addresses.
> >
> > [While URIs can be off-putting I don't think they are *that* much of a
> > problem for developers. Even where they are a barrier it's the choice
> > of namespace that's the challenge ("you mean we have to host a DNS
> > domain and maintain it?"). In my experience most developers are very
> > happy with the notion that some domains have "natural" composite keys
> > that you can use to identify things and some domains you have to do
> > work to create some (often human) process to manage your reference
> > identifiers and then use those as keys. Once you have your keys, one
> > way or another, then creating identifiers by combining some sort of
> > namespace with an encoding/hash of the composite keys is bread and
> > butter stuff, even outside of SW/LD.]
> >
> > Dave
> >
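
The "bread and butter" step Dave mentions, combining a namespace with a hash of the composite key, can be sketched in a few lines of Python. The namespace URI, the normalisation rule and the function names are assumptions made for this example; as the thread makes clear, the hard part is deciding what the key is, not hashing it.

```python
import hashlib


def normalise(part: str) -> str:
    """Collapse runs of whitespace and upper-case a key part (an assumed, simplistic rule)."""
    return " ".join(part.split()).upper()


def key_uri(namespace: str, *key_parts: str) -> str:
    """Build an identifier from a namespace plus a SHA-256 hash of the composite key."""
    # Join with a separator that cannot occur in normalised parts, so ("1", "23")
    # and ("12", "3") do not collide.
    canonical = "\x1f".join(normalise(p) for p in key_parts)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return namespace + digest


NAMESPACE = "http://example.org/id/address/"  # hypothetical namespace

uri_a = key_uri(NAMESPACE, "10", "ab1  2cd")
uri_b = key_uri(NAMESPACE, "10", "AB1 2CD")
uri_c = key_uri(NAMESPACE, "11", "AB1 2CD")
```

Here uri_a and uri_b are identical because the two spellings normalise to the same key, while uri_c differs. Two parties only arrive at the same URI if they agree on the namespace, the key parts and the normalisation, and the key is only as unique as the domain allows.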
> >
> > > Or are you saying that because specifying addresses as well as
> > > you would like is so hard, we shouldn't bother trying to do
> > > something simpler and useful for many purposes?
> >
> > > It is about URIs, and they aren't in the noise - they are the
> > > things that people currently generate for themselves, and get little
> > > or no help with that generation, or linking up.
> > >
> > >> On 4 Dec 2018, at 11:24, Dave Reynolds <dave.e.reynolds@gmail.com> wrote:
> > >>
> > >> I don't want to get embroiled in the main thread(s) but, just in
> > >> case anyone is *really* dealing with UK addresses rather than using
> > >> them as rhetorical examples, then ...
> > >>
> > >> On 03/12/2018 23:37, Anthony Moretti wrote:
> > >>> I see your point Hugh, especially in your case because for UK
> > >>> addresses consisting of only house number and postcode structural
> > >>> equality is sufficient for address equality. Decentralized will work
> > >>> very well in that case.
> > >>
> > >> Sadly that's a long way from being true. UK addresses within a
> > >> postcode may be identified by house name, house name + number,
> > >> business name (with no house name or number at all), any of those
> > >> plus a secondary address etc etc. Even when there's a house "number"
> > >> sometimes it's actually a number range not a single number and
> > >> there's considerable ambiguity on how those ranges are expressed and
> > >> what the "definitive" range for a given property really is.
> > >>
> > >> Identity of UK addresses is simply not something you can express
> > >> in OWL or any logic close to it. You need an address reconciliation
> > >> algorithm to map your address to a maintained identifier set such
> > >> as a UPRN or UDPRN. The reconciliation process will have error rates
> > >> that you will need to manage and recover from; there's no closed,
> > >> guaranteed algorithm.
> > >>
> > >> Once you have the UPRN or UDPRN or whatever you can create URIs
> > >> or some inverse functional property as you wish. Except that even
> > >> then the official identifier schemes like that aren't perfect and
> > >> have ... oddities ... in them that can still mess you up.
> > >>
> > >> Generating unique keys for resources based on hashing a few
> > >> properties is all very well in simple cases but, at least in my
> > >> experience, real world problems are nothing like that simple or
> > >> clean. You need serious effort to create and maintain identifier
> > >> schemes and to reconcile source data against those schemes. Details
> > >> like URIs or bNodes seem to me rather down in the noise.
> > >>
> > >> Dave
> > >>
> > >>> On Mon, Dec 3, 2018 at 3:07 PM Nathan Rixham <nathan@webr3.org> wrote:
> > >>> Hugh, do you mean something like
> > >>> bnode.id = sha256(serialise(bnode))
> > >>> On Mon, 3 Dec 2018, 22:58 Hugh Glaser <hugh@glasers.org> wrote:
> > >>> This is not directly about blank nodes, but is a reply to a
> > >>> message in the thread.
> > >>> I’m certainly agreeing that we should work towards common
> > >>> understanding of Thing equality.
> > >>> And addresses are a great place to start.
> > >>> In order for equality to be defined, I think that means you
> > >>> first need an idea of what an unambiguous address looks like.
> > >>> Having an oracle that defines what an unambiguous Thing looks
> > >>> like is one organisational structure, and it would be great if
> > >>> schema.org could lead the way.
> > >>> It particularly helps people who just want an off the shelf
> > >>> solution, especially if they have no knowledge of the Thing domain.
> > >>> However I (and perhaps David Booth) am after something more
> > >>> anarchic, that can function in a decentralised way (if I dare to
> > >>> use that term! :-) )
> > >>> For example, I might decide that I think that House Number and
> > >>> PostCode is enough.
> > >>> (UK people will know that this is a commonly-used way of
> > >>> choosing an address, although it may well not be satisfactory
> > >>> for some purposes, I’m sure.)
> > >>> That may well be sufficient for me to interwork with datasets
> > >>> from Companies House, the Land Registry and a bunch of other
> > >>> UK-based organisations, plus many other datasets.
> > >>> Having a simple standard way to create keys for such things
> > >>> facilitates that, without any standardisation process and all
> > >>> that entails in weaknesses and strengths of trying to get
> > >>> agreement on what an unambiguous address might look like on a
> > >>> world scale for all purposes.
> > >>> Just generating a URI, without needing to make any service calls
> > >>> (having found where they are and chosen the one you want and
> > >>> compromised on it, etc.) or anything seems to me a way of making
> > >>> all the interlinking so much more accessible for us all.
> > >>> It is even future proof:- using such a URI means that if it is
> > >>> about something new (UK postcodes change all the time :-(, and
> > >>> there are more dead ones than live ones), the oracle doesn’t
> > >>> tell me anything it didn’t have until I ask again.
> > >>> In a key-generating world, my new shiny key will slowly align
> > >>> with all the other key URIs as they get created.
> > >>> So yeah, all strength to anyone who wants to take on the central
> > >>> roles, but not at the expense of killing the anarchic solution,
> > >>> please.
> > >>> Cheers
> > >>> > On 3 Dec 2018, at 22:10, Anthony Moretti <anthony.moretti@gmail.com> wrote:
> > >>> >
> > >>> > Cheers for agreeing William. On the topic of incomplete blank
> > >>> > nodes Henry I'd give them another type, the partial address
> > >>> > example you give I'd give the type AddressComponent, or
> > >>> > something to that effect. I could be wrong, but it's not a valid
> > >>> > Address if it's a blank node and no other information in the
> > >>> > graph completes it.
> > >>> >
> > >>> > Anthony
> > >>> >
> > >>> > On Mon, Dec 3, 2018 at 1:56 PM William Waites <wwaites@tardis.ed.ac.uk> wrote:
> > >>> > > standards like schema:PostalAddress should possibly define
> > >>> > > relevant operations like equality checking too.
> > >>> >
> > >>> > Exactly.
> > >>> >
Received on Thursday, 6 December 2018 14:02:10 UTC