
Re: Deprecating owl:sameAs

From: Paul Houle <ontology2@gmail.com>
Date: Fri, 1 Apr 2016 09:58:21 -0400
Message-ID: <CAE__kdSmgPY7YveKfB3=kTqFQNoB9FmKYXW_M8H+GwzRcjTVXQ@mail.gmail.com>
To: Barry Norton <barrynorton@gmail.com>
Cc: Sarven Capadisli <info@csarven.ca>, Linking Open Data <public-lod@w3.org>, SW-forum <semantic-web@w3.org>
It is not about stopping building naive applications, it is about starting
to build smart applications.

Trust and provenance will only get you so far.  It can easily be the royal
road to becoming very good at seeing the Emperor's clothes.  Even
authoritative sources often have singularities or mismatches that break
the basic invariants you'd expect to hold.

Decades ago my friends and I were reading the CIA World Factbook on a
Friday night and thinking how profound it was that there was a $100 billion
excess of global "exports" over "imports" and perhaps we'd stumbled on
evidence of extraterrestrial life or perhaps a secret civilization hidden

Eventually we figured it was that some of the exports wind up on the ocean
floor, wash up on shore, or get stuck for centuries in gyres.  Also there
is a ratchet effect: pirates, government officials and other thieves
are more likely to remove valuable exports from ships and warehouses than
to deposit them, and so forth.

Now the accountants have gone through 15 years of blood, sweat and tears to
get XBRL financial reports that are logically sound 99% of the time for
U.S. public companies.  If you are preparing financial reports for the
state, the bank, investors, etc., it is a problem when these invariants
are not met.

Structurally, all kinds of demographic and similar numbers can be hypercubed
like XBRL but, for a whole bunch of reasons, they will defy reason and never
quite "add up" when you compare multiple sources.  (I can point to a census
block where 200 people did not get counted because I didn't count them;
the World Bank numbers for Nigeria are implausible for many reasons, etc.)

As Reagan put it, "Trust, but verify," and that is the essence of being a
reasonable animal.

Compare your input data with itself,  against its requirements,  against
the experience of the system and its users and you will find your
(system's) truth.
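
To make "compare your input data with itself" concrete, here is a rough
sketch in Python.  Everything in it is hypothetical: the TradeRecord type,
the function names, the tolerances and the figures are made up for
illustration and are not taken from any real dataset or library.  It shows
two checks, one internal invariant (world exports should roughly equal
world imports) and one comparison of two sources that report the same
quantities.

```python
from dataclasses import dataclass


@dataclass
class TradeRecord:
    # Hypothetical record; values are illustrative, in billions of USD.
    country: str
    exports: float
    imports: float


def global_trade_gap(records):
    """Return total exports minus total imports across all records."""
    total_exports = sum(r.exports for r in records)
    total_imports = sum(r.imports for r in records)
    return total_exports - total_imports


def check_balance(records, tolerance):
    """Self-consistency check: the global gap should be within tolerance.

    Returns (ok, gap) so the caller can see how far off the data is.
    """
    gap = global_trade_gap(records)
    return abs(gap) <= tolerance, gap


def disagreements(source_a, source_b, rel_tolerance=0.05):
    """Compare two sources keyed by the same identifiers.

    Returns the keys present in both sources whose values differ by more
    than rel_tolerance, relative to the larger of the two magnitudes.
    """
    flagged = {}
    for key in source_a.keys() & source_b.keys():
        a, b = source_a[key], source_b[key]
        scale = max(abs(a), abs(b))
        if scale and abs(a - b) / scale > rel_tolerance:
            flagged[key] = (a, b)
    return flagged


if __name__ == "__main__":
    records = [TradeRecord("A", 300.0, 150.0), TradeRecord("B", 200.0, 250.0)]
    print(check_balance(records, tolerance=10.0))
    print(disagreements({"x": 100.0, "y": 200.0}, {"x": 101.0, "y": 260.0}))
```

A failed check does not tell you which source is wrong, only that the
numbers cannot all be right at once, which is the point where a human or a
smarter application has to look closer.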

On Fri, Apr 1, 2016 at 9:16 AM, Barry Norton <barrynorton@gmail.com> wrote:

> Or we could stop building naive applications that treat assertion as fact,
> and instead only reason on statements we accept based on trust and
> provenance. Wasn't that the plan?
> Regards,
> Barry
> On Fri, Apr 1, 2016 at 2:01 PM, Sarven Capadisli <info@csarven.ca> wrote:
>> There is overwhelming research [1, 2, 3] and I think it is evident at
>> this point that owl:sameAs is used inarticulately in the LOD cloud.
>> The research that I've done makes me conclude that we need to do a
>> massive sweep of the LOD cloud and adopt owl:sameSameButDifferent.
>> I think the terminology is human-friendly enough that there will be
>> minimal confusion down the line, but for the pedants among us, we can
>> define it along the lines of:
>> The built-in OWL property owl:sameSameButDifferent links things to
>> things. Such an owl:sameSameButDifferent statement indicates that two URI
>> references actually refer to the same thing but may be different under some
>> circumstances.
>> Thoughts?
>> [1] https://www.w3.org/2009/12/rdf-ws/papers/ws21
>> [2] http://www.bbc.co.uk/ontologies/coreconcepts#terms_sameAs
>> [3] http://schema.org/sameAs
>> -Sarven
>> http://csarven.ca/#i

Paul Houle

*Applying Schemas for Natural Language Processing, Distributed Systems,
Classification and Text Mining and Data Lakes*

(607) 539 6254    paul.houle on Skype   ontology2@gmail.com

:BaseKB -- Query Freebase Data With SPARQL

Legal Entity Identifier Lookup

Join our Data Lakes group on LinkedIn
Received on Friday, 1 April 2016 13:58:50 UTC
