Twice married to the same person problem, was [Re: owl:sameAs/referential opacity]

From: Jerven Tjalling Bolleman <Jerven.Bolleman@sib.swiss>
Date: Fri, 30 Oct 2020 09:23:07 +0100
To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Cc: thomas lörtsch <tl@rat.io>, public-rdf-star@w3.org
Message-ID: <15d039cbd31df1a45974f097f3826ea6@imap.sib.swiss>
Hi All,

I have been thinking a bit about the problem that referential opacity
tries to solve, and whether the same kind of problem does not show up
even when owl:sameAs is not in play.

So let's descend into celebrity gossip land, sorry!

<< :pam :marriedTo :rick >> :begin "2007"^^xsd:gYear ;
                           :end "2008"^^xsd:gYear.

<< :pam :marriedTo :rick >> :begin "2014"^^xsd:gYear ;
                           :end "2015"^^xsd:gYear.

Here we use the same references (IRIs) and still end up with an analogue
to the Superman problem.
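
To make the collapse concrete, here is a minimal Python sketch. It assumes
that an embedded triple is identified purely by its subject, predicate and
object, as discussed further down in this thread. The `annotate` helper and
the dictionary-based store are hypothetical illustrations, not the API of
any triple store.

```python
from collections import defaultdict

# Hypothetical in-memory store: annotations are keyed by the embedded
# triple itself, i.e. by its (subject, predicate, object) tuple.
annotations = defaultdict(set)

def annotate(triple, prop, value):
    """Attach one annotation (prop, value) to an embedded triple."""
    annotations[triple].add((prop, value))

marriage = (":pam", ":marriedTo", ":rick")

# First marriage
annotate(marriage, ":begin", "2007")
annotate(marriage, ":end", "2008")
# Second marriage: same subject, predicate and object, hence the same key
annotate(marriage, ":begin", "2014")
annotate(marriage, ":end", "2015")

begins = sorted(v for p, v in annotations[marriage] if p == ":begin")
ends = sorted(v for p, v in annotations[marriage] if p == ":end")

# All four values hang off a single node; nothing records which
# :begin belongs with which :end.
print(len(annotations), begins, ends)
# -> 1 ['2007', '2014'] ['2008', '2015']
```

Under that assumption a consumer cannot tell whether the marriage that began
in 2007 ended in 2008 or in 2015.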

This is something that comes up in UniProt as well.

<< :protein :disease :badForYou >> :evidence :good ;
                                    :paper :X .

<< :protein :disease :badForYou >> :evidence :flaky ;
                                    :paper :Y .

<< :protein :disease :badForYou >> :evidence :negative ;
                                    :paper :Z .

Here too we need to deal with an extra level (which we already do with
our current reification approach).

So I was interested in how current property graphs solve this.
And basically they have to separate out the :clarkKent and the :superMan
appearances as well to be able to avoid the problem.

Here we actually have a new strength for triple-star stores: unlike PGs,
we can use nodes and edges as properties on our edges, which they can't.


PS. We would do something like this in UniProt:

<< :protein :disease :badForYou >> :attribution
    [ :evidence :good ;     :paper :X ] ,
    [ :evidence :flaky ;    :paper :Y ] ,
    [ :evidence :negative ; :paper :Z ] .
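
As a rough sketch of why the extra level helps, the following Python models
the attribution pattern with one fresh intermediate node per evidence/paper
pair, using the three pairs from the UniProt example earlier in this message.
The `fresh_node`, `attribute` and `attributions` helpers are hypothetical and
only stand in for what blank nodes would do in a real store.

```python
from itertools import count

_ids = count(1)

def fresh_node():
    """Return a new blank-node-like label (hypothetical helper)."""
    return f"_:b{next(_ids)}"

triples = set()  # plain (subject, predicate, object) tuples

def attribute(statement, evidence, paper):
    """Link a statement to a fresh attribution node carrying its details."""
    node = fresh_node()
    triples.add((statement, ":attribution", node))
    triples.add((node, ":evidence", evidence))
    triples.add((node, ":paper", paper))
    return node

claim = (":protein", ":disease", ":badForYou")
attribute(claim, ":good", ":X")
attribute(claim, ":flaky", ":Y")
attribute(claim, ":negative", ":Z")

def attributions(statement):
    """Return sorted (evidence, paper) pairs, one per attribution node."""
    nodes = [o for s, p, o in triples if s == statement and p == ":attribution"]
    pairs = []
    for n in nodes:
        props = {p: o for s, p, o in triples if s == n}
        pairs.append((props[":evidence"], props[":paper"]))
    return sorted(pairs)

print(attributions(claim))
# -> [(':flaky', ':Y'), (':good', ':X'), (':negative', ':Z')]
```

Because each pair lives on its own node, the evidence level and the paper
stay together even though the annotated triple is the same in all three cases.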

On 2020-10-29 19:27, Pierre-Antoine Champin wrote:
> On 29/10/2020 13:14, thomas lörtsch wrote:
>> On 29. Oct 2020, at 10:07, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>> On 29/10/2020 01:14, thomas lörtsch wrote:
>> Pierre-Antoine,
>> I sympathize with the goal of referential opacity but it seems like
>> you have to work hard against the mechanics of RDF to achieve
>> something within RDF that RDF by its very nature does not only not
>> support but rather aims to get past.
>> I wouldn't go as far as stating that RDF aims to get past referential
>> opacity. For example, Notation 3 has managed since the early days to
>> extend RDF with referential opacity. But that's another topic :)
>> Another question is if the need for referential opacity isn’t
>> rather special and therefore could better be realized through a
>> special mechanism rather than as the guiding design principle.
>> Provenance is only one use case for statement annotation and even
>> then the demands are usually not as extreme as your semantics try to
>> support.
>> Granted. I created a strawpoll to evaluate how much this feature is
>> required.
>> https://github.com/w3c/rdf-star/issues/22
>> The argument that it’s easier to add functionality than to take it
>> away later sounds good and true but maybe it’s not when the system
>> within which you are working - RDF - is already based on another
>> paradigm.
>> Why not go another route and document the original statement as a
>> string
>> Because asking something like "Does Alice say anything about Paris's
>> population" would become cumbersome. But apart from that, of course,
>> this is a way to do it in standard RDF.
> Could be done as an additional annotation (e.g. like in line 3), added
> to a statement that is otherwise available to entailment:
> << dbr:Paris dbo:populationTotal 2229621 >>
>     :assertedBy <http://dbpedia.org> ;
>     :sourceText "dbr:Paris dbo:populationTotal 2229621" .
> << geo:2229621 gn:population 2138551 >> :assertedBy <http://geonames.org> .
> dbr:Paris owl:sameAs geo:2229621 .
> << geo:2229621 dbo:populationTotal 2229621 >> :assertedBy <http://dbpedia.org> .
> << dbr:Paris gn:population 2138551 >> :assertedBy <http://geonames.org> .
> Maybe that would suffice?
> Yes, that could be a reasonable trade-off I guess.
> Note that embedded triples are uniquely defined by their subject,
> predicate and object, so that should rather be:
> << dbr:Paris dbo:populationTotal 2229621 >>
>     :asserted [
>        :by <http://dbpedia.org> ;
>        :sourceText "dbr:Paris dbo:populationTotal 2229621"
>     ] .
> So that several different assertions of the same triple could be
> captured. But that's another issue.
>> - withdrawn from all greedy reasoners - when the need arises and in
>> all other cases just live with the unavoidable unhelpful entailment
>> now and then. That’s actually not my idea but yours, below ;-)
>> On 28. Oct 2020, at 19:31, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>> (...)
>> RDF(S) semantics makes no distinction between "stated triples" and
>> "inferred triples". So unless we change the semantics of RDF (!),
>> !!
> Yes, I wrote that, and you seem to imply that I am contradicting
> myself,
> No (although, cheekily, yes ;-), it’s just what got me thinking!
>> but I don't think I am ;-)
>> RDF(S) semantics knows nothing about "embedded triples", which are
>> neither "stated" (I should probably have written "asserted") nor
>> "inferred". So it is up to us to decide how this new kind of triples
>> should be handled. This is what this whole discussion is about.
> I don’t mean to contradict. It rather seems to me that the topic has
> so many facets that one can get to quite different results depending
> on from which side one approaches it.
> You seem to think of statements and "their" IRIs as connected in time
> and space from the point a statement is constructed to the point(s)
> its IRI is used. So they have to be consistent all the way, blank node
> identities have to survive relabeling etc. The goal is reasonable but
> maybe the conceptualization takes one shortcut too much. Meta
> modelling always introduces a break somewhere in the process, it
> always involves "taking a step back" at some point. So what if we
> introduce a corresponding abstraction into our machinery: a
> distinction between a statement identifier and its representation as
> an IRI.
> I’m thinking of a statement identifier that is always "with" the
> statement - just a handle and necessarily always reflecting any
> changes to the statement like e.g. blank node relabeling. Such
> statement identifiers are implemented by many triple stores already,
> some even expose them via their APIs. This identifier can be used in
> other statements to refer to the statement in question.
> Only when serialization is required will an actual IRI, representing
> the statement, be generated. At this point any blank nodes occurring
> in the statement will be serialized to a value valid in the
> serialization context at that time. Blank node skolemization may be
> used if deemed appropriate to ensure more stable and wider ranging
> reference.
> IIUC that’s the process Holger has sketched to avoid "long" IRIs and
> streamline processing, querying etc. Technically the statement
> identifier resembles more a blank node than an IRI as its real value
> is internal to the system and only exposed to the user if the need
> arises. Like Holger said it probably needs another index internally -
> which is an implementation detail but probably not negligible.
> Technically the serialized identifier doesn’t need to be an IRI but
> there seems to be wide agreement that this would be favorable (and I
> totally concur). How exactly that IRI should be constructed will of
> course be an interesting discussion on its own :-)
> See my reply to Holger regarding this:
> https://lists.w3.org/Archives/Public/public-rdf-star/2020Oct/0069.html
> my problem with this approach is that it is presented (IIUC) as
> syntactic sugar, in contrast to the _extension_ of the RDF model
> currently proposed by the draft, but in practice, it is actually
> extending the model.
> For the record, I am beginning to think that it would be possible to
> define RDF* as syntactic sugar (I'll share that when my ideas are
> clearer), but the "long URI" still does not convince me.
>   best
>> Thomas
> Links:
> ------
> [1] http://dbpedia.org
> [2] geo:2229621gn:population2138551
> [3] http://geonames.org
> [4] geo:2229621dbo:populationTotal2229621

Jerven Tjalling Bolleman
SIB | Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet - 1211 Geneva 4
t: +41 22 379 58 85 - f: +41 22 379 58 58
Jerven.Bolleman@sib.swiss - http://www.sib.swiss
Received on Friday, 30 October 2020 08:23:47 UTC