Twice married to the same person problem, was [Re: owl:sameAs/referential opacity] from Jerven Tjalling Bolleman on 2020-10-30 (public-rdf-star@w3.org from October 2020)

From: Jerven Tjalling Bolleman <Jerven.Bolleman@sib.swiss>
Date: Fri, 30 Oct 2020 09:23:07 +0100
To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Cc: thomas lörtsch <tl@rat.io>, public-rdf-star@w3.org
Message-ID: <15d039cbd31df1a45974f097f3826ea6@imap.sib.swiss>
Hi All,

I have been thinking a bit about the problem that referential opacity
tries to solve, and I wonder whether it is not a problem even when
referential opacity is not in play.


So let's descend into celebrity gossip land, sorry!

<< :pam :marriedTo :rick >> :begin "2007"^^xsd:gYear ;
                           :end "2008"^^xsd:gYear.

<< :pam :marriedTo :rick >> :begin "2014"^^xsd:gYear ;
                           :end "2015"^^xsd:gYear.

Here we use the same references (IRIs) and still end up with an analogue
to the Superman problem.
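
A minimal Python sketch of why the two annotation blocks above collide,
assuming (as noted further down in this thread) that an embedded triple is
identified only by its subject, predicate and object. The `annotations`
dictionary and the `annotate` helper are made up for illustration and are
not part of any RDF library:

```python
from collections import defaultdict

# Hypothetical in-memory model: an embedded triple is identified only by
# its (subject, predicate, object), so the tuple itself is the key.
annotations = defaultdict(set)


def annotate(triple, prop, value):
    """Attach one (property, value) annotation to an embedded triple."""
    annotations[triple].add((prop, value))


marriage = (":pam", ":marriedTo", ":rick")

# First marriage.
annotate(marriage, ":begin", 2007)
annotate(marriage, ":end", 2008)
# Second marriage: the very same embedded triple.
annotate(marriage, ":begin", 2014)
annotate(marriage, ":end", 2015)

begins = sorted(v for p, v in annotations[marriage] if p == ":begin")
ends = sorted(v for p, v in annotations[marriage] if p == ":end")

# All four values hang off one node; nothing records which :begin
# belongs to which :end.
print(len(annotations), begins, ends)
```

Both blocks end up on a single key, so the pairing of 2007 with 2008 and
of 2014 with 2015 is no longer recoverable from the data.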

This is something that comes up in UniProt as well.

<< :protein :disease :badForYou >> :evidence :good ;
                                    :paper :X .


<< :protein :disease :badForYou >> :evidence :flaky ;
                                    :paper :Y .

<< :protein :disease :badForYou >> :evidence :negative ;
                                    :paper :Z .

Here too we need to deal with an extra level (which we already do with
our current reification approach).

So I was interested in how current property graphs solve this.
Basically, they too have to separate out the :clarkKent and the :superMan
appearances to be able to avoid the problem.

Here we actually have a new strength for triple-star stores: unlike PGs,
we can use nodes and edges as properties on our edges.

Regards,
Jerven

PS. We would do something like this in UniProt:
<< :protein :disease :badForYou >> :attribution
    [ :evidence :good ;     :paper :X ] ,
    [ :evidence :flaky ;    :paper :Y ] ,
    [ :evidence :negative ; :paper :Z ] .
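
A rough Python sketch of that extra level, using the three evidence/paper
pairs from the UniProt example earlier in this mail. The `triples` set and
the `new_bnode`, `attribute` and `attributions` helpers are hypothetical
names chosen for illustration, not UniProt code or a real library API:

```python
from itertools import count

_bnode_ids = count(1)
triples = set()  # plain (subject, predicate, object) statements


def new_bnode():
    """Mint a fresh blank node label (illustrative only)."""
    return f"_:b{next(_bnode_ids)}"


def attribute(embedded, evidence, paper):
    """Group one evidence/paper pair under its own blank node."""
    node = new_bnode()
    triples.add((embedded, ":attribution", node))
    triples.add((node, ":evidence", evidence))
    triples.add((node, ":paper", paper))
    return node


def attributions(embedded):
    """Return the (evidence, paper) pairs recorded for an embedded triple."""
    pairs = []
    for s, p, node in triples:
        if s == embedded and p == ":attribution":
            ev = next(o for s2, p2, o in triples
                      if s2 == node and p2 == ":evidence")
            pa = next(o for s2, p2, o in triples
                      if s2 == node and p2 == ":paper")
            pairs.append((ev, pa))
    return sorted(pairs)


claim = (":protein", ":disease", ":badForYou")
attribute(claim, ":good", ":X")
attribute(claim, ":flaky", ":Y")
attribute(claim, ":negative", ":Z")

print(attributions(claim))
```

Because each pair sits under its own blank node, the evidence level stays
attached to the paper it came from, even though all three attributions
hang off the same embedded triple.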



On 2020-10-29 19:27, Pierre-Antoine Champin wrote:
> On 29/10/2020 13:14, thomas lörtsch wrote:
> 
>> On 29. Oct 2020, at 10:07, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>> 
>> On 29/10/2020 01:14, thomas lörtsch wrote:
>> 
>> Pierre-Antoine,
>> 
>> I sympathize with the goal of referential opacity but it seems like
>> you have to work hard against the mechanics of RDF to achieve
>> something within RDF that RDF by its very nature does not only not
>> support but rather aims to get past.
>> 
>> I wouldn't go as far as stating that RDF aims to get past referential
>> opacity. For example, Notation 3 has managed since the early days to
>> extend RDF with referential opacity. But that's another topic :)
>> 
>> Another question is if the need for referential opacity isn’t
>> rather special and therefore could better be realized through a
>> special mechanism rather than as the guiding design principle.
>> Provenance is only one use case for statement annotation and even
>> then the demands are usually not as extreme as your semantics try to
>> support.
>> 
>> Granted. I created a strawpoll to evaluate how much this feature is
>> required.
>> 
>> https://github.com/w3c/rdf-star/issues/22
>> 
>> The argument that it’s easier to add functionality than to take it
>> away later sounds good and true but maybe it’s not when the system
>> within which you are working - RDF - is already based on another
>> paradigm.
>> 
>> Why not go another route and document the original statement as a
>> string
>> 
>> Because asking something like "Does Alice say anything about Paris's
>> population" would become cumbersome. But apart from that, of course,
>> this is a way to do it in standard RDF.
> 
> Could be done as an additional annotation (e.g. like in line 3), added
> to a statement that is otherwise available to entailment:
> 
> << dbr:Paris dbo:populationTotal 2229621 >>
>     :assertedBy <http://dbpedia.org> ;
>     :sourceText "dbr:Paris dbo:populationTotal 2229621" .
> << geo:2229621 gn:population 2138551 >> :assertedBy <http://geonames.org> .
> dbr:Paris owl:sameAs geo:2229621 .
> << geo:2229621 dbo:populationTotal 2229621 >> :assertedBy <http://dbpedia.org> .
> << dbr:Paris gn:population 2138551 >> :assertedBy <http://geonames.org> .
> 
> Maybe that would suffice?
> 
> Yes, that could be a reasonable trade-off I guess.
> 
> Note that embedded triples are uniquely defined by their subject, predicate
> and object, so that should rather be:
> 
> << dbr:Paris dbo:populationTotal 2229621 >>
>     :asserted [
>        :by <http://dbpedia.org> ;
>        :sourceText "dbr:Paris dbo:populationTotal 2229621"
>     ] .
> 
> So that several different assertions of the same triple could be
> captured. But that's another issue.
> 
>> - withdrawn from all greedy reasoners - when the need arises and in
>> all other cases just live with the unavoidable unhelpful entailment
>> now and then. That’s actually not my idea but yours, below ;-)
>> 
>> On 28. Oct 2020, at 19:31, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>> 
>> (...)
>> 
>> RDF(S) semantics makes no distinction between "stated triples" and
>> "inferred triples". So unless we change the semantics of RDF (!),
>> 
>> !!
> 
> Yes, I wrote that, and you seem to imply that I am contradicting
> myself,
> 
> No (although, cheekily, yes ;-), it’s just what got me thinking!
> 
>> but I don't think I am ;-)
> 
>> RDF(S) semantics knows nothing about "embedded triples", which are
>> neither "stated" (I should probably have written "asserted") nor
>> "inferred". So it is up to us to decide how this new kind of triples
>> should be handled. This is what this whole discussion is about.
> 
> I don’t mean to contradict. It rather seems to me that the topic has
> so many facets that one can get to quite different results depending
> on from which side one approaches it.
> 
> You seem to think of statements and "their" IRIs as connected in time
> and space from the point a statement is constructed to the point(s)
> its IRI is used. So they have to be consistent all the way, blank node
> identities have to survive relabeling etc. The goal is reasonable but
> maybe the conceptualization takes one shortcut too many. Meta
> modelling always introduces a break somewhere in the process, it
> always involves "taking a step back" at some point. So what if we
> introduce a corresponding abstraction into our machinery: a
> distinction between a statement identifier and its representation as
> an IRI.
> 
> I’m thinking of a statement identifier that is always "with" the
> statement - just a handle and necessarily always reflecting any
> changes to the statement like e.g. blank node relabeling. Such
> statement identifiers are implemented by many triple stores already,
> some even expose them via their APIs. This identifier can be used in
> other statements to refer to the statement in question.
> 
> Only when serialization is required will an actual IRI, representing
> the statement, be generated. At this point any blank nodes occurring
> in the statement will be serialized to a value valid in the
> serialization context at that time. Blank node skolemization may be
> used if deemed appropriate to ensure more stable and wider ranging
> reference.
> 
> IIUC that’s the process Holger has sketched to avoid "long" IRIs and
> streamline processing, querying etc. Technically the statement
> identifier resembles more a blank node than an IRI as its real value
> is internal to the system and only exposed to the user if the need
> arises. Like Holger said it probably needs another index internally -
> which is an implementation detail but probably not negligible.
> 
> Technically the serialized identifier doesn’t need to be an IRI but
> there seems to be wide agreement that this would be favorable (and I
> totally concur). How exactly that IRI should be constructed will of
> course be an interesting discussion on its own :-)
> 
> See my reply to Holger regarding this:
> 
> https://lists.w3.org/Archives/Public/public-rdf-star/2020Oct/0069.html
> 
> 
> My problem with this approach is that it is presented (IIUC) as
> syntactic sugar, in contrast to the _extension_ of the RDF model
> currently proposed by the draft, but in practice, it is actually
> extending the model.
> 
> For the record, I am beginning to think that it would be possible to
> define RDF* as syntactic sugar (I'll share that when my ideas are
> clearer), but the "long URI" still does not convince me.
> 
>   best
> 
>> Thomas
> 
> 

-- 
Jerven Tjalling Bolleman
SIB | Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet - 1211 Geneva 4
t: +41 22 379 58 85 - f: +41 22 379 58 58
Jerven.Bolleman@sib.swiss - http://www.sib.swiss
Received on Friday, 30 October 2020 08:23:47 UTC