RE: Twice married to the same person problem, was [Re: owl:sameAs/referential opacity] from Storm, Jonathon on 2020-10-30 (public-rdf-star@w3.org from October 2020)

From: Storm, Jonathon <jonathon.storm@spglobal.com>
Date: Fri, 30 Oct 2020 11:20:20 +0000
To: "Jerven.Bolleman@sib.swiss" <Jerven.Bolleman@sib.swiss>, "Pierre-Antoine Champin" <pierre-antoine.champin@ercim.eu>
CC: thomas lörtsch <tl@rat.io>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-ID: <BN8PR10MB32037E964D91354ED2980288F3150@BN8PR10MB3203.namprd10.prod.outlook.com>
Hi all,

I've not been following these various threads too closely, as it is mostly beyond my learning. So take my opinion as being, perhaps, not fully informed.

I like Jerven's layout here. It is simple and easy to parse and understand.

<< :protein :disease :badForYou >> :attribution  [ :evidence :good ;
                                    :paper :X ] , [ :evidence :flaky ;
                                    :paper :Y ] , [ :evidence :negative ;
                                    :paper :Z ] .

Best,
Jonathon Storm
Pronouns: traditional (he, him, his)
Lead Data Architect, Core Data Architecture, S&P Global Market Intelligence

S&P Global
C: 757-284-7786
jonathon.storm@spglobal.com
www.spglobal.com

-----Original Message-----
From: Jerven Tjalling Bolleman <Jerven.Bolleman@sib.swiss>
Sent: Friday, October 30, 2020 4:23 AM
To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Cc: thomas lörtsch <tl@rat.io>; public-rdf-star@w3.org
Subject: Twice married to the same person problem, was [Re: owl:sameAs/referential opacity]

Hi All,

I have been thinking a bit about the problem that referential opacity tries to solve, and wondering whether the same problem arises even when it is not in play.


So let's descend into celebrity gossip land, sorry!

<< :pam :marriedTo :rick >> :begin "2007"^^xsd:gYear ;
                           :end "2008"^^xsd:gYear.

<< :pam :marriedTo :rick >> :begin "2014"^^xsd:gYear ;
                           :end "2015"^^xsd:gYear.

Here we use the same references (IRIs) and still end up with an analogue of the Superman problem.
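
To make the merge concrete, here is a minimal Python sketch that is not part of the original message. It assumes an RDF graph can be modelled as a plain set of triples and an embedded triple as a tuple, so that two occurrences of the same embedded triple are one and the same node. The variable names are illustrative only.

```python
from itertools import product

# An embedded triple is modelled as a plain tuple; equal tuples are the same node.
marriage = (":pam", ":marriedTo", ":rick")

graph = set()
# First annotation block (2007 to 2008).
graph.add((marriage, ":begin", "2007"))
graph.add((marriage, ":end", "2008"))
# Second annotation block (2014 to 2015), on the *same* embedded triple.
graph.add((marriage, ":begin", "2014"))
graph.add((marriage, ":end", "2015"))

begins = sorted(o for s, p, o in graph if s == marriage and p == ":begin")
ends = sorted(o for s, p, o in graph if s == marriage and p == ":end")

# The graph no longer records which :begin belongs with which :end,
# so every combination is an equally valid reading.
candidate_pairs = list(product(begins, ends))
print(begins, ends)
print(candidate_pairs)
```

Under these assumptions all four annotations hang off a single subject, and a pairing such as 2007 with 2015 cannot be ruled out from the data alone.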

This is something that comes up in UniProt as well.

<< :protein :disease :badForYou >> :evidence :good ;
                                    :paper :X .


<< :protein :disease :badForYou >> :evidence :flaky ;
                                    :paper :Y .

<< :protein :disease :badForYou >> :evidence :negative ;
                                    :paper :Z .

Here too we need to deal with an extra level (which we already do with our current reification approach).

So I was interested in how current property graphs solve this.
And basically they have to separate out the :clarkKent and the :superMan appearances as well to be able to avoid the problem.

Here we actually have a new strength for triple star stores: unlike PGs, we can use nodes and edges as properties on our edges.

Regards,
Jerven

PS. We would do something like this in UniProt:

<< :protein :disease :badForYou >> :attribution  [ :evidence :good ;
                                    :paper :X ] , [ :evidence :flaky ;
                                    :paper :Y ] , [ :evidence :negative ;
                                    :paper :Z ] .
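
As a rough illustration of why the intermediate node helps, the following Python sketch (not from the original message) models the same idea with a set of triples. It uses the three evidence/paper pairs given earlier in this message; the helper names `fresh_bnode`, `attribute` and `attributions` are hypothetical and stand in for whatever a real store would provide.

```python
from itertools import count

_ids = count(1)

def fresh_bnode():
    """Return a new blank-node label (hypothetical helper)."""
    return f"_:b{next(_ids)}"

claim = (":protein", ":disease", ":badForYou")
graph = set()

def attribute(evidence, paper):
    """Attach one attribution node carrying an evidence/paper pair to the claim."""
    node = fresh_bnode()
    graph.add((claim, ":attribution", node))
    graph.add((node, ":evidence", evidence))
    graph.add((node, ":paper", paper))
    return node

for evidence, paper in [(":good", ":X"), (":flaky", ":Y"), (":negative", ":Z")]:
    attribute(evidence, paper)

def attributions(g, target):
    """Recover the (evidence, paper) pairs by walking through each attribution node."""
    result = []
    for s, p, node in g:
        if s == target and p == ":attribution":
            props = {p2: o2 for s2, p2, o2 in g if s2 == node}
            result.append((props[":evidence"], props[":paper"]))
    return sorted(result)

print(attributions(graph, claim))
```

Because each pair lives on its own node, the evidence and the paper stay together even though all three attributions annotate the same embedded triple.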



On 2020-10-29 19:27, Pierre-Antoine Champin wrote:
> On 29/10/2020 13:14, thomas lörtsch wrote:
>
>> On 29. Oct 2020, at 10:07, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>>
>> On 29/10/2020 01:14, thomas lörtsch wrote:
>>
>> Pierre-Antoine,
>>
>> I sympathize with the goal of referential opacity but it seems like
>> you have to work hard against the mechanics of RDF to achieve
>> something within RDF that RDF by its very nature does not only not
>> support but rather aims to get past.
>>
>> I wouldn't go as far as stating that RDF aims to get past referential
>> opacity. For example, Notation 3 has managed since the early days to
>> extend RDF with referential opacity. But that's another topic :)
>>
>> Another question is if the need for referential opacity isn’t rather
>> special and therefore could better be realized through a special
>> mechanism rather than as the guiding design principle.
>> Provenance is only one use case for statement annotation and even
>> then the demands are usually not as extreme as your semantics try to
>> support.
>>
>> Granted. I created a strawpoll to evaluate how much this feature is
>> required.
>>
>> https://github.com/w3c/rdf-star/issues/22

>>
>> The argument that it’s easier to add functionality than to take it
>> away later sounds good and true but maybe it’s not when the system
>> within which you are working - RDF - is already based on another
>> paradigm.
>>
>> Why not go another route and document the original statement as a
>> string
>>
>> Because asking something like "Does Alice say anything about Paris's
>> population" would become cumbersome. But apart from that, of course,
>> this is a way to do it in standard RDF.
>
> Could be done as an additional annotation (e.g. like in line 3), added
> to a statement that is otherwise available to entailment:
>
> << dbr:Paris dbo:populationTotal 2229621 >>
>     :assertedBy <http://dbpedia.org> ;
>     :sourceText "dbr:Paris dbo:populationTotal 2229621" .
> << geo:2229621 gn:population 2138551 >> :assertedBy <http://geonames.org> .
> dbr:Paris owl:sameAs geo:2229621 .
> << geo:2229621 dbo:populationTotal 2229621 >> :assertedBy <http://dbpedia.org> .
> << dbr:Paris gn:population 2138551 >> :assertedBy <http://geonames.org> .
>
> Maybe that would suffice?
>
> Yes, that could be a reasonable trade-off I guess.
>
> Note that embedded triples are uniquely defined by their subject, predicate
> and object, so that should rather be:
>
> << dbr:Paris dbo:populationTotal 2229621>>
>     :asserted [
>        :by <http://dbpedia.org> ;
>        :sourceText "dbr:Paris dbo:populationTotal 2229621"
>     ].
>
> So that several different assertions of the same triple could be
> captured. But that's another issue.
>
>> - withdrawn from all greedy reasoners - when the need arises and in
>> all other cases just live with the unavoidable unhelpful entailment
>> now and then. That’s actually not my idea but yours, below ;-)
>>
>> On 28. Oct 2020, at 19:31, Pierre-Antoine Champin
>> <pierre-antoine.champin@ercim.eu> wrote:
>>
>> (...)
>>
>> RDF(S) semantics makes no distinction between "stated triples" and
>> "inferred triples". So unless we change the semantics of RDF (!),
>>
>> !!
>
> Yes, I wrote that, and you seem to imply that I am contradicting
> myself,
>
> No (although, cheekily, yes ;-), it’s just what got me thinking!
>
>> but I don't think I am ;-)
>
>> RDF(S) semantics knows nothing about "embedded triples", which are
>> neither "stated" (I should probably have written "asserted") nor
>> "inferred". So it is up to us to decide how this new kind of triples
>> should be handled. This is what this whole discussion is about.
>
> I don’t mean to contradict. It rather seems to me that the topic has
> so many facets that one can get to quite different results depending
> on from which side one approaches it.
>
> You seem to think of statements and "their" IRIs as connected in time
> and space from the point a statement is constructed to the point(s)
> its IRI is used. So they have to be consistent all the way, blank node
> identities have to survive relabeling etc. The goal is reasonable but
> maybe the conceptualization takes one shortcut too many. Meta
> modelling always introduces a break somewhere in the process, it
> always involves "taking a step back" at some point. So what if we
> introduce a corresponding abstraction into our machinery: a
> distinction between a statement identifier and its representation as
> an IRI.
>
> I’m thinking of a statement identifier that is always "with" the
> statement - just a handle and necessarily always reflecting any
> changes to the statement like e.g. blank node relabeling. Such
> statement identifiers are implemented by many triple stores already,
> some even expose them via their APIs. This identifier can be used in
> other statements to refer to the statement in question.
>
> Only when serialization is required will an actual IRI, representing
> the statement, be generated. At this point any blank nodes occurring
> in the statement will be serialized to a value valid in the
> serialization context at that time. Blank node skolemization may be
> used if deemed appropriate to ensure more stable and wider ranging
> reference.
>
> IIUC that’s the process Holger has sketched to avoid "long" IRIs and
> streamline processing, querying etc. Technically the statement
> identifier resembles more a blank node than an IRI as its real value
> is internal to the system and only exposed to the user if the need
> arises. Like Holger said it probably needs another index internally -
> which is an implementation detail but probably not negligible.
>
> Technically the serialized identifier doesn’t need to be an IRI but
> there seems to be wide agreement that this would be favorable (and I
> totally concur). How exactly that IRI should be constructed will of
> course be an interesting discussion on its own :-)
>
> See my reply to Holger regarding this:
>
> https://lists.w3.org/Archives/Public/public-rdf-star/2020Oct/0069.html

>
>
> my problem with this approach is that it is presented (IIUC) as
> syntactic sugar, in contrast to the _extension_ of the RDF model
> currently proposed by the draft, but in practice, it is actually
> extending the model.
>
> For the record, I am beginning to think that it would be possible to
> define RDF* as syntactic sugar (I'll share that when my ideas are
> clearer), but the "long URI" still does not convince me.
>
>   best
>
>> Thomas
>
>

--
Jerven Tjalling Bolleman
SIB | Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet - 1211 Geneva 4
t: +41 22 379 58 85 - f: +41 22 379 58 58 Jerven.Bolleman@sib.swiss - http://www.sib.swiss




Received on Friday, 30 October 2020 16:06:32 UTC