Re: owl:sameAs/referential opacity Re: Can RDFstar be defined as only syntactic sugar on top of RDF (Re: weakness of embedded triples) from thomas lörtsch on 2020-10-29 (public-rdf-star@w3.org from October 2020)

From: thomas lörtsch <tl@rat.io>
Date: Thu, 29 Oct 2020 13:14:23 +0100
To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Cc: Jerven.Bolleman@sib.swiss, public-rdf-star@w3.org
Message-Id: <B964AD8C-16ED-4379-98E0-5E8212431D9D@rat.io>
> On 29. Oct 2020, at 10:07, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
> 
> On 29/10/2020 01:14, thomas lörtsch wrote:
> 
>> Pierre-Antoine,
>> 
>> I sympathize with the goal of referential opacity, but it seems you have to work hard against the mechanics of RDF to achieve something within RDF that RDF by its very nature not only does not support but actually aims to get past.
> I wouldn't go as far as stating that RDF aims to get past referential
> opacity. For example, Notation 3 has managed since the early days to
> extend RDF with referential opacity. But that's another topic :)
>> Another question is whether the need for referential opacity isn’t rather special and therefore could better be realized through a special mechanism rather than as the guiding design principle. Provenance is only one use case for statement annotation and even then the demands are usually not as extreme as your semantics try to support.
> 
> Granted. I created a strawpoll to evaluate how much this feature is
> required.
> 
> https://github.com/w3c/rdf-star/issues/22
> 
>> The argument that it’s easier to add functionality than to take it away later sounds good and true but maybe it’s not when the system within which you are working - RDF - is already based on another paradigm.
>> 
>> Why not go another route and document the original statement as a string
> Because asking something like "Does Alice say anything about Paris's
> population" would become cumbersome. But apart from that, of course,
> this is a way to do it in standard RDF.

That could be done as an additional annotation (as in line 3 below), added to a statement that is otherwise available to entailment:

<< dbr:Paris dbo:populationTotal 2229621 >>
    :assertedBy <http://dbpedia.org> ;
    :sourceText "dbr:Paris dbo:populationTotal 2229621" .
<< geo:2229621 gn:population 2138551 >> :assertedBy <http://geonames.org> .
dbr:Paris owl:sameAs geo:2229621 .
<< geo:2229621 dbo:populationTotal 2229621 >> :assertedBy <http://dbpedia.org> .
<< dbr:Paris gn:population 2138551 >> :assertedBy <http://geonames.org> .

Maybe that would suffice?
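
A minimal sketch of why the string annotation might suffice, assuming a toy model in which triples are plain tuples of strings and owl:sameAs is applied by simple one-hop substitution. The names SAME_AS, aliases, expand and source_texts are hypothetical and not part of any RDF* proposal or library. Substitution rewrites the embedded triples (the transparent reading), while the :sourceText literal keeps the original wording:

```python
# Toy model, for illustration only: triples are tuples of strings.
SAME_AS = {("dbr:Paris", "geo:2229621")}

# (embedded triple, annotation property, annotation value)
annotations = [
    (("dbr:Paris", "dbo:populationTotal", "2229621"),
     ":assertedBy", "<http://dbpedia.org>"),
    (("dbr:Paris", "dbo:populationTotal", "2229621"),
     ":sourceText", "dbr:Paris dbo:populationTotal 2229621"),
    (("geo:2229621", "gn:population", "2138551"),
     ":assertedBy", "<http://geonames.org>"),
]


def aliases(term):
    """Return the term plus every term declared owl:sameAs it (one hop)."""
    result = {term}
    for a, b in SAME_AS:
        if term == a:
            result.add(b)
        if term == b:
            result.add(a)
    return result


def expand(annotations):
    """Substitute sameAs aliases inside embedded triples (transparent reading)."""
    out = set()
    for (s, p, o), ann_p, ann_o in annotations:
        for s2 in aliases(s):
            for o2 in aliases(o):
                out.add(((s2, p, o2), ann_p, ann_o))
    return out


def source_texts(annotations, triple):
    """Return the recorded original wordings for an embedded triple."""
    return sorted(ann_o for t, ann_p, ann_o in annotations
                  if t == triple and ann_p == ":sourceText")


expanded = expand(annotations)
```

After expansion the entailed annotation on `geo:2229621 dbo:populationTotal 2229621` exists, but `source_texts(expanded, ("geo:2229621", "dbo:populationTotal", "2229621"))` still yields the string as DBpedia originally wrote it, because the literal is never subject to substitution.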

>> - withdrawn from all greedy reasoners - when the need arises and in all other cases just live with the unavoidable unhelpful entailment now and then. That’s actually not my idea but yours, below ;-)
>>> On 28. Oct 2020, at 19:31, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
>>> 
>>> (...)
>>> 
>>> RDF(S) semantics makes no distinction between "stated triples" and "inferred triples". So unless we change the semantics of RDF (!),
>> !!
> 
> Yes, I wrote that, and you seem to imply that I am contradicting myself,

No (although, cheekily, yes ;-), it’s just what got me thinking!

> but I don't think I am ;-)

> RDF(S) semantics knows nothing about "embedded triples", which are
> neither "stated" (I should probably have written "asserted") nor
> "inferred". So it is up to us to decide how this new kind of triples
> should be handled. This is what this whole discussion is about.

I don’t mean to contradict. It rather seems to me that the topic has so many facets that one can arrive at quite different results depending on which side one approaches it from.

You seem to think of statements and "their" IRIs as connected in time and space from the point a statement is constructed to the point(s) where its IRI is used. So they have to be consistent all the way, blank node identities have to survive relabeling etc. The goal is reasonable but maybe the conceptualization takes one shortcut too many. Meta modelling always introduces a break somewhere in the process; it always involves "taking a step back" at some point. So what if we introduce a corresponding abstraction into our machinery: a distinction between a statement identifier and its representation as an IRI.

I’m thinking of a statement identifier that is always "with" the statement: just a handle, necessarily always reflecting any changes to the statement, e.g. blank node relabeling. Such statement identifiers are implemented by many triple stores already; some even expose them via their APIs. This identifier can be used in other statements to refer to the statement in question.

Only when serialization is required will an actual IRI, representing the statement, be generated. At this point any blank nodes occurring in the statement will be serialized to a value valid in the serialization context at that time. Blank node skolemization may be used if deemed appropriate to ensure more stable and wider ranging reference.

IIUC that’s the process Holger has sketched to avoid "long" IRIs and streamline processing, querying etc. Technically the statement identifier resembles a blank node more than an IRI, as its real value is internal to the system and only exposed to the user if the need arises. Like Holger said, it probably needs another index internally, which is an implementation detail but probably not a negligible one.
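
The handle-versus-IRI distinction could be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular triple store works: the class StatementStore, its methods, the base http://example.org/ and the stmt/ path for minted statement IRIs are all hypothetical. Only the /.well-known/genid/ path follows the RDF 1.1 recommendation for Skolem IRIs.

```python
import itertools


class StatementStore:
    """Toy store in which every statement carries an internal handle."""

    def __init__(self, base="http://example.org/"):
        self.base = base                    # hypothetical base for minted IRIs
        self._counter = itertools.count(1)
        self._by_handle = {}                # handle -> (s, p, o)
        self._by_triple = {}                # (s, p, o) -> handle, the extra index

    def add(self, s, p, o):
        """Store a triple and return its handle; re-adding returns the same one."""
        triple = (s, p, o)
        if triple not in self._by_triple:
            handle = next(self._counter)
            self._by_handle[handle] = triple
            self._by_triple[triple] = handle
        return self._by_triple[triple]

    def triple(self, handle):
        """Return the current form of the statement behind a handle."""
        return self._by_handle[handle]

    def relabel(self, old, new):
        """Rename a blank node; handles stay stable. Assumes `new` is fresh."""
        for handle, triple in list(self._by_handle.items()):
            renamed = tuple(new if term == old else term for term in triple)
            if renamed != triple:
                del self._by_triple[triple]
                self._by_handle[handle] = renamed
                self._by_triple[renamed] = handle

    def _iri(self, handle):
        # Hypothetical minting scheme; how to build this IRI is an open question.
        return f"{self.base}stmt/{handle}"

    def serialize(self, handle):
        """Mint an IRI for the statement only now, skolemizing blank nodes."""
        def out(term):
            if isinstance(term, int):       # reference to another statement
                return self._iri(term)
            if term.startswith("_:"):       # blank node -> Skolem IRI
                return self.base + ".well-known/genid/" + term[2:]
            return term
        return self._iri(handle), tuple(out(t) for t in self._by_handle[handle])


store = StatementStore()
stmt = store.add("_:b1", "ex:knows", "ex:Bob")
note = store.add(stmt, ":assertedBy", "ex:Alice")   # annotate via the handle
store.relabel("_:b1", "_:x9")                       # handle `stmt` is unaffected
```

Inside the store the annotation refers to the handle, so relabeling the blank node never invalidates it; an IRI for the annotated statement appears only in the output of `store.serialize(note)`.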

Technically the serialized identifier doesn’t need to be an IRI but there seems to be wide agreement that this would be favorable (and I totally concur). How exactly that IRI should be constructed will of course be an interesting discussion on its own :-)

Thomas
Received on Thursday, 29 October 2020 12:14:56 UTC