Re: RDF* and conjectures from thomas lörtsch on 2021-09-21 (public-rdf-star@w3.org from September 2021)

From: thomas lörtsch <tl@rat.io>
Date: Tue, 21 Sep 2021 13:30:47 +0200
To: James Anderson <anderson.james.1955@gmail.com>
Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-Id: <16EC59B5-DE51-41B6-A804-B60DD296B479@rat.io>
> On 21. Sep 2021, at 00:54, James Anderson <anderson.james.1955@gmail.com> wrote:
> 
> good evening;
> 
>> On 2021-09-20, at 23:15:03, thomas lörtsch <tl@rat.io> wrote:
>> 
>> ...
>>>> 2: it does answer this one, treating graphs as independent contexts
>>> 
>>> they are distinctly designated sets of statements to be combined into the dataset as per the prolog and made the target of graph patterns dependent on context.
>> 
>> What prolog are you referring to?
> 
> the term was inaccurate. the grammar calls it the "DatasetClause".
> 
>    https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#rDatasetClause
> 
>> I guess you mean the FROM and FROM NAMED clauses that define the scope of the query.  That's not the aspect I'm trying to understand. Example 24 in the RDF 1.1 WG Note on the semantics of datasets [https://www.w3.org/TR/2014/NOTE-rdf11-datasets-20140225/#relationship-with-sparql-entailment-regime] shows how the RDFS entailment regime is employed per graph but not across graphs. That’s what led me to the above conclusion.
> 
> once the processor has applied the instructions in the dataset clause, at any given point when a graph pattern is being unified, there is just one target graph.

Now I get your point. Yes, you’re right. So the Note’s argument doesn’t only apply to named graphs but to all sets of statements that can be targeted in SPARQL. Makes sense, of course.
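
The point about "just one target graph" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any actual SPARQL processor is built: the store contents, the ex: names and the functions build_dataset and match are all made up, and blank nodes (which a real RDF merge would have to keep apart) are left out.

```python
# Hypothetical store: graph name -> set of (subject, predicate, object) triples.
STORE = {
    "ex:g1": {("ex:a", "ex:p", "ex:b")},
    "ex:g2": {("ex:b", "ex:p", "ex:c")},
    "ex:g3": {("ex:c", "ex:p", "ex:d")},
}

def build_dataset(store, from_graphs, from_named):
    """Mimic a dataset clause: FROM graphs are merged into the default
    graph, FROM NAMED graphs stay individually addressable."""
    default = set()
    for name in from_graphs:
        default |= store.get(name, set())
    named = {name: set(store.get(name, set())) for name in from_named}
    return default, named

def match(graph, pattern):
    """Match one triple pattern (None stands for a variable) against
    exactly ONE target graph and return the matching triples, sorted."""
    return sorted(
        t for t in graph
        if all(p is None or p == v for p, v in zip(pattern, t))
    )

# Roughly: FROM ex:g1 FROM ex:g2 FROM NAMED ex:g3
default, named = build_dataset(STORE, ["ex:g1", "ex:g2"], ["ex:g3"])

# A pattern outside any GRAPH clause sees only the merged default graph ...
in_default = match(default, (None, "ex:p", None))
# ... and a pattern inside GRAPH ex:g3 { ... } sees only that graph.
in_g3 = match(named["ex:g3"], (None, "ex:p", None))

print(in_default)
print(in_g3)
```

Whichever graph is active, the matcher is handed a single set of triples, so any per-graph argument applies equally to the default graph assembled from FROM clauses.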

>>>> 3: it does answer this one, treating graphs as referentially transparent
>>> 
>>> this is not definitively answered as the combination method is suggested, but not stipulated.
>> 
>> As in the previous question I’m not sure we speak about the same topic. The scope of the query - as per my understanding of your previous answer - IMO has nothing to do with it. I see only two possible answers to this question: the statements contained in a graph are either interpreted (as any regular RDF statement, irrespective of the entailment regime chosen) or quoted. Now reading more documents, SPARQL 1.1 Entailment Regimes [https://www.w3.org/TR/2013/REC-sparql11-entailment-20130321/] says in the introduction that
>> 
>>   The SPARQL 1.1 Query specification [SPARQL 1.1 Query] defines the evaluation of a basic graph pattern by means of subgraph matching. This form of basic graph pattern evaluation is also called simple entailment since it can equally be defined in terms of the simple entailment relation between RDF graphs.
> 
> where does that allow "or quoted"?

Nowhere, but I needed to state that for completeness. See below.

>> Other more elaborate entailment regimes like RDFS-entailment of course work in the realm of interpretation, not on quoted literals. I’d say that confirms that SPARQL generally treats graphs as referentially transparent.
> 
> that is the reading which i follow when implementing query processing under simple- and d-entailment.
> i understand neither where your notion of "quoting" originates nor what relevance it has to sparql processing under those regimes.

In my first message in this thread I explained that I understand the RDF 1.1 WG's Note on datasets to discuss 4 dimensions of the problem of named graph semantics, and one of them is whether graphs are treated as referentially transparent or opaque. In other approaches like the original Named Graphs semantics by Carroll et al 2005 (or, slightly related, the proposed semantics for RDF-star quoted triples) they are indeed defined as referentially opaque, quoted. Now with respect to SPARQL it may seem like an easy-to-answer question or even a nonsensical question but still, to completely cover the 4 dimensions identified in the Note, I had to discuss it.
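
The difference between the two readings can be shown with a toy Python sketch. Everything in it is an assumption made for illustration: the SAME table stands in for whatever mechanism (an entailment regime, for instance) would establish that two names denote the same thing, and the names and functions are invented. It is not a rendering of the Named Graphs or RDF-star semantics.

```python
# Hypothetical co-denotation table: both names denote the same resource.
SAME = {":clarkKent": ":superman"}

def canon(term):
    """Map a term to a canonical name for what it denotes."""
    return SAME.get(term, term)

graph = {(":lois", ":admires", ":superman")}

def holds_transparent(graph, triple):
    # Referentially transparent: terms are interpreted, so a statement
    # holds if it matches after replacing names by what they denote.
    interpreted = {tuple(map(canon, t)) for t in graph}
    return tuple(map(canon, triple)) in interpreted

def holds_opaque(graph, triple):
    # Referentially opaque ("quoted"): only the surface form counts,
    # co-denoting names are not interchangeable.
    return triple in graph

query = (":lois", ":admires", ":clarkKent")
transparent = holds_transparent(graph, query)
opaque = holds_opaque(graph, query)

print(transparent, opaque)
```

Under the transparent reading the query succeeds because the two names are interchangeable; under the quoted reading it fails because only the literal spelling of the statement is recorded.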

> some other entailment regime is free to specify how unification is to proceed and in that way to limit matching so as to effect quoting.
> some other form of dataset clause could well contribute to that variation.
> 
>> 
>>>> 4: it does answer this one, treating graphs as occurrences
>>>> (here I’m not entirely sure but as SPARQL does access graphs by name, not by their content…)
>>> 
>>> this is also not definitively answered, as it is described that repeated designators may or may not designate the same set of statements.
>> 
>> Variability over time is an issue that RDF in general doesn’t tackle. But in my intuition a name that refers to a type should really be stable whereas a name referring to an occurrence could be allowed more flexibility. So I would take this aspect as a slight confirmation of my provisional assessment that named graphs to SPARQL are occurrences.
> 
> for any designators related to a state which is not immutable, is this not by definition?

That seems like a good argument to me but I don’t know much about the theory behind immutable datastructures.

>>> that there are variants is not material to the principal issue, as it is not one of interoperability but of ability.
>>> any given sparql implementation must have a complete answer to those questions or its operators will not close.
>> 
>> For an implementation not to terminate would indeed be a bad pre-condition but interoperability is what we strive for after all when defining standards, isn’t it?
>> 
>>>> So when ASKing, SPARQL defines most aspects of the semantics of named graphs but not what the graph name denotes.
>>> 
>>> how can a process which must be applied to a concrete collection of statements which were designated by graph names do one without the other?
>> 
>> Which one and which other are you referring to?
> 
> any given sparql processor must "define[] ... what the graph name denotes."

In this section of my original mail I was just summing up the results. We discussed this issue already above and, as you pointed out, I had overlooked the definition in the SPARQL spec which does indeed render my concerns w.r.t. naming semantics moot.

My original argument, however, was that SPARQL only retrieves, and that its syntax unambiguously defines whether an IRI is used to address a named graph. Maybe that isn’t true, as for example there’s also SPARQL Update, which I have not looked into yet. But if you’re interested in understanding my original reasoning from before I read the spec’s definition of what the graph name denotes, read the following paragraph again:

>>>> Using the graph name in the FROM clause refers to the graph. Using it to annotate the graph however is shaky territory: nothing in SPARQL prevents me from naming my graph with triples from Paris with the URI <paris.com>. So I have no soundly defined way to annotate that graph, right?
>>> 
>>> which graph was that?
>> 
>> The graph that is addressable by that name in the dataset.
> 
> which graph was that?
> (this is not a rhetorical question. once you answer it

For example:

<paris.com>  {
    :me :goingTo <paris.com> ;
        :purpose :conference .
}

<paris.com> :created "53 b.c." , 
                     "just now" . 

It’s easy to query for the graph named <paris.com> because using the IRI in the FROM clause determines that it is used as a graph name. But it’s hard to annotate the graph so named because the IRI used to name it denotes two different things: the graph and the city (not to mention the website itself, but let’s refer to the Cool URIs document for that discussion). But I take it now that not only basic principles of web architecture but also the SPARQL spec itself discourage such unsound naming practices. I probably wouldn’t argue about this seemingly obvious point at all if the Note didn’t discuss it (and Antoine Z. didn’t still employ examples where he reasons over graphs that have the same name but different sets of statements).
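
The asymmetry between querying and annotating can be sketched in Python with the data from the example. The dictionary layout and the function names graph_by_name and annotations are assumptions made for illustration only; they are not taken from any SPARQL implementation.

```python
PARIS = "<paris.com>"

# The named graph from the example: graph name -> set of triples.
named_graphs = {
    PARIS: {
        (":me", ":goingTo", PARIS),
        (":me", ":purpose", ":conference"),
    },
}

# The annotations from the example, held in the default graph.
default_graph = {
    (PARIS, ":created", "53 b.c."),
    (PARIS, ":created", "just now"),
}

def graph_by_name(name):
    # Querying: the IRI is used in graph-name position, so it is
    # unambiguous that a graph is being addressed.
    return named_graphs[name]

def annotations(iri):
    # Annotating: the IRI is an ordinary subject here. Nothing in the
    # triple says whether the city or the graph was "created".
    return sorted(o for s, p, o in default_graph
                  if s == iri and p == ":created")

triples = graph_by_name(PARIS)
created = annotations(PARIS)

print(len(triples))
print(created)
```

Both creation dates come back for the same subject, and the data alone gives no way to tell which one describes the graph and which one the city.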

> , you will have the answer to your original question.)


>>> why does it matter that there is no universal relation between the designator for a graph and the set of statements which it contains?
>>> for a given implementation, there must be a necessary relation, but it need not be universal.
>> 
>> Within an application you are free to do whatever you want. If you want to exchange data on the web and integrate data from other sources in your application, your life becomes a lot easier if you don’t have to guess or read up the documentation or even the source code to figure out what naming semantics are employed in the data you are interested in. There need not be one universal relation. A default relation and a way to declare alternative relation types would be very helpful.
> 
> that would be nice, but it is not material to the question which originated this thread, which related more to "possibility" than to "convenience".

You made the claim that the semantics of named graphs, while not standardized in the RDF specification, is de facto defined through the way SPARQL treats them - I hope that captures correctly what you said. I’m trying to check that claim by discussing it along those dimensions that I found the Note on dataset semantics discussing.
If you are indeed right - and it seems you are - then that would be a powerful argument to standardize Named Graphs semantics as referentially transparent occurrences denoted by their name. And that would not only be very useful in itself - finally a standard semantics for named graphs - but would also complement nicely the proposed semantics for RDF-star quoted triples (probably extended to quoted graphs) and/or an RDF literal datatype as per Antoine Zimmermann's proposal.
So it’s more than just "nice" or "convenient", it would be really useful and put an end to a long and exhausting debate.

>>>> Or can we exclude this possibility because it violates basic principles of web architecture w.r.t. URI collisions?
>>> 
>>> there is nothing in the recommendation which stipulates that the content must be the resource which would be retrieved via http (or whatever) protocol were the designator to be treated as a location.
>>> there are implementations in which the graph designators bear no necessary relation to resources in the internet.
>> 
>> You seem to see that as the exception. I had hoped that it would be rather the norm.
>> 
>>> it is even permitted to designate a graph with a blank node.
>> 
>> Yes, and that’s fully in line with what I deem reasonable. I would like the graph name to only name that graph and nothing else, not a date of ingestion, not a topic like Paris etc. As I just figured out above the SPARQL spec seems to agree.
> 
> yes, but you seem disappointed that, when you return to your environment at some later point or when you direct your (nominally simultaneous) attention to some other environment, you will face the reality, that the set of triples is not the same as they were "originally".

I don’t understand what you refer to: changing state, entailments, co-denotation, the wobbly condition introduced by OWA and NUNA? I’m not disappointed by that. I’m not a big fan of quoting at all, as the primary goal of RDF is to facilitate data integration via a notion of its meaning, and quoting runs counter to that purpose. I do understand however that some applications have to close the world locally and I support the desire to do this not only by out-of-band means but by adding mechanisms to RDF that make it easy to declare (sets of) statements as referentially opaque and "quoted".

>>> yes, rdf originated as a way to describe web resources.
>>> it has advanced well beyond that.
>> 
>> The question is not whether the resource so identified is a web resource. The question is whether one identifier is allowed to refer to two different resources simultaneously.
> 
> in space or in time?
> how do the targets of sparql update clauses relate to this?

SPARQL update is on the list of topics I have to look into…

>> As the <paris.com> example shows this may work in one scenario - querying - but not in another - annotating. Generally it is discouraged in web architecture, and for good reason.
> 
> please elaborate on the distinction within the scope of a given query.

I hope my example above answered that.
 
> beyond that, the web provides a poor semblance of reality and will need to advance a long way before it will be able to dictate.

Tiny steps, one after the other…

> best regards, from berlin,
> 
> 


Thomas
Received on Tuesday, 21 September 2021 11:31:08 UTC