Re: expanding work from quoted triples to graph terms

> On 22. Oct 2023, at 02:42, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Oct 21, 2023, at 10:20 AM, Andy Seaborne <andy@apache.org> wrote:
>> 
>> An extra column could be the effect on the RDF Data Model.
>> https://www.w3.org/TR/rdf12-concepts/#section-rdf-graph
>> 
>> What new RDF terms are there?
>> 
>> ---
>> 
>> Based on the strawpoll last week, can we settle on "graph term" and "occurrence"?
>> 
>> Are there any other conceptual items?
>> 
>> "occurrence" has been used in CG and WG discussions.
> 
> This also seems to be the same as “token” based on the type/token discussion. I think “occurrence” has a more intuitive meaning.

I fear I’m at least partially guilty of introducing the term "occurrence", because I got accustomed to it in the Topic Maps world long, long ago. Actually, I find the term "token" a better fit: it is more customary (as in the type-token dichotomy) and easier to spell and type. The subtle semantic differences between the two, which I’m sure plato.stanford.edu knows about, hopefully needn’t concern us. "Token" is also the term used in RDF 1.1 Semantics [https://www.w3.org/TR/rdf11-mt/#reification], so using it from here on might avoid unnecessary confusion.

>> A graph term is a graph used as an RDF term in the RDF data model. It's like a quoted triple (triple term), but for graphs. (It has value-equality (structural equality)).
>> 
>> Using this terminology is not implying any particular choice of semantics.

Yes, not with respect to referential transparency/opacity.
However, your wording is a bit imprecise, because the question of type or token/occurrence is also a question of semantics, just an orthogonal one (not nitpicking here, just trying to avoid confusion).

>> There is work-in-progress going on
>> https://github.com/w3c/rdf-concepts/pull/67
>> 
>> so bringing that PR conversation together with the "options" would be a way forward.
> 
> There are definitely a couple of views on what this means. In the “Blank Graphs” interpretation, the data model is pretty unaffected with a dataset containing a default graph and zero or more named graphs. In this case, the graph name is a blank node which is also used as the subject or object of another triple (considering the limitations on graphs not referencing themselves). We could consider a topology based on tracking down these references, similar to how SPARQL deals with property paths.
> 
> In the other pure Graph Term model, graphs are, themselves, resources. A graph may be the subject or object of a triple, and the set of such graph terms is disjoint from the set of named graphs. BGP matching works similar to N3 with variable substitutions, but elements from arbitrary graph terms can’t be selected without introducing something like a property path mechanism. In the abstract syntax graph terms do not have identifiers (similar to blank nodes), but they are required for most concrete syntax usage.
> 
> The notion of Graph Isomorphism needs to be extended for any such definition and is distinct from graph equality. Two triples might contain the same graph term or triples might contain graph terms which are not the same, but are isomorphic.
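
To make the distinction between "the same graph term" and "isomorphic graph terms" concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not part of any proposal: graphs are modelled as frozensets of string triples, blank nodes are recognized by a "_:" prefix, and the blank node labels stand in for blank node identity. The helper name `isomorphic` is hypothetical and the check is brute force, so it is only usable on tiny graphs.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption of this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def bnodes(graph):
    # Sorted list of the blank node labels used anywhere in the graph.
    return sorted({t for triple in graph for t in triple if is_bnode(t)})


def isomorphic(g1, g2):
    """Brute-force test: is there a bijection between the blank nodes of g1
    and those of g2 that maps g1 exactly onto g2?"""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        renamed = frozenset(
            tuple(mapping.get(t, t) for t in triple) for triple in g1
        )
        if renamed == g2:
            return True
    return False


g_a = frozenset({("_:x", ":p", ":o")})
g_b = frozenset({("_:y", ":p", ":o")})
g_c = frozenset({("_:x", ":p", ":o")})

print(g_a == g_c)            # True: the same graph term (value equality)
print(g_a == g_b)            # False: different blank nodes, so not the same term
print(isomorphic(g_a, g_b))  # True: equal up to renaming of blank nodes
```

In this reading, two triples that use g_a and g_c as object contain the same graph term, while triples that use g_a and g_b contain graph terms that are merely isomorphic.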

> IMO, Thomas’s Nested Graph proposal seems overly reliant on concrete syntax and I would like to see this more formalized as an extension to the abstract syntax; I’m not sure how differences in transparency are reflected in the abstract syntax.

The proposal provides a lot of description (yes, a bit too much) and some Turtle-ish examples. The text alone, although of course abstract, should explain everything. Can you describe more concretely what you would like to see w.r.t. abstract syntax? One idea that I get from what you write above is that you would like to know if nested graphs need a representation in the abstract syntax. I think they don’t. They are named and that name is an IRI or a blank node. The statements they contain are regular statements and part of the target graph. What else might be needed?

Concerning transparency: the proposal assumes referential transparency in most places - nested graphs, unasserted graphs in {"…"} shorthand syntax - just like regular RDF, so in that respect I assume that no change to the abstract syntax is needed.

To accommodate any further need for more specific semantics, we propose to introduce RDF graph literals, which can then be transcluded with any semantics one desires (and defines, and implements...). The proposal sketches a set of possible semantic characteristics and useful combinations thereof, but all that is just to demonstrate how the proposal enables more specific semantics. They are not developed in detail, and no implementation of such semantics is provided either. But the proposal hopes that on this basis one could implement Notation3 or BLogic or local Closed World Environments etc. with nested graphs.

>> ---
>> 
>> My understanding at the moment is that the "blank graph" variants are compatible with the graph component of a named graph pair [*] being a graph term.
>> 
>> 
>> "Blank graph" variants:
>> _:a { :s :p :o }
>> 
>> { :s :p :o } is a graph term, _:a is a resource for the occurrence.
>> 
>> "Graph terms"
>> _:a rdf:occurrenceOf{ :s :p :o }
>> 
>> { :s :p :o } is a graph term, _:a is a resource for the occurrence.
> 
> One thing in favor of the Blank Graph proposal is that it seemingly has a limited impact on the data model, although it requires some re-interpretation of what named graphs identified by blank nodes mean. It’s also the view that specs based on JSON-LD have taken (e.g., Verifiable Credentials).

I’m not a fan of that approach. I fear it’s a hack, and it won’t hold. To me the trouble starts with 
ex:A rdf:occurrenceOf{ :s :p :o }
Also, blank nodes can be skolemized; what then?

However, I do confess that at one point I realized that I provide no way to refer to the type. Now I'm thinking about using graph literals for that purpose. They are as immutable as it gets. Multiple references to a literal are clearly owl:sameAs. Of course they need to be queryable, but that is doable and there are more use cases than this one, so the effort is well-founded.

So, e.g.

":a :b :c"^^rdf:ttl :occured :Yesterday .

describes something ABOUT the type. That about-ness, which shouldn’t be allowed to touch/change/manipulate the graph ITSELF, is quite well reflected by referring to a representation of the type, a literal. The nested graph proposal defines syntactic sugar for unasserted assertions, referentially transparent like RDF standard reification:

{" :a :b :c "} :occured :Yesterday .

The nested graph proposal imagined that as syntactic sugar for standard reification (which is defined on tokens), but the nested graph approach is already tokens everywhere. It might make much more sense to use that syntax to represent types. That they are not asserted might just be a feature: what better way to make sure that you are talking about a type, an abstract thing, than by keeping it separate from the statements that are actually asserted!

In the end, however, we have to remember that we are talking about authoring here: describing a type, asserting something to be common to all tokens/occurrences of that type. In usage, however (navigating, querying), what is a type and what is a token is very much in the eye of the beholder. If you’re not interested in the annotations to a nested graph then the graph token *is* your type, no matter what the definitions say. If suddenly one of those annotations gets into focus, you have another (more specific) type.
You may annotate a graph type with the number of times it occurred in some dataset. But I may differentiate the case when the graph type occurred this number of times from another case where it occurred that number of times. Your type becomes my token… 
In a way these are shifting sands. The nested graph proposal argues for late binding semantics as far as possible: the information desire, i.e. the query, establishes the difference between the universal core (the type) and the variants (the tokens), every query anew.
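
A rough Python sketch of what I mean by late binding, purely as an illustration: the token names (:g1 etc.), the annotation properties (:source, :date) and the helper `group_by` are all made up for this example and are not defined anywhere in the proposal. Each token is an occurrence of the same graph with its own annotations; the "query" decides which annotations matter, and thereby where the line between type and token falls.

```python
from collections import defaultdict

# The shared graph content: an immutable set of triples, compared by value.
GRAPH = frozenset({(":a", ":b", ":c")})

# Hypothetical tokens (occurrences), each with its own name and annotations.
tokens = [
    {"name": ":g1", "graph": GRAPH,
     "annotations": {":source": ":Alice", ":date": "2023-10-21"}},
    {"name": ":g2", "graph": GRAPH,
     "annotations": {":source": ":Alice", ":date": "2023-10-22"}},
    {"name": ":g3", "graph": GRAPH,
     "annotations": {":source": ":Bob", ":date": "2023-10-22"}},
]


def group_by(tokens, keys):
    """Group tokens by graph content plus the annotations the query cares
    about. Each resulting group plays the role of a 'type' for that query."""
    groups = defaultdict(list)
    for tok in tokens:
        key = (tok["graph"], tuple(tok["annotations"].get(k) for k in keys))
        groups[key].append(tok["name"])
    return dict(groups)


# Ignoring all annotations: one type with three tokens.
print(len(group_by(tokens, [])))                   # 1

# With :source in focus: two more specific types.
print(len(group_by(tokens, [":source"])))          # 2

# With :source and :date in focus: every token is its own type.
print(len(group_by(tokens, [":source", ":date"])))  # 3
```

Nothing in the data changes between the three calls; only the information desire does.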

> Gregg
> 
>>   Andy
>> 
>> [*] A named graph is a pair (resource reference, graph)
>> https://www.w3.org/TR/rdf11-concepts/#dfn-named-graph

and while RDF leaves it open for named graphs what the name means, in the nested graph proposal the name of a nested graph names the pair name+graph. I think that’s the way to go. The intuition behind it is: you don’t create and name that graph without a reason. So it has a meaning, even if there are no annotations on it and the name itself is completely random. Of course that might seem a little extreme, but in almost all cases there *will* be annotations or the name itself will carry meaning - and the last thing you want is the naming semantics to change as soon as the first annotation is made.

Thomas



>> On 20/10/2023 17:17, Niklas Lindström wrote:
>>> Dear all,
>>> I picked up the suggestion in the telecon and have drafted an overview
>>> of the options (and proposals) that (AFAIK) are on the table ("RDF
>>> options for triples about triples"). Right now it's in a Google
>>> Spreadsheet at:
>>> https://docs.google.com/spreadsheets/d/1pzA5AYkzEO-Mr6ClV4KNjUf4bAsrCz_ZWS9dMEFgh1o/edit?usp=sharing
>>> I can move this to our wiki [1] if that's preferable. (I think so, but
>>> it demands a bit more to edit it as a markdown table. Otherwise I can
>>> grant everyone edit rights one by one, unless we already have a shared
>>> Google Docs folder I've missed?)
>>> I'm trying to single out features from these, to simplify assessments.
>>> I've made some footnotes and questions in the sheet for starters.
>>> If anyone wants to have a call hashing out these details, I'm all for
>>> it. (Perhaps we could use next week's cancelled Semantics TF timeslot
>>> for that? It depends on where we are after the regular call of
>>> course.)
>>> All the best,
>>> Niklas
>>> [1]: https://github.com/w3c/rdf-star-wg/wiki
>> 
> 
> 

Received on Sunday, 22 October 2023 10:42:51 UTC