Re: Summary of my proposal for using named graphs from Thomas Lörtsch on 2023-11-19 (public-rdf-star-wg@w3.org from November 2023)

From: Thomas Lörtsch <tl@rat.io>
Date: Sun, 19 Nov 2023 22:32:12 +0100
To: Niklas Lindström <lindstream@gmail.com>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-Id: <DA4D0A49-1355-4700-BC38-8ED01150EB56@rat.io>

Hi Niklas,

> On 16. Nov 2023, at 17:54, Niklas Lindström <lindstream@gmail.com> wrote:
> 
> My proposal is about building upon named graphs (all kinds; named by
> IRIs or bnode id:s). It hinges upon two new formal rules for graph
> store management.
> 
> Named graphs are the association of a graph to any kind of resource,
> within a dataset. This association is undefined today, and could do
> with a *weak* formal definition, along the lines of: A resource
> "names" an RDF graph within the context of a given dataset. This means
> that the graph is *some kind of* representation of the message or
> meaning of this resource, within this dataset. It could be the graph
> snapshot itself, or a very indirect form of association, such as a
> document, event or even a person. The only requirement is that,
> *within this dataset context*, this resource has no other meaningful
> graph interpretation.

I think that might be too strong already, at least judging from Andy’s repeatedly voiced concerns (OTOH, I really wonder whether anybody cares in practice). The NNG proposal is more targeted: the nesting properties define their range to be governed by specific naming semantics, because that’s what nesting requires (one can nest graphs, but not anything else that the naming IRI may refer to). Also, any IRI in any statement can be annotated as referring to a graph if one wishes to annotate the graph unambiguously (and not something else that the IRI naming the graph also refers to).
So defining naming semantics targets only those uses that need to disambiguate what a name refers to, nothing else. Of course, one may also use the SPARQL dataset description vocabulary to define the naming semantics for the whole dataset once and for all, but that’s completely optional.

> Then define *bound*, or *owned* named graphs as: a named graph which
> is rdfx:boundBy another named graph, *within this dataset*. This is
> done through an assertion of such a relationship in the default graph.
> (Exact name of this relation to be discussed of course. I recently
> called it rdfx:namedBy, sys:quoteFrom, and generally an "appendix"
> relation. I think "bound" or "owned" is more strict.)
> 
> For these bound named graphs, define two rules for graph stores:
> - A bound named graph is *not* asserted in the union default graph.
> - A bound named graph is deleted when its binding named graph is
> deleted. (This also happens when it is updated; i.e. *replaced*.)

I assume you explained in an earlier mail why you want bound graphs to not be asserted in the default graph, and I missed it. Can you provide me a link, or give a short motivation?
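
To make sure I read the two quoted rules correctly, here is a minimal Python sketch of how I understand them. It is not part of either proposal: `Dataset`, `bind`, `union_default_graph` and `delete_graph` are hypothetical names made up for illustration, and `rdfx:boundBy` is just the placeholder name from your mail. I also assume that the union default graph consists of the default graph's own triples plus all named graphs that are not bound, which the rules don't spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# Placeholder relation name from the quoted proposal; the final name is open.
BOUND_BY = "rdfx:boundBy"


@dataclass
class Dataset:
    """Toy in-memory dataset: one default graph plus named graphs."""
    default: Set[Triple] = field(default_factory=set)
    named: Dict[str, Set[Triple]] = field(default_factory=dict)

    def bind(self, bound: str, binder: str) -> None:
        # The binding is asserted as a triple in the default graph.
        self.default.add((bound, BOUND_BY, binder))

    def bound_names(self) -> Set[str]:
        return {s for (s, p, _) in self.default if p == BOUND_BY}

    def union_default_graph(self) -> Set[Triple]:
        # Rule 1: bound named graphs are left out of the union default graph.
        bound = self.bound_names()
        union = set(self.default)
        for name, graph in self.named.items():
            if name not in bound:
                union |= graph
        return union

    def delete_graph(self, name: str) -> None:
        # Rule 2: deleting a graph also deletes every graph bound by it.
        self.named.pop(name, None)
        dependents = [s for (s, p, o) in self.default
                      if p == BOUND_BY and o == name]
        # Drop binding triples that mention the deleted graph on either side.
        self.default = {t for t in self.default
                        if not (t[1] == BOUND_BY and name in (t[0], t[2]))}
        for dependent in dependents:
            self.delete_graph(dependent)
```

In this reading, a graph `_:ann` bound by `ex:doc` never contributes triples to the union default graph, and deleting (or replacing) `ex:doc` removes `_:ann` together with the binding triple.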

> This is only a minimal and *partial* control of scopes of graphs in
> datasets. Notably, whether the named graph is opaque or transparent is
> still dependent on the graph store implementation, and how entailment
> is controlled there. (Assuming a graph with no or only simple
> entailment counts as "opaque".) We could define rules for this too, if
> required, such as:
> - More means of controlling which graphs are in the union default graph;
> - Which entailment regime is used for all or certain graphs.
> 
> These choices *could* be controlled by the description of a named
> graph (via type or an rdfsx:entailmentRegime; cf.
> sd:entailmentRegime). Such descriptions need to be in the default
> graph, or in the binding named graph for bound graphs.
> 
> (I've also suggested extending this proposal to include an entailment
> extension for deriving reified statements from named graphs. It is not
> critical, and may only be required for existing RDF 1.1 systems (due
> to their potential inability to control what goes into the union
> default graph). It would be valuable to consider for clarifying the
> relationship of these two techniques for triple provenance.)

So this is all conventions on graph naming and on where to store the "binding" information that controls visibility, opacity, etc., right? That might be a tad "weak". I assume anything "stronger" would be implemented the same way (statements describing visibility and semantics, stored in specific locations), just not visible to the user. Still, keeping such management information out of the reach of users seems safer.

> I don't think this is too far from what the RDFn proposal achieves,
> differing in not introducing quints,

[ I learned from native speakers that it’s "quin", not "quint", although I liked the latter much better ]

> but reusing named graphs for
> "naming triples" too. And multiple triples can of course belong to one
> bound named graph.


> Unlike Dydra's nested graphs, these "bound named graphs"  only relate
> the "names", not the graphs themselves. So there is no *real* nesting,
> only association (just as graphs are "flat triples" even when they
> describe trees).

Nesting in Nested Named Graphs (= Dydra’s approach) is a feature of a specific syntax that extends TriG. It can be expressed in N-Quads just as well (and be mapped to N-Triples). I’m not yet "officially" announcing it because we’re still working on it, but check out https://github.com/rat10/nng for the latest description of Nested Named Graphs.

> So everything from N-quads to SPARQL works as before
> (but strictly adhering to the above explicit rules for what goes into
> the union default graph).

Same for Nested Named Graphs. But we do introduce graph literals to make configurable semantics completely safe and backwards compatible. As I said above, behind the scenes that can be implemented via special system graphs, just hidden from the user and therefore safer than mere conventions.

Also, the nesting syntax is very powerful, but it needs some extra support in SPARQL to harness that power when querying. That’s what we are working on right now.

> I've also proposed an annotation syntax as a shorthand for this, to
> also allow for putting statements (asserted or "commented out") into
> bound named graphs; which can be further described (for provenance, or
> additional, informal qualification "marginalia" (nothing affecting
> monotonicity)). This is "ergonomics", not central, but IMHO rather
> critical for uptake and general comprehension.

Sorry, but can you link again to where you described that?

(And yes, IMO ergonomics is critical too!)

> (Notably, I also don't think the << s p o >> syntax is needed; and as
> this proposal is all about "tokens", so that form doesn't really fit
> with the design. It *could* be worked in one way or another, if
> needed.)
> 
> Cheers,
> Niklas

Best,
Thomas
Received on Sunday, 19 November 2023 21:32:24 UTC