Summary of my proposal for using named graphs

My proposal is about building upon named graphs (all kinds; named by
IRIs or bnode ids). It hinges upon two new formal rules for graph
store management.

Named graphs are the association of a graph with any kind of resource,
within a dataset. This association is undefined today, and could do
with a *weak* formal definition, along the lines of: A resource
"names" an RDF graph within the context of a given dataset. This means
that the graph is *some kind of* representation of the message or
meaning of this resource, within this dataset. It could be the graph
snapshot itself, or a very indirect form of association, such as a
document, event or even a person. The only requirement is that,
*within this dataset context*, this resource has no other meaningful
graph interpretation.

Then define *bound*, or *owned* named graphs as: a named graph which
is rdfx:boundBy another named graph, *within this dataset*. This is
done through an assertion of such a relationship in the default graph.
(Exact name of this relation to be discussed of course. I recently
called it rdfx:namedBy, sys:quoteFrom, and generally an "appendix"
relation. I think "bound" or "owned" is more strict.)

For these bound named graphs, define two rules for graph stores:
- A bound named graph is *not* asserted in the union default graph.
- A bound named graph is deleted when its binding named graph is
deleted. (This also happens when the binding graph is updated; i.e.
*replaced*.)
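
To make the two rules concrete, here is a minimal sketch of a toy
in-memory graph store in Python. It is not any real store's API: the
Dataset class and its method names are made up for illustration, and
rdfx:boundBy is just the placeholder name used above. It also assumes
that deletion cascades transitively and that the binding assertion is
removed from the default graph along with the binding graph; neither
is settled by the rules as stated.

```python
BOUND_BY = "rdfx:boundBy"  # placeholder; the exact relation name is still open


class Dataset:
    """Toy graph store; triples are plain (s, p, o) tuples of strings."""

    def __init__(self):
        self.default = set()  # triples asserted directly in the default graph
        self.named = {}       # graph name -> set of triples

    def bound_names(self):
        """Names declared as bound by an assertion in the default graph."""
        return {s for (s, p, o) in self.default if p == BOUND_BY}

    def union_default_graph(self):
        """Rule 1: bound named graphs are not part of the union default graph."""
        bound = self.bound_names()
        union = set(self.default)
        for name, triples in self.named.items():
            if name not in bound:
                union |= triples
        return union

    def delete_graph(self, name):
        """Rule 2: deleting a graph also deletes the graphs bound by it."""
        self.named.pop(name, None)
        bindings = {t for t in self.default if t[1] == BOUND_BY and t[2] == name}
        # Assumption: the binding assertions go away with the binding graph.
        # Removing them before recursing also keeps cyclic bindings from looping.
        self.default -= bindings
        for (bound, _, _) in bindings:
            self.delete_graph(bound)  # assumption: the cascade is transitive

    def replace_graph(self, name, triples):
        """An update is a replacement, so bound graphs are dropped first."""
        self.delete_graph(name)
        self.named[name] = set(triples)


ds = Dataset()
ds.named["ex:g1"] = {("ex:alice", "ex:knows", "ex:bob")}
ds.named["ex:g2"] = {("ex:bob", "ex:knows", "ex:carol")}
ds.default.add(("ex:g2", BOUND_BY, "ex:g1"))  # ex:g2 is bound by ex:g1

union = ds.union_default_graph()  # has the ex:g1 triple, not the ex:g2 one
ds.delete_graph("ex:g1")          # ex:g2 is removed as well
```

Note that the binding lives only in the default graph, as a relation
between the two names; the graphs themselves stay flat.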

This is only a minimal and *partial* control of scopes of graphs in
datasets. Notably, whether the named graph is opaque or transparent is
still dependent on the graph store implementation, and how entailment
is controlled there. (Assuming a graph with no or only simple
entailment counts as "opaque".) We could define rules for this too, if
required, such as:
- More means of controlling which graphs are in the union default graph;
- Which entailment regime is used for all or certain graphs.

These choices *could* be controlled by the description of a named
graph (via type or an rdfsx:entailmentRegime; cf.
sd:entailmentRegime). Such descriptions need to be in the default
graph, or in the binding named graph for bound graphs.
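
As a rough illustration of where such a description would be looked
up, here is a small self-contained Python sketch. The function name
and the plain-tuple data layout are made up for this example, and
rdfsx:entailmentRegime and rdfx:boundBy are only the placeholder names
used in this mail. It assumes the default graph is consulted before
the binding graph, and that nothing is decided when no description is
found (it returns None and leaves the choice to the store).

```python
BOUND_BY = "rdfx:boundBy"            # placeholder relation name
REGIME = "rdfsx:entailmentRegime"    # placeholder property name


def entailment_regime(default, named, name):
    """Find the regime declared for the named graph `name`, if any.

    `default` is a set of (s, p, o) tuples; `named` maps graph names to
    such sets. Descriptions are looked up in the default graph and, for
    a bound graph, in the graph(s) binding it.
    """
    sources = [default]
    for (s, p, o) in default:
        if s == name and p == BOUND_BY:
            sources.append(named.get(o, set()))
    for graph in sources:
        for (s, p, o) in graph:
            if s == name and p == REGIME:
                return o
    return None  # nothing declared; the store decides


default = {
    ("ex:g2", BOUND_BY, "ex:g1"),
    ("ex:g1", REGIME, "ex:RDFS"),
}
named = {
    "ex:g1": {("ex:g2", REGIME, "ex:Simple")},  # ex:g1 describes its bound graph
    "ex:g2": {("ex:bob", "ex:knows", "ex:carol")},
}

regime_g1 = entailment_regime(default, named, "ex:g1")
regime_g2 = entailment_regime(default, named, "ex:g2")
```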

(I've also suggested extending this proposal to include an entailment
extension for deriving reified statements from named graphs. It is not
critical, and may only be required for existing RDF 1.1 systems (due
to their potential inability to control what goes into the union
default graph). It would be valuable to consider for clarifying the
relationship of these two techniques for triple provenance.)

I don't think this is too far from what the RDFn proposal achieves,
differing in not introducing quints, but reusing named graphs for
"naming triples" too. And multiple triples can of course belong to one
bound named graph.

Unlike Dydra's nested graphs, these "bound named graphs" only relate
the "names", not the graphs themselves. So there is no *real* nesting,
only association (just as graphs are "flat triples" even when they
describe trees). So everything from N-quads to SPARQL works as before
(but strictly adhering to the above explicit rules for what goes into
the union default graph).

I've also proposed an annotation syntax as a shorthand for this, which
also allows for putting statements (asserted or "commented out") into
bound named graphs. These graphs can then be further described, for
provenance or for additional, informal qualification ("marginalia";
nothing affecting monotonicity). This is "ergonomics", not central,
but IMHO rather critical for uptake and general comprehension.

(Notably, I also don't think the << s p o >> syntax is needed. As
this proposal is all about "tokens", that form doesn't really fit
with the design. It *could* be worked in one way or another, if
needed.)

Cheers,
Niklas

Received on Thursday, 16 November 2023 16:55:24 UTC