Re: Feedback on RDF Graphs: Conceptual Role and Practical Use Cases from Pierre-Antoine Champin on 2025-09-30 (semantic-web@w3.org from September 2025)

From: Pierre-Antoine Champin <pierre-antoine@w3.org>
Date: Tue, 30 Sep 2025 08:56:25 +0200
To: danbri@gmail.com
Cc: Filip Kolarik <filip26@gmail.com>, semantic-web@w3.org
Message-ID: <9b7fd87f-6ed4-4489-a04b-45d015dc262f@w3.org>
On 29/09/2025 16:54, Dan Brickley wrote:
>
> On Thu, 25 Sept 2025 at 08:20, Pierre-Antoine Champin 
> <pierre-antoine@w3.org> wrote:
>
>     Hi Filip,
>
>     On 19/09/2025 22:50, Filip Kolarik wrote:
>>     Dear Semantic Web Community,
>>     I’m seeking feedback on the conceptual and practical aspects of
>>     RDF graphs.
>>
>>     In RDF 1.2, an RDF graph is defined as: "An RDF graph is the
>>     conjunction (logical AND) of all the claims made by its asserted
>>     triples." This definition captures the logical aggregation of
>>     triples, but it leaves open questions about how graphs are used
>>     in practice.
>     Indeed. RDF Semantics is only defined for a given graph. How you
>     construct that graph (e.g. by picking and aggregating different
>     RDF resources based on your own criteria) is out of scope of the
>     specification -- even though that's also a bunch of interesting
>     questions :)
>>
>>     I would appreciate the community’s insights on questions such as:
>>       * How do you interpret the role of graphs?
>>       * Are graphs primarily conceptual constructs to organize
>>     triples, or are they treated as concrete, addressable units in
>>     practice?
>>       * Do you see named graphs as a way to scope statements, manage
>>     provenance, or isolate data for processing, while the “default
>>     graph” serves a different purpose?
>
>     To be clear: named graphs and datasets are defined
>     <https://www.w3.org/TR/rdf11-concepts/#section-dataset> in RDF's
>     abstract syntax, but are not covered by RDF Semantics. The reason
>     is that, back in 2014 when RDF 1.1 was specified, datasets were
>     already largely deployed, and used in many different ways
>     (including the ones you list above). The working group at the time
>     considered that it could not decide on a specific semantics for
>     datasets and named graphs without breaking many people's
>     implementations... That's the reason of this status quo.
>
>
> It doesn't seem a giant problem. We probably have enough experience 
> now to characterise 2, 3 or however many common patterns for using 
> named graphs in RDF applications and platforms.
Indeed, and that was also published by the RDF 1.1 WG (although 
non-normatively): https://www.w3.org/TR/rdf11-datasets/
> Metadata about the way a particular repository manages and names its 
> graphs should be fairly straightforward to describe in RDF. Any 
> standardization could be funneled into that kind of descriptive role.
The WG has not (yet?) explored that path, but I agree.
>
>     So yes, you can use named graphs for all of these things, just
>     remember that this will not be broadly interoperable. In other
>     words, if you send your dataset to someone else, or if you make it
>     available via a SPARQL endpoint, you will need to provide
>     additional out-of-band knowledge explaining what the (custom)
>     semantics of your named graphs is. This may not be an issue in
>     some cases, but it may be in others.
>
>     With RDF 1.2's triple terms, on the other hand, we have a way to
>     address all these use cases /explicitly/ in a single RDF graph:
>     you can describe triple terms (or sets thereof) with dedicated
>     vocabularies (for provenance, or confidence, etc.), and have this
>     knowledge included in your RDF graph, and available for reasoning.
>
>     It does not mean that named graphs will disappear -- most systems
>     using them today will probably continue to do so if that works for
>     them. But triple terms provide an alternative design option for
>     new systems (or for migrating some old ones).
>
>
> Triple terms sound like they address use cases at the level of a 
> particular triple, or perhaps a small bundle of related triples. Named 
> graphs can operate usefully with graphs populated by millions or 
> billions of triples. Is it realistic to use triple terms for the 
> latter too?

I believe it is. Let me explain (with my W3C hat still off -- this does 
not represent the WG's position, only my own ideas):

RDF 1.2 semantics could be used as a foundation for assigning a precise 
semantics to a dataset *if* we also have metadata clarifying the 
relationship between graph names and graphs. *Then* any quad

   S P O G . # in n-quads

could be seen as having the same semantics as

   G X <<( S P O )>> . # in n-triples 1.2

where X is a predicate depending on the metadata associated to the dataset.
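
As a rough illustration of this mapping, here is a minimal Python sketch. It assumes the terms are already serialized as N-Quads/N-Triples strings and does no parsing or validation. The function name and the predicate IRI standing in for X are hypothetical and not taken from any specification.

```python
# Sketch of the conceptual quad -> triple-term mapping described above.
# Terms are passed as already-serialized strings such as
# "<http://example.org/s>"; nothing is parsed or validated.

# Hypothetical predicate standing in for X. A real one would be chosen
# according to the metadata associated with the dataset.
EX_ASSERTS = "<http://example.org/vocab#asserts>"


def quad_to_triple_term(s: str, p: str, o: str, g: str, x: str = EX_ASSERTS) -> str:
    """Rewrite the quad `S P O G .` as the line `G X <<( S P O )>> .`."""
    return f"{g} {x} <<( {s} {p} {o} )>> ."


quad = (
    "<http://example.org/alice>",
    "<http://example.org/knows>",
    "<http://example.org/bob>",
    "<http://example.org/graph1>",
)
line = quad_to_triple_term(*quad)
print(line)
```

Triples in the default graph have no G, so this sketch does not cover them. How they should be treated would also depend on the dataset's metadata.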

I'm not suggesting that the n-triples serialization should be preferred 
to n-quads+metadata, nor that implementations should stop 
representing quads natively and move to embedded triples as above. I see 
this as a conceptual mapping that would allow us to reason with these 
datasets that have the appropriate metadata.

But I still rest my case about /existing/ datasets in the wild:
* The absence of such metadata makes datasets inherently ambiguous.
* People are actually embracing this ambiguity by using named graphs 
any way they see fit, and we should not prevent them.

And no, the WG has no immediate plan to standardize how this kind of 
metadata could be expressed, but any suggestion or incubation work in 
the RDF-Dev Community Group would be welcome ;-)

> Dan
>
>       pa
>
>     PS: this is only my personal position on the subject; this is /not/
>     an official statement from the Working Group
>
>
>>       * How do you decide when to create separate graphs versus
>>     keeping data in a single graph?
>>       * In your experience, does the choice of graph boundaries
>>     affect reasoning, querying, or data integration in practical
>>     applications? For instance, do you treat multiple graphs as
>>     separate units, or are there scenarios where it’s helpful to
>>     merge graphs and process a subject’s properties across them?
>>
>>     Any references, examples, or experiences you can share would be
>>     extremely valuable in understanding the balance between the
>>     conceptual model and its practical applications.
>>
>>     Thank you for your time and expertise.
>>
>>     Best regards,
>>     Filip
>>     https://www.linkedin.com/in/filipkolarik/
>
Received on Tuesday, 30 September 2025 06:56:27 UTC