Re: Feedback on RDF Graphs: Conceptual Role and Practical Use Cases from David Booth on 2025-09-30 (semantic-web@w3.org from September 2025)

From: David Booth <david@dbooth.org>
Date: Tue, 30 Sep 2025 11:01:38 -0400
To: semantic-web@w3.org
Message-ID: <bf1ff320-2859-4dbb-9143-b7ba293a6951@dbooth.org>

Filip asks:

 > Would others here be interested in working together on documenting
 > such practices (perhaps as a Community Group note)? I’d be glad to help
 > contribute to that effort if there’s interest.

I would at least be interested in following such work, and possibly 
contributing a little.

David

On 9/30/25 08:14, Filip Kolarik wrote:
> On Tue, Sep 30, 2025 at 8:56 AM Pierre-Antoine Champin
> <pierre-antoine@w3.org> wrote:
> 
>     But still I rest my case about /existing/ datasets in the wild:
> 
>     * The absence of such metadata makes datasets inherently ambiguous.
>     * People are actually embracing this ambiguity by using named graphs
>     any way they see fit, and we should not prevent them.
> 
>     And no, the WG has no immediate plan to standardize how this kind of
>     metadata could be expressed, but any suggestion or incubation work
>     in the RDF-Dev Community Group would be welcome ;-)
> 
> 
> Thank you for the inputs. I’ve shared a similar post elsewhere and the 
> most common response was simply to cite the generic definitions. What I 
> found most insightful here:
> 
>   * Graphs as semantic groupings: a way to group statements and
>     attribute them with an identifier or other metadata. Triple terms
>     provide similar functionality, but at a finer granularity.
>   * Graphs/datasets as processing units: this distinction might help to
>     decide when to use graphs versus triple terms.
> 
> RDF gives us great expressivity, but this comes at a cost: its 
> generality and high-level definitions can easily overcomplicate 
> processing and add complexity to reasoning and understanding. This seems 
> somewhat at odds with the original vision of the Semantic Web, which is 
> to make data integration and reasoning easier, not harder.
> 
> There is unlikely to be a single “right” approach, but from what I see 
> there are distinct categories of use cases that would benefit from going 
> beyond the generic definition of a graph, toward clearer best practices 
> and shared conventions.
> 
> Some perspectives where the differences between graphs, named graphs, 
> and triple terms matter might include:
> 
>   * Processing
>      - Document-oriented: smaller, curated datasets, often self-contained.
>      - Big-data: large, heterogeneous datasets where partitioning and
>        provenance are critical.
>   * Provenance and trust
>      - Tracking the origin of statements (datasets from multiple
>        contributors, trust boundaries, licensing).
>      - Distinguishing between authoritative and third-party data.
>   * Data management
>      - Efficient partitioning and indexing for very large graphs.
>      - Isolation of subsets of data for domain-specific reasoning or
>        processing.
>   * Interoperability
>      - Metadata standards could help reduce ambiguity.
> 
> Would others here be interested in working together on documenting
> such practices (perhaps as a Community Group note)? I’d be glad to help
> contribute to that effort if there’s interest.
> 
> Best,
> Filip
>

Received on Tuesday, 30 September 2025 15:01:42 UTC