- From: David Booth <david@dbooth.org>
- Date: Tue, 30 Sep 2025 11:01:38 -0400
- To: semantic-web@w3.org
Filip asks:
> Would others here be interested in working together on documenting
> such practices (perhaps as a Community Group note)? I’d be glad to help
> contribute to that effort if there’s interest.

I would at least be interested in following such work, and possibly
contributing a little.

David

On 9/30/25 08:14, Filip Kolarik wrote:
> On Tue, Sep 30, 2025 at 8:56 AM Pierre-Antoine Champin
> <pierre-antoine@w3.org> wrote:
>
>     But still I rest my case about /existing/ datasets in the wild:
>
>       * The absence of such metadata makes datasets inherently ambiguous.
>       * People are actually embracing this ambiguity by using named graphs
>         any way they see fit, and we should not prevent them.
>
>     And no, the WG has no immediate plan to standardize how this kind of
>     metadata could be expressed, but any suggestion or incubation work
>     in the RDF-Dev Community Group would be welcome ;-)
>
>
> Thank you for the inputs. I’ve shared a similar post elsewhere and the
> most common response was simply to cite the generic definitions. What I
> found most insightful here:
>
>   * Graphs as semantic groupings: a way to group statements and
>     attribute them with an identifier or other metadata. Triple terms
>     provide similar functionality, but at a finer granularity.
>   * Graphs/Datasets as processing units: this distinction might help to
>     decide when to use graphs versus triple terms.
>
> RDF gives us great expressivity, but this comes at a cost: its
> generality and high-level definitions can easily overcomplicate
> processing and add complexity to reasoning and understanding. This seems
> somewhat at odds with the original vision of the Semantic Web, which is
> to make data integration and reasoning easier, not harder.
>
> There is unlikely to be a single “right” approach, but from what I see
> there are distinct categories of use cases that would benefit from going
> beyond the generic definition of a graph, toward clearer best practices
> and shared conventions.
>
> Some perspectives where the differences between graphs, named graphs,
> and triple terms matter might include:
>
>   * Processing
>       - Document-oriented: smaller, curated datasets, often self-contained.
>       - Big-data: large, heterogeneous datasets where partitioning and
>         provenance are critical.
>   * Provenance and trust
>       - Tracking the origin of statements (datasets from multiple
>         contributors, trust boundaries, licensing).
>       - Distinguishing between authoritative and third-party data.
>   * Data management
>       - Efficient partitioning and indexing for very large graphs.
>       - Isolation of subsets of data for domain-specific reasoning or
>         processing.
>   * Interoperability
>       - Metadata standards could help reduce ambiguity.
>
> Would others here be interested in working together on documenting
> such practices (perhaps as a Community Group note)? I’d be glad to help
> contribute to that effort if there’s interest.
>
> Best,
> Filip
>
Received on Tuesday, 30 September 2025 15:01:42 UTC