- From: Niklas Lindström <lindstream@gmail.com>
- Date: Sun, 12 Nov 2023 16:00:57 +0100
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- Cc: public-rdf-star-wg@w3.org
On Sun, Nov 12, 2023 at 1:27 PM Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>
> As far as I can tell, although it is well disguised, option 2 is the way the
> working group was progressing. This option would probably be very close or
> identical to existing RDF-star implementations.
>
> In my opinion for the working group to take up any other option requires
> either finishing option 2 or making a determination that quoted triples as in
> https://www.w3.org/TR/2023/WD-rdf12-concepts-20231013/#section-triples are
> fundamentally flawed.

It is certainly a fundamental change to the 24 year old substrate and abstract syntax of RDF.

It has been presented as "better reification", while at the same time, by design, breaking from what reification is explicitly defined to cater for [1]:

    The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object. This supports use cases where properties such as dates of composition or provenance information are applied to the reified triple, which are meaningful only when thought of as referring to a particular instance or token of a triple.

Significantly, this break from that design has also made examples and use cases suffer ("the seminal example" problem [2]), in that simple usage works as is, but has to introduce some kind of relation to a token occurrence (commonly using a custom property and a blank node). Yet there is no explanation on why reification is a worse design, nor how the two are supposed to be used in conjunction.

The proposed design also uses opacity by default, but does not in any way relate that to the various uses of named graphs, which can -- albeit not normatively so (since they have no defined semantics) -- be used for this same purpose of opaque quotation.
There are many forms of quotation [3], which in being a paradigmatic opaque context is reasonably more related to sets of triples, i.e. graphs, and combined usage thereof. Unless graphs are now supposed to be explicitly declared sets of triple terms? That is certainly not part of any proposal (probably for a good reason: we already have graphs).

I would like a clarification of what "victory" means here. Victory for the CG report? For the implementers of that? For RDF library and tool maintainers in general? For the users of RDF past, present and future?

We have these two existing options, reification and named graphs, which all use cases seem to be able to utilize, but which could do with better syntax and more clarification in the specs. We have an obligation (at least to the wider RDF community) to see if this is a more reasonable path than adding something complex to the core of RDF. And if it is not, we must clarify *why* not.

I've spoken to several people, both new and seasoned users of RDF, in academia and in various production environments (many of whom also work with training people in using RDF). I have only heard concerns that adding a new triple term, being a recursively defined structure, makes RDF more complex: harder to understand, harder to develop best practices for. Conversely, and crucially, I haven't heard anyone saying that triple terms make their use cases simpler.

In certain cases (e.g. those only supporting the "PG form") the *syntax* of RDF-star annotations (as asserted plus quoted or reified) has looked like a promising way to do granular provenance, striking a balance between cumbersome reification (unless in the otherwise cumbersome RDF/XML form) and coarse-grained named graphs. We already know that this syntax too can easily be mapped to either (or both) of those options.

That something has been implemented is not a strong argument (and if it was hard to implement, that is a potential case against it).
Of the implementations of RDF-star some appear partial, e.g. only for the PG form (which at least in the AllegroGraph case also implied using quad multisets [4]). And there are many more cases where it has not been implemented at all (the complete set of libraries, tools and installations of the past 20+ years).

It is a lot easier to add new things than to build and improve upon what is there. It is an entirely other thing to work with that for decades.

Of course I am aware that my own ingrained habits, expectations and assumptions fundamentally affect my beliefs, and thus my comprehension. I am open to the epiphany that "triples all the way down" is the mathematically most pure, simple and effective design for all known use cases, and that it somehow actually makes it simpler to understand RDF than if we e.g. keep clinging to named graphs, or draw circles around "the triple itself" with reification, if it is the triple itself that is needed (and not, I must stress, the triple in *some named graph*, for that quality it does not have without dragging it down into token space).

I have not yet had that epiphany, and I'm still looking for guidance towards that. For instance, are there any use cases that triple terms enable that are entirely novel? Please clarify which ones, and submit them to the collected use cases of the working group [5]. It is the collected cases that we must measure against. (I haven't seen any belief system cases such as "<Mary> :believes << <Jane> :said << <Bob> :knows <Jane> >> >>" there, so if anyone is doing substantive work with such data in RDF, please add that there!)
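
To make the claim about mapping concrete, here is a minimal sketch (not taken from any spec or implementation) of how a nested quoted triple like the belief example above could be expanded into plain RDF 1.1 reification triples. The names `Quoted` and `expand` and the `_:bN` blank node labels are hypothetical, chosen only for illustration; the only vocabulary assumed is the standard rdf:Statement, rdf:subject, rdf:predicate and rdf:object. Terms are kept as plain strings, so no RDF library is involved.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Tuple, Union

Term = Union[str, "Quoted"]
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Quoted:
    """A quoted triple << s p o >>; subject and object may be quoted too."""
    s: Term
    p: str
    o: Term


def expand(s: Term, p: str, o: Term) -> List[Triple]:
    """Rewrite one asserted triple into triples free of quoted terms.

    Assumption of this sketch: every quoted triple occurrence gets its own
    fresh blank node typed rdf:Statement, i.e. it is treated as a token.
    """
    out: List[Triple] = []
    ids = count()

    def reify(term: Term) -> str:
        if not isinstance(term, Quoted):
            return term  # ordinary IRI, literal or blank node label
        subj = reify(term.s)  # inner quoted triples are expanded first
        obj = reify(term.o)
        node = f"_:b{next(ids)}"
        out.extend([
            (node, "rdf:type", "rdf:Statement"),
            (node, "rdf:subject", subj),
            (node, "rdf:predicate", term.p),
            (node, "rdf:object", obj),
        ])
        return node

    # Reification triples are collected first, the asserted triple comes last.
    out.append((reify(s), p, reify(o)))
    return out


triples = expand(
    "<Mary>", ":believes",
    Quoted("<Jane>", ":said", Quoted("<Bob>", ":knows", "<Jane>")),
)
for t in triples:
    print(" ".join(t), ".")
```

The choice of one fresh blank node per occurrence is exactly the type/token decision under discussion; a mapping to named graphs would follow the same recursion but emit quads instead.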

Regards,
Niklas

[1]: https://www.w3.org/TR/rdf11-mt/#reification
[2]: https://w3c.github.io/rdf-star/cg-spec/2021-12-17.html#the-seminal-example
[3]: https://plato.stanford.edu/entries/quotation/
[4]: https://lists.w3.org/Archives/Public/public-rdf-star/2020Aug/0021.html
[5]: https://github.com/w3c/rdf-ucr/

>
> peter
>
>
> On 11/12/23 03:41, Sasaki, Felix wrote:
> > Hi Adrian, Gregg and all,
> >
> > Adrian, thanks a lot for the summary. As somebody relatively new to the
> > working group and not attending the last meeting, I am struggling to
> > understand the impact of the options.
> >
> > How is this topic related to RDF star? How would it influence the role of
> > existing RDF star implementations?
> >
> > https://w3c.github.io/rdf-star/implementations.html
> >
> > Best,
> >
> > Felix
> >
> > *From: *Gregg Kellogg <gregg@greggkellogg.net>
> > *Date: *Saturday, 11 November 2023 at 21:23
> > *To: *Adrian Gschwend <adrian.gschwend@zazuko.com>
> > *Cc: *public-rdf-star-wg@w3.org <public-rdf-star-wg@w3.org>
> > *Subject: *Re: Next weeks discussions and decision-making for RDF Star WG
> >
> > I think it’s great to focus on resolving this fundamental issue. Below are the
> > outlined suggestions with some additional thoughts:
> >
> > 1) Do nothing beyond RDF 1.1, there’s already a reification vocabulary with
> > native support in RDF/XML
> >
> > 1.1) Same as above, but add syntactic sugar to Turtle/TriG/SPARQL for
> > expressing reified statements. This would most naturally involve using a blank
> > node subject, rather than a fragment identifier.
> > Something based on the
> > current quotedTriple syntax ‘<<‘ qtSubject predicate qtObject ‘>>’ could be
> > syntactic sugar for [ a rdf:Statement; rdf:subject qtSubject; rdf:predicate
> > predicate; rdf:object qtObject ]. (Stable identification can be addressed via
> > indirection such as <#frag> rdfx:instanceOf <:a :b :c >).
> >
> > 2) Declare victory using the current tripleTerm resource. Triples are types,
> > and something like rdfx:instanceOf can be used to derive tokens.
> >
> > 3) Leverage RDF 1.1 named graphs with the provision that a blank node graph
> > name used elsewhere as a subject or object “identifies” that graph so named
> > (with some work on what “identifies” means in this context). This is
> > effectively how JSON-LD is used in Verifiable Credentials and elsewhere. A
> > graph inclusion hierarchy, if required, can be derived by following the path
> > from subject/object to graph name/graph.
> >
> > 4) Create a graphTerm resource where graphTerms are first-class terms and are
> > distinct from named graphs. This is probably closest to how Notation3 uses
> > graphs. This is arguably the purest from a logic point of view, but may be
> > more difficult to express in abstract and concrete syntaxes.
> >
> > Frankly, I think any of the single triple use cases can be expressed using
> > any of these paradigms; collections of triples require one of the
> > graph-based solutions. The main issue that gets in the way of settling on this
> > is the Type/Token debate. I think this can be resolved in other ways. This
> > doesn’t attempt to consider transparency/opacity; for blank nodes, I think
> > opacity can be solved at the syntax level, by using identifiers that don’t
> > overlap.
> >
> > Consider the URL http://xmlns.com/foaf/0.1/Person. It could be considered to
> > denote an RDF document containing a vocabulary definition for foaf:Person.
> > It can be
> > considered to be both a type and a token, depending on how it is used. In the
> > context of the vocabulary definition, it is a token against which other
> > properties can be defined:
> >
> > foaf:Person a rdfs:Class; rdfs:label "Person"; …
> >
> > In another context, it is a type:
> >
> > <http://rdfweb.org/people/danbri> a
> > foaf:Person; foaf:name "Dan Brickley" ...
> >
> > Similarly, a graphTerm could be considered to be a type or a token, depending
> > on the context in which it is used. Something like {:a :b :c} a rdf:Graph has
> > the characteristic of a type, while {:a :b :c} ex:containedIn
> > <http://example.com/foo> has the characteristic of a
> > token. We can leverage rdfs:range/domain and explicit type declarations to
> > clarify the intended meaning.
> >
> > Perhaps we can use other explicit or implicit typing to clarify the use cases
> > about when we are identifying a specific statement within a graphTerm or the
> > graph itself as a collection of statements.
> >
> > Regarding the different possibilities outlined above: RDF is a system for
> > describing graphs/datasets composed of triples/statements. IMHO, the
> > fundamental building block should be a graph, so I favor either leveraging
> > named graphs or adding a top-level graphTerm (options 3 and 4 above). I think
> > the impact on implementations, such as quad stores, favors reusing and
> > refining the RDF 1.1 concept of named graphs, but with nuance given to graphs
> > named by blank nodes. This also works as is with N-Quads. Representing
> > graphTerms natively requires some form of syntactic extension (either embedded
> > graphs, or a new space for graph identifiers) as well as defining a graphTerm
> > similar to how we’ve already defined a tripleTerm in the abstract algebra.
> >
> > If the WG is not able to take on the work for describing such use of named
> > graphs, then I would favor doing something more like 1.1: reuse the existing
> > reification vocabulary with syntactic sugar from the quotedTriple production
> > of Turtle/TriG and SPARQL rather than adding a new tripleTerm, which could
> > interfere with future groups taking on the work of better describing the use
> > of graphs as resources. But I’m happy to go along with the consensus of the
> > group, whatever we decide.
> >
> > Gregg Kellogg
> > gregg@greggkellogg.net
> >
> >
> >
> >     On Nov 10, 2023, at 8:42 PM, Adrian Gschwend <adrian.gschwend@zazuko.com>
> >     wrote:
> >
> >     Dear all,
> >
> >     Following our last meeting and discussions on the various proposals, it
> >     has become clear that we need to focus our efforts on choosing a specific
> >     direction for our next steps. To facilitate our decision-making process,
> >     we are asking all members to review the proposals in detail and consider
> >     which one they currently favor. This preliminary decision will help to
> >     make the discussion at our next meetings more structured and productive.
> >
> >     The next meeting is on November 16. As discussed, for once we start at the
> >     normal time but stay one hour longer. Please see the calendar for details;
> >     the event is updated.
> >
> >     regards
> >
> >     Ora & Adrian
> >     --
> >     Adrian Gschwend
> >     CEO Zazuko GmbH, Biel, Switzerland
> >
> >     Phone +41 32 510 60 31
> >     Email adrian.gschwend@zazuko.com
> >
>
Received on Sunday, 12 November 2023 15:01:58 UTC