- From: Olaf Hartig <olaf.hartig@liu.se>
- Date: Tue, 28 Nov 2023 08:15:44 +0000
- To: "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>, "souripriya.das@oracle.com" <souripriya.das@oracle.com>
Hi Souri,

On Mon, 2023-11-27 at 18:19 +0000, Souripriya Das wrote:
> Hi Olaf,
>
> Thanks for your comments. Let me take one comment at a time, just to
> make sure that we are on the same page before moving to the next
> comment.

Makes sense :-)

> I am not sure if you noticed the '+ explicit naming ...' in the
> following. All I was trying to say, talking as a practitioner, is
> that if you extend RDF-star by adding ("+") the idea of "explicit
> naming (using IRIs as custom names)", you can arrive at RDFn.
[...]

While I saw the "+ explicit naming ..." part of that bullet point, due
to the parenthesis in between, I did indeed *not* notice that this was
meant to be a term of the equation formula in that bullet point. Thanks
for the clarification!

> > > > 1) RDFn = RDF-star (which, I think, uses implicit naming in
> > > > some sense, with << s p o >> as the name) + explicit naming
> > > > (using IRIs as custom names).
>
> Please let me know how you feel about the above statement (and
> whether it is simple enough for a practitioner to get the basic
> idea). If we agree on this, we can move to your other comments.

Now that I understood the complete equation illustrated by this bullet
point, I agree that a practitioner may get the idea from it. (Yet, I
would suggest you remove the first parenthesis because it is
distracting.)

Having said that, I still think the (complete) equation in this bullet
point is still incorrect in terms of the details. As you may remember,
regarding the type/token distinction, quoted triples in RDF-star are
considered as types, not as tokens. In contrast, RDFn is about tokens
[2]. As a consequence, one can talk about types (of triples) in
RDF-star but not in RDFn. Therefore, the equation
RDFn = RDF-star + explicit naming cannot be right.

Best,
Olaf

[1] https://plato.stanford.edu/entries/types-tokens/
[2] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Oct/0106.html

> Thanks,
> Souri.
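
The type/token point in the reply above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not a formalization of either proposal: `Triple`, `annotations_by_type`, and `annotations_by_token` are hypothetical names, and the statement names and years are borrowed from Examples 2 and 3 of the RDFn outline quoted further down the thread.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A plain (s, p, o) triple; equality depends only on the components."""
    s: str
    p: str
    o: str


cleveland = Triple("ex:Cleveland", "ex:servedAs", "ex:POTUS")

# Type view (how the reply characterises RDF-star quoted triples): the
# triple itself is the key, so annotations made via two "copies" of the
# same (s, p, o) end up attached to one and the same thing.
annotations_by_type = {}
annotations_by_type.setdefault(cleveland, []).append(("ex:startYear", 1885))
annotations_by_type.setdefault(Triple(*cleveland), []).append(("ex:startYear", 1893))

# Token view (how the reply characterises RDFn): each name denotes its own
# occurrence of the triple, so the two terms of office stay separate.
annotations_by_token = {
    "rdfnAuto:term1": (cleveland, [("ex:startYear", 1885)]),
    "ex:term2": (cleveland, [("ex:startYear", 1893)]),
}


def start_years_by_type(triple):
    """Collect every ex:startYear value attached to the triple type."""
    return sorted(v for k, v in annotations_by_type.get(triple, [])
                  if k == "ex:startYear")


print(len(annotations_by_type), start_years_by_type(cleveland))
print(len(annotations_by_token))
```

In the type view there is exactly one key, and both start years land on it; in the token view there are two keys, but nothing in that dictionary lets one say something about the triple type itself.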
> From: Olaf Hartig <olaf.hartig@liu.se>
> Sent: Monday, November 27, 2023 7:40 AM
> To: tl@rat.io <tl@rat.io>; public-rdf-star-wg@w3.org <public-rdf-star-wg@w3.org>; Souripriya Das <souripriya.das@oracle.com>
> Subject: [External] : Re: An outline of RDFn -- RDF with (auto- and custom-) names
>
> Hi Thomas,
>
> How do you know that RDFn is about tokens? I have not seen Souri making
> any explicit statements in this direction.
>
> Also, it is not correct to say that "both approaches add a fifth
> element to the subject, predicate, object and graph that we already
> have." RDF-star does not add a fifth element. Strictly speaking,
> RDF-star does not even have "graph" as a fourth element--there is no
> notion of a quad in the abstract syntax of RDF-star (and neither is
> there any such notion in the abstract syntax of RDF). Instead, RDF-star
> is about
>   i) triples (which may be nested),
>   ii) graphs as sets of such triples, and
>   iii) datasets as collections of (IRI/bnode, graph) pairs, with an
>        additional graph called the default graph.
> That is all there is in RDF-star. Adding "a fifth element" (as RDFn
> seems to do) requires extending the abstract syntax with additional
> concepts, and that's why "RDFn = RDF-star" is not true.
>
> Olaf
>
>
> On Mon, 2023-11-27 at 11:45 +0100, Thomas Lörtsch wrote:
> > Olaf,
> >
> > you should acknowledge that RDF-star is only defined on types of
> > statements, but actual use cases in their overwhelming majority
> > (including the "seminal example" that you wrongly used in your papers
> > on RDF*) work on tokens. A mechanism to define a reference to such a
> > token is mentioned in the RDF-star CG report only in the most
> > informal way possible.
> >
> > Ergo the RDF-star formalization is irrelevant in practice and for
> > actual practical applications using some derivate of
> > ':occurrenceOf' no formalization exists, not even an informal
> > standard vocabulary - years after the problem has been pointed out to
> > the CG. In that respect RDFn is definitely one or two steps ahead of
> > RDF-star.
> >
> > Also, I think Souri is right in another way: RDFn provides quins out
> > of the box, RDF-star handwavingly resorts to out-of-band means to
> > define a token identifier, but both approaches add a fifth element to
> > the subject, predicate, object and graph that we already have.
> > However, that is a problem with both RDFn and RDF-star, that a named
> > graph based approach can avoid, to considerable benefit of
> > implementors as well as users.
> >
> > Thomas
> >
> > On 27 November 2023 10:34:45 CET, Olaf Hartig <olaf.hartig@liu.se>
> > wrote:
> > > Hi Souri,
> > >
> > > I don't think your claim that "RDFn = RDF-star" is true (assuming
> > > "=" means something like: is the same as).
> > >
> > > In your previous email you introduce the notion of an "RDFn
> > > statement" about which you say the following.
> > >
> > > """
> > > An RDFn statement is uniquely identified using the tuple
> > > <s, p, o, g, n>, where the component n is the "name" of the
> > > statement. (The components s, p, and o represent the subject,
> > > predicate, and object, respectively. The component g, representing
> > > graph name, is non-NULL only for quads and will not be used in the
> > > examples below.)
> > > """
> > >
> > > First of all, notice that you are not explicitly saying what an
> > > RDFn statement actually is; you are only saying how it is uniquely
> > > identified. Moreover, you do not specify what kind of a thing this
> > > "component n" is (neither do you explicitly say what kinds of
> > > things the components s, p, o, and g are, respectively).
> > > Also, I wonder how the notion of "is uniquely identified" would be
> > > captured explicitly as an extension of the abstract syntax of RDF
> > > (or are you proposing to change the abstract syntax such that it is
> > > based on such 5-tuples rather than RDF triples??).
> > >
> > > Now, regarding your claim, your notion of an RDFn statement is not
> > > a concept of RDF-star [1]. Also, among the concepts of RDF-star,
> > > there is no such thing as what you informally call ''the "name" of
> > > the statement,'' and neither is there any notion of NULL in
> > > RDF-star (whereas you seem to assume such a notion for RDFn).
> > >
> > > Best,
> > > Olaf
> > >
> > > [1] https://www.w3.org/2021/12/rdf-star.html#concepts
> > >
> > >
> > > On Mon, 2023-11-27 at 03:08 +0000, Souripriya Das wrote:
> > > > Since I did not hear any comments on RDFn during the first half
> > > > of our last meeting that I was able to attend (except, maybe,
> > > > Gregg might have said something right at the beginning but I had
> > > > audio issues on my side), I thought it may be helpful to mention
> > > > below a few high-level points about RDFn and how it is related to
> > > > RDF-star concepts and syntax: ("statement" here simply means "a
> > > > triple or quad"):
> > > >
> > > > 1) RDFn = RDF-star (which, I think, uses implicit naming in some
> > > > sense, with << s p o >> as the name) + explicit naming (using
> > > > IRIs as custom names).
> > > >
> > > > 2) RDFn (with appropriate syntactic shortcut) would appear
> > > > exactly the same as RDF-star to a user who does not use
> > > > multi-edges or statement-sets.
> > > >
> > > > 3) RDFn does not change anything regarding how users work with
> > > > default graph and named graphs today.
> > > >
> > > > 4) RDFn requires use of explicit naming if user needs to store
> > > > multi-edges. For modeling multi-edges, user does not need to
> > > > introduce new triples or quads with special properties like
> > > > :isOccurrenceOf or :hasOccurrence.
> > > >
> > > > 5) RDFn requires use of explicit naming for modeling
> > > > statement-sets as well. A statement-set in RDFn can include
> > > > (asserted or unasserted) triples from the default graph and the
> > > > named graphs. The custom-name of a statement-set can be used for
> > > > making statements about it.
> > > >
> > > > Thanks,
> > > > Souri.
> > > > From: Souripriya Das <souripriya.das@oracle.com>
> > > > Sent: Wednesday, November 15, 2023 9:39 PM
> > > > To: RDF-star WG <public-rdf-star-wg@w3.org>
> > > > Subject: [External] : An outline of RDFn -- RDF with (auto- and custom-) names
> > > >
> > > > As the group tries to decide on options, the following outline of
> > > > a revised version of RDFn may be useful for discussions.
> > > >
> > > > Core concepts and ideas in RDFn:
> > > >
> > > > An RDFn statement is uniquely identified using the tuple
> > > > <s, p, o, g, n>, where the component n is the "name" of the
> > > > statement. (The components s, p, and o represent the subject,
> > > > predicate, and object, respectively. The component g,
> > > > representing graph name, is non-NULL only for quads and will not
> > > > be used in the examples below.)
> > > >
> > > > Example 1: An RDFn statement, with ex:jSm as its name,
> > > > representing the tuple
> > > > <ex:john, ex:spouseOf, ex:mary, null, ex:jSm>:
> > > > --> ex:john ex:spouseOf ex:mary | ex:jSm .
> > > >
> > > > Based on how its name was created, a statement can belong to one
> > > > of two possible types:
> > > >
> > > >   auto-named: The name n for an auto-named statement
> > > >   <s, p, o, g, n> is computed as rdfnAuto:foo(s, p, o, g), where
> > > >     rdfnAuto is an exclusive namespace used only for names used
> > > >     for auto-named statements, and
> > > >     foo is an implementation-specific function that generates a
> > > >     unique string from the <s, p, o, g> portion of the statement,
> > > >
> > > >   custom-named: The name of a custom-named statement is an IRI
> > > >   that is supplied by the data creator. (The IRI cannot have
> > > >   rdfnAuto as its namespace prefix.)
> > > >
> > > > The name of a statement may be used as subject or object of other
> > > > statements as long as there is no direct or indirect
> > > > self-recursion involving the name (e.g., <n, p, o, g, n> is not
> > > > allowed because n has to be computed using n).
> > > >
> > > > Example 2: Adding statements about an auto-named statement (using
> > > > placeholder for the auto-generated name):
> > > > --> ex:Cleveland ex:servedAs ex:POTUS | rdfnAuto:term1 .
> > > > --> rdfnAuto:term1 ex:startYear 1885 ; ex:endYear 1889 .
> > > >
> > > > Example 3: Adding statements about a custom-named statement:
> > > > --> ex:Cleveland ex:servedAs ex:POTUS | ex:term2 .
> > > > --> ex:term2 ex:startYear 1893 ; ex:endYear 1897 .
> > > >
> > > > Core concepts and ideas in SPARQLn:
> > > >
> > > > A new filter isAuto(<name>) is introduced to allow distinguishing
> > > > between auto-named and custom-named statements. If this filter is
> > > > not used, all statements will qualify, regardless of whether
> > > > auto-named or custom-named, provided they match regular SPARQL
> > > > criteria.
> > > >
> > > > Example 4: The following query returns ?cnt = 2 if the data about
> > > > both of President Cleveland's terms (from Example 2 and Example 3
> > > > above) are present in the RDF dataset:
> > > > --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS }
> > > >
> > > > Example 5: The following query returns ?cnt = 1 due to the
> > > > presence of the isAuto() filter:
> > > > --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS | ?n . FILTER ( isAuto(?n) ) }
> > > >
> > > > Example 6: The following query returns ?minStartYr = 1885,
> > > > ?maxEndYr = 1897:
> > > > --> SELECT (min(?startYr) as ?minStartYr) (max(?endYr) as ?maxEndYr)
> > > >     { ?s ex:servedAs ex:POTUS | ?n .
> > > >       ?n ex:startYear ?startYr ; ex:endYear ?endYr }
> > > >
> > > > A custom-named statement is considered as unasserted unless an
> > > > auto-named statement exists with the same <s, p, o, g>. This has
> > > > implications in SPARQL query processing. A new triple-pattern
> > > > format, that uses the << ... >> enclosure, is introduced in
> > > > SPARQL to indicate whether matching with unasserted statements is
> > > > allowed.
> > > >
> > > > Example 7: Consider the following data that consists of just a
> > > > single custom-named statement. Since there is no auto-named
> > > > statement with <s, p, o, g> as
> > > > <ex:bob, ex:fatherOf, ex:john, null> present, the custom-named
> > > > statement is considered as unasserted. The first query below is
> > > > looking for match with asserted statements only and hence will
> > > > return no results. The second query on the other hand is open to
> > > > considering unasserted statements as well (due to the use of the
> > > > << ... >> enclosure for the triple-pattern) and will return the
> > > > result: ?dad = ex:bob, ?kid = ex:john.
> > > > DATA:
> > > > --> ex:bob ex:fatherOf ex:john | ex:cname1 .
> > > > QUERY 1:
> > > > --> SELECT ?dad ?kid { ?dad ex:fatherOf ?kid }
> > > > QUERY 2:
> > > > --> SELECT ?dad ?kid { << ?dad ex:fatherOf ?kid >> }
> > > >
> > > > A few other relevant points:
> > > >
> > > > For cross-system sharing of query results, include a list
> > > > containing <s, p, o, g, n> for each auto-generated name n that is
> > > > (directly or indirectly) included in the result: This is
> > > > necessary due to the fact that triplestores have full autonomy
> > > > for implementing the function foo used for generating auto-names
> > > > and therefore, given the same <s, p, o, g>, two different
> > > > triplestores could generate two different auto-names. Hence, the
> > > > recipient needs to know the <s, p, o, g> corresponding to each
> > > > auto-name returned (or indirectly involved) in the result to
> > > > generate the appropriate auto-name for its local use.
> > > >
> > > > Statement-Set: This can be done by having multiple distinct
> > > > <s, p, o, g> share the same custom-name. While the advantage over
> > > > named graphs is that statements from distinct graphs (or default
> > > > graph) can form a group, a disadvantage would be that auto-named
> > > > statements cannot be part of a (non-singleton) statement-set.
> > > >
> > > > Ref. Transparency vs. Opacity: The current idea of "opaque by
> > > > default and transparent in case TEPs are involved" would work
> > > > fine for RDFn too.
> > > >
> > > > Based on the above outline, I'd argue that use of RDFn to support
> > > > the desired extensions to RDF would also satisfy some of the
> > > > practical constraints that are critical for adoption by
> > > > enterprise, specifically:
> > > >
> > > >   full backward-compatibility for RDF1.1 data (each RDF1.1
> > > >   statement becomes an auto-named (asserted) statement in RDFn)
> > > >
> > > >   continued validity of pre-existing SPARQL1.1 queries even as
> > > >   data evolves to include more expressive content by taking
> > > >   advantage of new capabilities to include statements about
> > > >   statements and multi-edges
> > > >
> > > >   minimization of the custom naming burden on the user because
> > > >   custom names are needed only for those cases where multi-edges
> > > >   or (non-singleton) statement-sets are involved
> > > >
> > > > Thanks,
> > > > Souri.
> > > >
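
The naming and assertion rules in the RDFn outline quoted above can be sketched in Python to check Examples 4, 5, and 7 by hand. This is a rough sketch under assumptions, not part of the proposal: the outline leaves the function foo implementation-specific, so the SHA-256-based `auto_name` below is only one hypothetical choice, and `Statement`, `is_asserted`, and `match` are illustrative names that do not come from the thread. The rule against self-recursive names is not modelled.

```python
import hashlib
from typing import NamedTuple, Optional

AUTO_NS = "rdfnAuto:"  # namespace reserved for auto-generated names


class Statement(NamedTuple):
    s: str
    p: str
    o: str
    g: Optional[str]  # graph name; None for triples
    n: str            # statement name


def auto_name(s, p, o, g=None):
    """Hypothetical stand-in for foo: derive a name from <s, p, o, g>."""
    key = "\x1f".join([s, p, o, g or ""]).encode("utf-8")
    return AUTO_NS + hashlib.sha256(key).hexdigest()[:16]


def is_auto(name):
    return name.startswith(AUTO_NS)


def auto_statement(s, p, o, g=None):
    return Statement(s, p, o, g, auto_name(s, p, o, g))


def custom_statement(s, p, o, n, g=None):
    if is_auto(n):
        raise ValueError("custom names must not use the rdfnAuto namespace")
    return Statement(s, p, o, g, n)


def is_asserted(stmt, data):
    """A custom-named statement counts as asserted only if an auto-named
    statement with the same <s, p, o, g> is present in the data."""
    if is_auto(stmt.n):
        return True
    return any(is_auto(x.n) and x[:4] == stmt[:4] for x in data)


def match(data, p, include_unasserted=False):
    """Plain pattern by default; include_unasserted mimics << ... >>."""
    return [st for st in data
            if st.p == p and (include_unasserted or is_asserted(st, data))]


# Data from Examples 2 and 3: one auto-named and one custom-named statement.
terms = [
    auto_statement("ex:Cleveland", "ex:servedAs", "ex:POTUS"),
    custom_statement("ex:Cleveland", "ex:servedAs", "ex:POTUS", "ex:term2"),
]
cnt_all = len(match(terms, "ex:servedAs"))                               # Example 4
cnt_auto = len([st for st in match(terms, "ex:servedAs") if is_auto(st.n)])  # Example 5

# Data from Example 7: a single custom-named statement.
family = [custom_statement("ex:bob", "ex:fatherOf", "ex:john", "ex:cname1")]
query1 = match(family, "ex:fatherOf")
query2 = match(family, "ex:fatherOf", include_unasserted=True)
```

Because `auto_name` is a free choice here, two stores using different functions would produce different names for the same <s, p, o, g>, which is the situation the outline's note on cross-system sharing of query results is meant to address.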
Received on Tuesday, 28 November 2023 08:15:53 UTC