- From: Thomas Lörtsch <tl@rat.io>
- Date: Mon, 27 Nov 2023 11:45:42 +0100
- To: public-rdf-star-wg@w3.org, Olaf Hartig <olaf.hartig@liu.se>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>, "souripriya.das@oracle.com" <souripriya.das@oracle.com>
Olaf, you should acknowledge that RDF-star is only defined on types of statements, but actual use cases in their overwhelming majority (including the "seminal example" that you wrongly used in your papers on RDF*) work on tokens. A mechanism to define a reference to such a token is mentioned in the RDF-star CG report only in the most informal way possible. Ergo the RDF-star formalization is irrelevant in practice, and for actual practical applications using some derivative of ':occurrenceOf' no formalization exists, not even an informal standard vocabulary, years after the problem has been pointed out to the CG. In that respect RDFn is definitely one or two steps ahead of RDF-star.

Also, I think Souri is right in another way: RDFn provides quins out of the box, whereas RDF-star handwavingly resorts to out-of-band means to define a token identifier, but both approaches add a fifth element to the subject, predicate, object and graph that we already have. However, that is a problem with both RDFn and RDF-star that a named graph based approach can avoid, to the considerable benefit of implementors as well as users.

Thomas

On 27 November 2023 10:34:45 CET, Olaf Hartig <olaf.hartig@liu.se> wrote:
>Hi Souri,
>
>I don't think your claim that "RDFn = RDF-star" is true (assuming "="
>means something like: is the same as).
>
>In your previous email you introduce the notion of an "RDFn statement"
>about which you say the following.
>
>"""
>An RDFn statement is uniquely identified using the tuple <s, p, o, g,
>n>, where the component n is the "name" of the statement. (The
>components s, p, and o represent the subject, predicate, and object,
>respectively. The component g, representing graph name, is non-NULL
>only for quads and will not be used in the examples below.)
>"""
>
>First of all, notice that you are not explicitly saying what an RDFn
>statement actually is; you are only saying how it is uniquely
>identified.
>Moreover, you do not specify what kind of a thing this
>"component n" is (neither do you explicitly say what kinds of things
>the components s, p, o, and g are, respectively). Also, I wonder how
>the notion of "is uniquely identified" would be captured explicitly as
>an extension of the abstract syntax of RDF (or are you proposing to
>change the abstract syntax such that it is based on such 5-tuples
>rather than RDF triples??).
>
>Now, regarding your claim, your notion of an RDFn statement is not a
>concept of RDF-star [1]. Also, among the concepts of RDF-star, there is
>no such thing as what you informally call ''the "name" of the
>statement,'' and neither is there any notion of NULL in RDF-star
>(whereas you seem to assume such a notion for RDFn).
>
>Best,
>Olaf
>
>[1] https://www.w3.org/2021/12/rdf-star.html#concepts
>
>
>On Mon, 2023-11-27 at 03:08 +0000, Souripriya Das wrote:
>> Since I did not hear any comments on RDFn during the first half of
>> our last meeting that I was able to attend (except, maybe, Gregg
>> might have said something right at the beginning but I had audio
>> issues on my side), I thought it may be helpful to mention below a
>> few high-level points about RDFn and how it is related to RDF-star
>> concepts and syntax: ("statement" here simply means "a triple or
>> quad"):
>>
>> 1) RDFn = RDF-star (which, I think, uses implicit naming in some
>> sense, with << s p o >> as the name) + explicit naming (using IRIs as
>> custom names).
>>
>> 2) RDFn (with appropriate syntactic shortcut) would appear exactly
>> the same as RDF-star to a user who does not use multi-edges or
>> statement-sets.
>>
>> 3) RDFn does not change anything regarding how users work with
>> default graph and named graphs today.
>>
>> 4) RDFn requires use of explicit naming if user needs to store
>> multi-edges. For modeling multi-edges, user does not need to
>> introduce new triples or quads with special properties like
>> :isOccurrenceOf or :hasOccurrence.
>>
>> 5) RDFn requires use of explicit naming for modeling statement-sets
>> as well. A statement-set in RDFn can include (asserted or unasserted)
>> triples from the default graph and the named graphs. The custom-name
>> of a statement-set can be used for making statements about it.
>>
>> Thanks,
>> Souri.
>> From: Souripriya Das <souripriya.das@oracle.com>
>> Sent: Wednesday, November 15, 2023 9:39 PM
>> To: RDF-star WG <public-rdf-star-wg@w3.org>
>> Subject: [External] : An outline of RDFn -- RDF with (auto- and custom-) names
>>
>> As the group tries to decide on options, the following outline of a
>> revised version of RDFn may be useful for discussions.
>>
>> Core concepts and ideas in RDFn:
>> An RDFn statement is uniquely identified using the tuple <s, p, o, g,
>> n>, where the component n is the "name" of the statement. (The
>> components s, p, and o represent the subject, predicate, and object,
>> respectively. The component g, representing graph name, is non-NULL
>> only for quads and will not be used in the examples below.)
>> Example 1: An RDFn statement, with ex:jSm as its name, representing
>> the tuple <ex:john, ex:spouseOf, ex:mary, null, ex:jSm>:
>> --> ex:john ex:spouseOf ex:mary | ex:jSm .
>> Based on how its name was created, a statement can belong to one of
>> two possible types:
>> - auto-named: The name n for an auto-named statement <s, p, o, g, n>
>>   is computed as rdfnAuto:foo(s, p, o, g), where
>>   - rdfnAuto is an exclusive namespace used only for names used for
>>     auto-named statements, and
>>   - foo is an implementation-specific function that generates unique
>>     string from the <s, p, o, g> portion of the statement,
>> - custom-named: The name of a custom-named statement is an IRI that
>>   is supplied by the data creator. (The IRI cannot have rdfnAuto as
>>   its namespace prefix.)
>> The name of a statement may be used as subject or object of other
>> statements as long as there is no direct or indirect self-recursion
>> involving the name (e.g., <n, p, o, g, n> is not allowed because n
>> has to be computed using n).
>> Example 2: Adding statements about an auto-named statement (using
>> placeholder for the auto-generated name):
>> --> ex:Cleveland ex:servedAs ex:POTUS | rdfnAuto:term1 .
>> --> rdfnAuto:term1 ex:startYear 1885 ; ex:endYear 1889 .
>> Example 3: Adding statements about a custom-named statement:
>> --> ex:Cleveland ex:servedAs ex:POTUS | ex:term2 .
>> --> ex:term2 ex:startYear 1893 ; ex:endYear 1897 .
>>
>> Core concepts and ideas in SPARQLn:
>> A new filter isAuto(<name>) is introduced to allow distinguishing
>> between auto-named and custom-named statements. If this filter is not
>> used, all statements will qualify, regardless whether auto-named or
>> custom-named, provided they match regular SPARQL criteria.
>> Example 4: The following query returns the ?cnt = 2 if the data about
>> President Cleveland's both terms (from Example 2 and Example 3 above)
>> are present in the RDF dataset:
>> --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS }
>> Example 5: The following query returns ?cnt=1 due to the presence of
>> the isAuto() filter:
>> --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS | ?n .
>>     FILTER ( isAuto(?n) ) }
>> Example 6: The following query returns ?minStartYr = 1885, ?maxEndYr
>> = 1897:
>> --> SELECT (min(?startYr) as ?minStartYr) (max(?endYr) as ?maxEndYr)
>>     { ?s ex:servedAs ex:POTUS | ?n .
>>       ?n ex:startYear ?startYr ; ex:endYear ?endYr }
>> A custom-named statement is considered as unasserted unless an
>> auto-named statement exists with the same <s, p, o, g>. This has
>> implications in SPARQL query processing. A new triple-pattern format,
>> that uses the << ... >> enclosure, is introduced in SPARQL to
>> indicate whether matching with unasserted statements is allowed.
>> Example 7: Consider the following data that consists of just a single
>> custom-named statement. Since there is no auto-named statement with
>> <s, p, o, g> as <ex:bob, ex:fatherOf, ex:john, null> present, the
>> custom-named statement is considered as unasserted. The first query
>> below is looking for match with asserted statements only and hence
>> will return no results. The second query on the other hand is open to
>> considering unasserted statements as well (due to the use of the
>> << ...>> enclosure for the triple-pattern) and will return the result:
>> ?dad = ex:bob, ?kid = ex:john.
>> DATA:
>> --> ex:bob ex:fatherOf ex:john | ex:cname1 .
>> QUERY 1:
>> --> SELECT ?dad ?kid { ?dad ex:fatherOf ?kid }
>> QUERY 2:
>> --> SELECT ?dad ?kid { << ?dad ex:fatherOf ?kid >> }
>>
>> A few other relevant points:
>> - For cross-system sharing of query results, include a list
>>   containing <s, p, o, g, n> for each auto-generated name n that is
>>   (directly or indirectly) included in the result: This is necessary
>>   due to the fact that triplestores have full autonomy for
>>   implementing the function foo used for generating auto-names and
>>   therefore, given the same <s, p, o, g>, two different triplestores
>>   could generate two different auto-names. Hence, the recipient needs
>>   to know the <s, p, o, g> corresponding to each auto-name returned
>>   (or indirectly involved) in the result to generate the appropriate
>>   auto-name for its local use.
>> - Statement-Set: This can be done by having multiple distinct
>>   <s, p, o, g> share the same custom-name. While the advantage over
>>   named graphs is that statements from distinct graphs (or default
>>   graph) can form a group, a disadvantage would be that auto-named
>>   statements cannot be part of a (non-singleton) statement-set.
>> - Ref. Transparency vs. Opacity: The current idea of "opaque by
>>   default and transparent in case TEPs are involved" would work fine
>>   for RDFn too.
>>
>> Based on the above outline, I'd argue that use of RDFn to support the
>> desired extensions to RDF would also satisfy some of the practical
>> constraints that are critical for adoption by enterprise,
>> specifically:
>> - full backward-compatibility for RDF1.1 data (each RDF1.1 statement
>>   becomes an auto-named (asserted) statement in RDFn)
>> - continued validity of pre-existing SPARQL1.1 queries even as data
>>   evolves to include more expressive content by taking advantage of
>>   new capabilities to include statements about statements and
>>   multi-edges
>> - minimization of the custom naming burden on the user because custom
>>   names are needed only for those cases where multi-edges or
>>   (non-singleton) statement-sets are involved
>>
>> Thanks,
>> Souri.
>>
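
For readers following the thread, the RDFn outline quoted above can be sketched in a few lines of Python. This is only an illustrative reading of that outline, not part of any proposal or implementation: the names Statement, Store, auto_name, is_auto and match are hypothetical, a truncated SHA-256 digest merely stands in for the implementation-specific function foo, and the self-recursion check on names is not modelled. The include_unasserted flag plays the role of the << ... >> enclosure, and auto_only plays the role of the isAuto() filter.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Set

AUTO_NS = "rdfnAuto:"  # namespace reserved for auto-generated names


@dataclass(frozen=True)
class Statement:
    """One RDFn statement <s, p, o, g, n>; g is None for triples."""
    s: str
    p: str
    o: str
    g: Optional[str]
    n: str


def auto_name(s: str, p: str, o: str, g: Optional[str] = None) -> str:
    """Hypothetical stand-in for the implementation-specific function foo."""
    key = "\x1f".join([s, p, o, g or ""])
    return AUTO_NS + hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def is_auto(name: str) -> bool:
    """Rough counterpart of the proposed isAuto() filter."""
    return name.startswith(AUTO_NS)


class Store:
    def __init__(self) -> None:
        self.statements: Set[Statement] = set()

    def add_auto(self, s, p, o, g=None) -> str:
        """Add an auto-named statement and return its generated name."""
        name = auto_name(s, p, o, g)
        self.statements.add(Statement(s, p, o, g, name))
        return name

    def add_custom(self, s, p, o, name, g=None) -> str:
        """Add a custom-named statement; the name must not be in rdfnAuto."""
        if is_auto(name):
            raise ValueError("custom names must not use the rdfnAuto namespace")
        self.statements.add(Statement(s, p, o, g, name))
        return name

    def is_asserted(self, st: Statement) -> bool:
        """Asserted iff an auto-named statement with the same <s, p, o, g> exists."""
        twin = Statement(st.s, st.p, st.o, st.g,
                         auto_name(st.s, st.p, st.o, st.g))
        return twin in self.statements

    def match(self, p, o=None, include_unasserted=False,
              auto_only=False) -> List[Statement]:
        """Return one hit per matching <s, p, o, g, n>."""
        hits = []
        for st in self.statements:
            if st.p != p or (o is not None and st.o != o):
                continue
            if auto_only and not is_auto(st.n):
                continue
            if include_unasserted or self.is_asserted(st):
                hits.append(st)
        return hits


# Data of Examples 2 and 3, queried as in Examples 4 and 5.
terms = Store()
terms.add_auto("ex:Cleveland", "ex:servedAs", "ex:POTUS")
terms.add_custom("ex:Cleveland", "ex:servedAs", "ex:POTUS", "ex:term2")
print(len(terms.match("ex:servedAs", "ex:POTUS")))
print(len(terms.match("ex:servedAs", "ex:POTUS", auto_only=True)))

# Data of Example 7, queried as in QUERY 1 and QUERY 2.
family = Store()
family.add_custom("ex:bob", "ex:fatherOf", "ex:john", "ex:cname1")
print(family.match("ex:fatherOf"))
print(family.match("ex:fatherOf", include_unasserted=True))
```

Under these assumptions the custom-named statement ex:term2 counts as asserted because an auto-named statement with the same <s, p, o, g> exists, while ex:cname1 stays unasserted and is only found when unasserted matches are explicitly allowed. Note that the sketch stores five components per statement, which is the "fifth element" point made at the top of this message.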
Received on Monday, 27 November 2023 10:46:00 UTC