Re: An outline of RDFn -- RDF with (auto- and custom-) names

Hi Souri,

On Mon, 2023-11-27 at 18:19 +0000, Souripriya Das wrote:
> Hi Olaf,
> 
> Thanks for your comments. Let me take one comment at a time, just to
> make sure that we are on the same page before moving to the next
> comment.

Makes sense :-)

> I am not sure if you noticed the '+ explicit naming ...' in the
> following. All I was trying to say, talking as a practitioner, is
> that if you extend RDF-star by adding ("+") the idea of "explicit
> naming (using IRIs as custom names)", you can arrive at RDFn. [...]

While I saw the "+ explicit naming ..." part of that bullet point, the
parenthetical in between meant that I did indeed *not* notice that this
was meant to be a term of the equation in that bullet point. Thanks for
the clarification!

> > > > 1) RDFn = RDF-star (which, I think, uses implicit naming in
> > > > some sense, with << s p o >> as the name) + explicit naming
> > > > (using IRIs as custom names).
> 
> Please let me know how you feel about the above statement (and
> whether it is simple enough for a practitioner to get the basic
> idea). If we agree on this, we can move to your other comments.

Now that I understand the complete equation illustrated by this bullet
point, I agree that a practitioner may get the idea from it. (Yet, I
would suggest you remove the first parenthetical because it is
distracting.)

Having said that, I still think the (complete) equation in this bullet
point is incorrect in terms of the details. As you may remember,
regarding the type/token distinction [1], quoted triples in RDF-star
are considered as types, not as tokens. In contrast, RDFn is about
tokens [2]. As a consequence, one can talk about types (of triples) in
RDF-star but not in RDFn. Therefore, the equation

    RDFn = RDF-star + explicit naming

cannot be right.
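
To make the type/token point concrete, a minimal Python sketch may
help. It is only an illustration and not part of either proposal: the
names Triple, NamedStatement, rdf_star_graph, and rdfn_statements are
hypothetical, and the RDFn side assumes the token reading of [2] with
the <s, p, o, g, n> tuple from your outline.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """RDF-star style triple: identified only by its (s, p, o) value."""
    s: str
    p: str
    o: str


class NamedStatement(NamedTuple):
    """RDFn style statement (assumed shape): the name n is part of the identity."""
    s: str
    p: str
    o: str
    g: Optional[str]
    n: str


# Type view: a graph is a set of triples, so writing the same triple
# twice still yields a single element.
t1 = Triple("ex:Cleveland", "ex:servedAs", "ex:POTUS")
t2 = Triple("ex:Cleveland", "ex:servedAs", "ex:POTUS")
rdf_star_graph = {t1, t2}

# Token view: the same (s, p, o) under two different names gives two
# distinct statements.
st1 = NamedStatement("ex:Cleveland", "ex:servedAs", "ex:POTUS", None, "rdfnAuto:term1")
st2 = NamedStatement("ex:Cleveland", "ex:servedAs", "ex:POTUS", None, "ex:term2")
rdfn_statements = {st1, st2}

print(len(rdf_star_graph))   # 1
print(len(rdfn_statements))  # 2
```

In the first set there is exactly one thing to talk about, the triple
as a type; in the second set there are two things, and neither of them
is the type.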

Best,
Olaf


[1] https://plato.stanford.edu/entries/types-tokens/


[2] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Oct/0106.html



> Thanks,
> Souri.
> From: Olaf Hartig <olaf.hartig@liu.se>
> Sent: Monday, November 27, 2023 7:40 AM
> To: tl@rat.io <tl@rat.io>; public-rdf-star-wg@w3.org <
> public-rdf-star-wg@w3.org>; Souripriya Das <souripriya.das@oracle.com
> >
> Subject: [External] : Re: An outline of RDFn -- RDF with (auto- and
> custom-) names
>  
> Hi Thomas,
> 
> How do you know that RDFn is about tokens? I have not seen Souri
> making any explicit statements in this direction.
> 
> Also, it is not correct to say that "both approaches add a fifth
> element to the subject, predicate, object and graph that we already
> have."  RDF-star does not add a fifth element. Strictly speaking,
> RDF-star does not even have "graph" as a fourth element--there is no
> notion of a quad in the abstract syntax of RDF-star (and neither is
> there any such notion in the abstract syntax of RDF). Instead,
> RDF-star is about
> i) triples (which may be nested),
> ii) graphs as sets of such triples, and
> iii) datasets as collections of (IRI/bnode, graph) pairs, with an
> additional graph called the default graph.
> That is all there is in RDF-star. Adding "a fifth element" (as RDFn
> seems to do) requires extending the abstract syntax with additional
> concepts, and that's why "RDFn = RDF-star" is not true.
> 
> Olaf
> 
> 
> On Mon, 2023-11-27 at 11:45 +0100, Thomas Lörtsch wrote:
> > Olaf,
> >
> > you should acknowledge that RDF-star is only defined on types of
> > statements, but actual use cases in their overwhelming majority
> > (including the "seminal example" that you wrongly used in your
> > papers on RDF*) work on tokens. A mechanism to define a reference
> > to such a token is mentioned in the RDF-star CG report only in the
> > most informal way possible.
> >
> > Ergo the RDF-star formalization is irrelevant in practice and for
> > actual practical applications using some derivative of
> > ':occurrenceOf' no formalization exists, not even an informal
> > standard vocabulary - years after the problem has been pointed out
> > to the CG. In that respect RDFn is definitely one or two steps
> > ahead of RDF-star.
> >
> > Also, I think Souri is right in another way: RDFn provides quins
> > out of the box, RDF-star handwavingly resorts to out-of-band means
> > to define a token identifier, but both approaches add a fifth
> > element to the subject, predicate, object and graph that we already
> > have. However, that is a problem with both RDFn and RDF-star, that
> > a named graph based approach can avoid, to considerable benefit of
> > implementors as well as users.
> >
> > Thomas
> >
> > On 27 November 2023 at 10:34:45 CET, Olaf Hartig
> > <olaf.hartig@liu.se> wrote:
> > > Hi Souri,
> > >
> > > I don't think your claim that "RDFn = RDF-star" is true (assuming
> > > "=" means something like: is the same as).
> > >
> > > In your previous email you introduce the notion of an "RDFn
> > > statement" about which you say the following.
> > >
> > > """
> > > An RDFn statement is uniquely identified using the tuple <s, p, o,
> > > g, n>, where the component n is the "name" of the statement. (The
> > > components s, p, and o represent the subject, predicate, and
> > > object, respectively. The component g, representing graph name, is
> > > non-NULL only for quads and will not be used in the examples
> > > below.)
> > > """
> > >
> > > First of all, notice that you are not explicitly saying what an
> > > RDFn statement actually is; you are only saying how it is uniquely
> > > identified. Moreover, you do not specify what kind of a thing this
> > > "component n" is (neither do you explicitly say what kinds of
> > > things the components s, p, o, and g are, respectively). Also, I
> > > wonder how the notion of "is uniquely identified" would be
> > > captured explicitly as an extension of the abstract syntax of RDF
> > > (or are you proposing to change the abstract syntax such that it
> > > is based on such 5-tuples rather than RDF triples??).
> > >
> > > Now, regarding your claim, your notion of an RDFn statement is not
> > > a concept of RDF-star [1]. Also, among the concepts of RDF-star,
> > > there is no such thing as what you informally call ''the "name" of
> > > the statement,'' and neither is there any notion of NULL in
> > > RDF-star (whereas you seem to assume such a notion for RDFn).
> > >
> > > Best,
> > > Olaf
> > >
> > > [1] https://www.w3.org/2021/12/rdf-star.html#concepts
> > >
> > >
> > > On Mon, 2023-11-27 at 03:08 +0000, Souripriya Das wrote:
> > > > Since I did not hear any comments on RDFn during the first half
> > > > of our last meeting that I was able to attend (except, maybe,
> > > > Gregg might have said something right at the beginning but I had
> > > > audio issues on my side), I thought it may be helpful to mention
> > > > below a few high-level points about RDFn and how it is related
> > > > to RDF-star concepts and syntax ("statement" here simply means
> > > > "a triple or quad"):
> > > >
> > > > 1) RDFn = RDF-star (which, I think, uses implicit naming in some
> > > > sense, with << s p o >> as the name) + explicit naming (using
> > > > IRIs as custom names).
> > > >
> > > > 2) RDFn (with appropriate syntactic shortcut) would appear
> > > > exactly the same as RDF-star to a user who does not use
> > > > multi-edges or statement-sets.
> > > >
> > > > 3) RDFn does not change anything regarding how users work with
> > > > default graph and named graphs today.
> > > >
> > > > 4) RDFn requires use of explicit naming if user needs to store
> > > > multi-edges. For modeling multi-edges, user does not need to
> > > > introduce new triples or quads with special properties like
> > > > :isOccurrenceOf or :hasOccurrence.
> > > >
> > > > 5) RDFn requires use of explicit naming for modeling
> > > > statement-sets as well. A statement-set in RDFn can include
> > > > (asserted or unasserted) triples from the default graph and the
> > > > named graphs. The custom-name of a statement-set can be used for
> > > > making statements about it.
> > > >
> > > > Thanks,
> > > > Souri.
> > > > From: Souripriya Das <souripriya.das@oracle.com>
> > > > Sent: Wednesday, November 15, 2023 9:39 PM
> > > > To: RDF-star WG <public-rdf-star-wg@w3.org>
> > > > Subject: [External] : An outline of RDFn -- RDF with (auto- and
> > > > custom-) names
> > > >
> > > > As the group tries to decide on options, the following outline
> > > > of a revised version of RDFn may be useful for discussions.
> > > >
> > > > Core concepts and ideas in RDFn:
> > > > An RDFn statement is uniquely identified using the tuple <s, p,
> > > > o, g, n>, where the component n is the "name" of the statement.
> > > > (The components s, p, and o represent the subject, predicate,
> > > > and object, respectively. The component g, representing graph
> > > > name, is non-NULL only for quads and will not be used in the
> > > > examples below.)
> > > > Example 1: An RDFn statement, with ex:jSm as its name,
> > > > representing the tuple <ex:john, ex:spouseOf, ex:mary, null,
> > > > ex:jSm>:
> > > > --> ex:john ex:spouseOf ex:mary | ex:jSm .
> > > > Based on how its name was created, a statement can belong to
> > > > one of two possible types:
> > > > auto-named: The name n for an auto-named statement <s, p, o, g,
> > > > n> is computed as rdfnAuto:foo(s, p, o, g), where
> > > > rdfnAuto is an exclusive namespace used only for names used for
> > > > auto-named statements, and
> > > > foo is an implementation-specific function that generates a
> > > > unique string from the <s, p, o, g> portion of the statement,
> > > > custom-named: The name of a custom-named statement is an IRI
> > > > that is supplied by the data creator. (The IRI cannot have
> > > > rdfnAuto as its namespace prefix.)
> > > > The name of a statement may be used as subject or object of
> > > > other statements as long as there is no direct or indirect
> > > > self-recursion involving the name (e.g., <n, p, o, g, n> is not
> > > > allowed because n has to be computed using n).
> > > > Example 2: Adding statements about an auto-named statement
> > > > (using placeholder for the auto-generated name):
> > > > --> ex:Cleveland ex:servedAs ex:POTUS | rdfnAuto:term1 .
> > > > --> rdfnAuto:term1 ex:startYear 1885 ; ex:endYear 1889 .
> > > > Example 3: Adding statements about a custom-named statement:
> > > > --> ex:Cleveland ex:servedAs ex:POTUS | ex:term2 .
> > > > --> ex:term2 ex:startYear 1893 ; ex:endYear 1897 .
> > > > Core concepts and ideas in SPARQLn:
> > > > A new filter isAuto(<name>) is introduced to allow
> > > > distinguishing between auto-named and custom-named statements.
> > > > If this filter is not used, all statements will qualify,
> > > > regardless of whether they are auto-named or custom-named,
> > > > provided they match regular SPARQL criteria.
> > > > Example 4: The following query returns ?cnt = 2 if the data
> > > > about both of President Cleveland's terms (from Example 2 and
> > > > Example 3 above) is present in the RDF dataset:
> > > > --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS }
> > > > Example 5: The following query returns ?cnt = 1 due to the
> > > > presence of the isAuto() filter:
> > > > --> SELECT (count(*) as ?cnt) { ?s ex:servedAs ex:POTUS | ?n .
> > > >         FILTER ( isAuto(?n) ) }
> > > > Example 6: The following query returns ?minStartYr = 1885,
> > > > ?maxEndYr = 1897:
> > > > --> SELECT (min(?startYr) as ?minStartYr)
> > > >            (max(?endYr) as ?maxEndYr)
> > > >         { ?s ex:servedAs ex:POTUS | ?n .
> > > >            ?n ex:startYear ?startYr ; ex:endYear ?endYr }
> > > > A custom-named statement is considered as unasserted unless an
> > > > auto-named statement exists with the same <s, p, o, g>. This has
> > > > implications in SPARQL query processing. A new triple-pattern
> > > > format, that uses the << ... >> enclosure, is introduced in
> > > > SPARQL to indicate whether matching with unasserted statements
> > > > is allowed.
> > > > Example 7: Consider the following data that consists of just a
> > > > single custom-named statement. Since there is no auto-named
> > > > statement with <s, p, o, g> as <ex:bob, ex:fatherOf, ex:john,
> > > > null> present, the custom-named statement is considered as
> > > > unasserted. The first query below is looking for match with
> > > > asserted statements only and hence will return no results. The
> > > > second query on the other hand is open to considering unasserted
> > > > statements as well (due to the use of the << ... >> enclosure
> > > > for the triple-pattern) and will return the result:
> > > > ?dad = ex:bob, ?kid = ex:john.
> > > > DATA:
> > > > --> ex:bob ex:fatherOf ex:john | ex:cname1 .
> > > > QUERY 1:
> > > > --> SELECT ?dad ?kid { ?dad ex:fatherOf ?kid }
> > > > QUERY 2:
> > > > --> SELECT ?dad ?kid { << ?dad ex:fatherOf ?kid >> }
> > > > A few other relevant points:
> > > > For cross-system sharing of query results, include a list
> > > > containing <s, p, o, g, n> for each auto-generated name n that
> > > > is (directly or indirectly) included in the result: This is
> > > > necessary due to the fact that triplestores have full autonomy
> > > > for implementing the function foo used for generating auto-names
> > > > and therefore, given the same <s, p, o, g>, two different
> > > > triplestores could generate two different auto-names. Hence, the
> > > > recipient needs to know the <s, p, o, g> corresponding to each
> > > > auto-name returned (or indirectly involved) in the result to
> > > > generate the appropriate auto-name for its local use.
> > > > Statement-Set: This can be done by having multiple distinct <s,
> > > > p, o, g> share the same custom-name. While the advantage over
> > > > named graphs is that statements from distinct graphs (or default
> > > > graph) can form a group, a disadvantage would be that auto-named
> > > > statements cannot be part of a (non-singleton) statement-set.
> > > > Ref. Transparency vs. Opacity: The current idea of "opaque by
> > > > default and transparent in case TEPs are involved" would work
> > > > fine for RDFn too.
> > > > Based on the above outline, I'd argue that use of RDFn to
> > > > support the desired extensions to RDF would also satisfy some of
> > > > the practical constraints that are critical for adoption by
> > > > enterprise, specifically:
> > > > 1) full backward-compatibility for RDF1.1 data (each RDF1.1
> > > > statement becomes an auto-named (asserted) statement in RDFn)
> > > > 2) continued validity of pre-existing SPARQL1.1 queries even as
> > > > data evolves to include more expressive content by taking
> > > > advantage of new capabilities to include statements about
> > > > statements and multi-edges
> > > > 3) minimization of the custom naming burden on the user because
> > > > custom names are needed only for those cases where multi-edges
> > > > or (non-singleton) statement-sets are involved
> > > > Thanks,
> > > > Souri.
> > > >
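
For readers following the thread, the naming and assertion rules in
Souri's outline quoted above can be sketched roughly as follows. This
is only an illustration under stated assumptions: auto_name stands in
for the implementation-specific function foo (a truncated SHA-256 hash
is used here, which the outline does not prescribe), and Statement,
is_auto, is_asserted, and match are hypothetical helper names rather
than anything defined by RDFn or SPARQLn.

```python
import hashlib
from typing import NamedTuple, Optional

AUTO_PREFIX = "rdfnAuto:"  # namespace reserved for auto-generated names


class Statement(NamedTuple):
    s: str
    p: str
    o: str
    g: Optional[str]
    n: str


def auto_name(s, p, o, g=None):
    """Assumed stand-in for foo: derive a name from the <s, p, o, g> part."""
    key = "\x1f".join([s, p, o, g or ""])
    return AUTO_PREFIX + hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def is_auto(name):
    """Rough analogue of the isAuto() filter."""
    return name.startswith(AUTO_PREFIX)


def is_asserted(stmt, data):
    """Asserted iff an auto-named statement with the same <s, p, o, g> exists."""
    return any(is_auto(d.n) and d[:4] == stmt[:4] for d in data)


def match(data, p, include_unasserted=False):
    """One (s, o) result per matching statement for predicate p.

    include_unasserted=True plays the role of the << ... >> enclosure.
    """
    return sorted(
        (d.s, d.o)
        for d in data
        if d.p == p and (include_unasserted or is_asserted(d, data))
    )


# Example 7: a single custom-named statement, hence unasserted.
data = {Statement("ex:bob", "ex:fatherOf", "ex:john", None, "ex:cname1")}
query1 = match(data, "ex:fatherOf")
query2 = match(data, "ex:fatherOf", include_unasserted=True)
print(query1)  # []
print(query2)  # [('ex:bob', 'ex:john')]

# Examples 4 and 5: the same <s, p, o, g> once auto-named, once custom-named.
cleveland = ("ex:Cleveland", "ex:servedAs", "ex:POTUS", None)
terms = {
    Statement(*cleveland, auto_name(*cleveland)),
    Statement(*cleveland, "ex:term2"),
}
count_all = len(match(terms, "ex:servedAs"))
count_auto = len([d for d in terms if d.p == "ex:servedAs" and is_auto(d.n)])
print(count_all, count_auto)  # 2 1
```

Note that the sketch treats each <s, p, o, g, n> tuple as a separate
element, which is what makes the count in Example 4 come out as 2.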

Received on Tuesday, 28 November 2023 08:15:53 UTC