Re: [External] : Re: new proposal

I like the "agreed syntax" in Turtle, but I worry that the abstract syntax (and N-Triples), involving rdf:reifies, rdf:annotationOf, and <<( ... )>> triple-terms, adds more complexity than users who are quite happy with LPG simplicity would be willing to accept. This worry is driving me to find something that captures the capability we are envisioning without complicating what the user would see. My proposal is just trying to provide a starting point for discussion.

So, here is a variant I am currently thinking about:
1) Support two types of triples: normal RDF1.1 (asserted) triples and (new) reified triples. A reified triple is a 4-tuple whose extra component is called the reifier.
2) By default, the reifier is of rdf:type rdf:Transparent, but one could explicitly declare that a particular reifier is rdf:type rdf:Opaque. We could choose rdf:Opaque as the default instead if we think that would be better for most users.

The N-Triples syntax used in the example below places the extra component of a reified triple first, followed by a vertical bar. (This would make it similar to the portion inside <<( ... )>> in the "agreed syntax" in Turtle.)

        :r1 | :earth :hasShape :cube . # reified triple with :r1 as the reifier
        :r2 | :bob :believes :r1 . # reified triple with :r2 as the reifier
        :r2 rdf:type rdf:Opaque . # asserted triple – indicates that the triple-term for reifier :r2 is ref. opaque
        :alice :believes :r2 . # asserted triple

I'd argue that this variant supports the full capability we envision for RDF1.2, minus nested reification (which I do not consider that important in practice), without requiring the user to see the complications associated with the concepts of rdf:reifies, rdf:annotationOf, and possibly even triple-terms to some extent.
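
As a rough illustration of that claim, here is a minimal Python sketch of how lines in the bar syntax from the example could be mapped mechanically onto the rdf:reifies form. The function name expand_bar_line and the regular expression are assumptions made for this sketch only; it handles just the simple prefixed-name tokens used in the example (no literals containing spaces) and is not a real N-Triples parser.

```python
import re

# Hypothetical line shape taken from the example: "[reifier |] s p o . [# comment]"
LINE = re.compile(
    r"^\s*(?:(\S+)\s*\|\s*)?(\S+)\s+(\S+)\s+(\S+)\s*\.\s*(?:#.*)?$"
)

def expand_bar_line(line):
    """Rewrite one bar-syntax line into rdf:reifies-style text."""
    m = LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    reifier, s, p, o = m.groups()
    if reifier is None:
        # No bar: an ordinary asserted triple, passed through unchanged.
        return f"{s} {p} {o} ."
    # Bar present: the triple is only reified, not asserted.
    return f"{reifier} rdf:reifies <<( {s} {p} {o} )>> ."

example = [
    ":r1 | :earth :hasShape :cube . # reified triple",
    ":r2 | :bob :believes :r1 . # reified triple",
    ":r2 rdf:type rdf:Opaque . # asserted triple",
    ":alice :believes :r2 . # asserted triple",
]
for line in example:
    print(expand_bar_line(line))
```

Under these assumptions the two lines with a bar come out as rdf:reifies statements over a triple term, and the two lines without a bar pass through as asserted triples, so the user-facing form stays a flat list of lines.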

Again, I am just hoping that this can be a starting point for a discussion towards simplifying RDF1.2 for target users.

Thanks,
Souri.
________________________________
From: Niklas Lindström <lindstream@gmail.com>
Sent: Wednesday, July 3, 2024 8:17 AM
To: Andy Seaborne <andy@apache.org>
Cc: public-rdf-star-wg@w3.org <public-rdf-star-wg@w3.org>
Subject: Re: [External] : Re: new proposal

Hi Andy,

On Tue, Jul 2, 2024 at 10:21 PM Andy Seaborne <andy@apache.org> wrote:
>
>
>
> On 02/07/2024 12:39, Niklas Lindström wrote:
> > On Tue, Jul 2, 2024 at 12:35 PM Andy Seaborne <andy@apache.org> wrote:
> >>
> >>
> >>
> >> On 02/07/2024 05:45, Souripriya Das wrote:
> >>> The following type of nesting allows reifying the association between a
> >>> reifier and what it reifies:
> >>>           :r2 rdf:reifies  <<( :r1 rdf:reifies <<( :s :p :o )>> )>> .
> >>> Here the nesting is: "rdf:reifies -> rdf:reifies" (i.e., in the ordering
> >>> of predicates, rdf:reifies immediately follows another rdf:reifies).
> >>>
> >>> Nested beliefs are different than above and can be expressed as follows:
> >>> Using the 4th component as the reifier:
> >>>           :s :p :o :r1 .
> >>>           :bob :believes :r1 :r2 .
> >>>           :alice :believes :r2 .
> >>> The same can be expressed using rdf:reifies as follows:
> >>>           :r1 rdf:reifies <<( :earth :hasShape :cube )>> .
> >>>           :r2 rdf:reifies <<( :bob :believes :r1 )>> .
> >>>           :alice :believes :r2 .
> >>
> >> We can have a standalone clause in the "agreed syntax" to have a
> >> standalone reification in consistent style with inline use:
> >>
> >> << :r1 | :earth :hasShape :cube >> .
> >
> > Or we could support grouping "quoted" predicate-object-pairs with the
> > same subject, add naming to the annotation syntax [1], and use that
> > same form for both asserted and unasserted triples:
> >
> >      :earth << :hasShape :cube >> {| @ :r1 |} ;
> >        :hasShape :sphere {| @ :r4 |} .
> >
> > I've waited for months to discuss syntax revisions in earnest. Alas, I
> > don't think we're on stable ground yet, but I need to mention now what
> > I see is needed for transparent many-to-many [2], if that's what we'll
> > end up with.
> >
> > If we arrive at that, we need a syntax that concisely can: 1) use
> > multiple reifiers per triple; 2) name reifiers (and have multiple
> > triples reified by the same reifier);  3) reify reifications (to be
> > able to describe a reifier which states that some triple is reified by
> > some reifier, the need for which is exemplified in [3]); 4) group
> > multiple non-asserted statements under the same subject.
>
> "need" is a high barrier for syntactic sugar.

Fair point. The "needs" were conditioned on the outcome of still
ongoing debates. Some points may turn out to be marginal edge cases.

> A design should support usage, not aim to directly provide every usage.

Yes, it should support usage. It should only prevent (presumed)
uncommon usage if absolutely necessary, such as if it's known to be
generally or even mostly the wrong choice, or constructs which are
deemed necessary but should be hidden (such as rdf:first, rdf:rest,
and, arguably, rdf:reifies).

Nor should it contain constructs confusingly looking like obsoleted
features (by which I mean triple terms used as subjects).

> Terse is a barrier to use - the needs of the "general data author" and
> "general data reader" should come first; that means simple expression.

Yes. By concise, I mean to avoid redundancies, where repetition makes
it harder to spot errors and/or easier to make them. The "t" in
"turtle" stands for terse.

> This is putting complex syntax forms into Turtle for specific RDF-star
> usage.

RDF-star adds complexity one way or another. Turtle-star, as it stands
with some recent modifications, is already complex: nestable triple
terms, requiring repetition of subject and predicate where Turtle
generally doesn't, using pipe as a naming operator, and a bare variant
thereof to represent *real* triple terms, plus an annotation syntax
with a rather different design and still missing a working naming
operator. It isn't clearly the best design for the unfolding but still
debated requirements. We need to look for ways to drive that
complexity down to a uniform minimum, based on what appears to be
needed for effective use (in Turtle for reading and writing, in the
abstract syntax and semantics for understanding and explaining).

(The CG report has seven EBNF productions specifically for Turtle-star
(plus one for TriG-star [1]), and currently lacks naming and "real"
triple terms. My suggestion [2] has nine; it adds triple terms as
objects (only "real" ones if we can lift the restriction requiring
rdf:reifies to reference them), treats multiply reified asserted and
unasserted triples uniformly, and uses regular names to denote
reifiers, written as "footnotes" after triples.)
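
To make the uniform treatment concrete, here is a minimal Python sketch of how such footnote-style reifier annotations could expand into rdf:reifies triples. The data layout (tuples of predicate, object, an asserted flag, and a list of reifiers, each carrying its own nested reifiers) and the function name expand_forms are assumptions made for illustration; they are not part of the grammar in [2] or of any draft.

```python
def expand_forms(subject, forms):
    """Expand annotated predicate-object forms into rdf:reifies statements.

    forms: list of (predicate, object, asserted, reifiers), where each
    reifier is a (name, nested_reifiers) pair. This layout is hypothetical.
    """
    out = []

    def reify(name, nested, term):
        # The named reifier reifies the given triple term.
        stmt = f"{name} rdf:reifies {term}"
        out.append(stmt + " .")
        # A nested reifier reifies the reification statement itself.
        for inner_name, inner_nested in nested:
            reify(inner_name, inner_nested, f"<<( {stmt} )>>")

    for p, o, asserted, reifiers in forms:
        if asserted:
            out.append(f"{subject} {p} {o} .")
        for name, nested in reifiers:
            reify(name, nested, f"<<( {subject} {p} {o} )>>")
    return out

# The :earth example from the earlier message, encoded by hand.
forms = [
    (":hasShape", ":disc", False, [(":r1", []), (":r2", [(":r3", [])])]),
    (":hasShape", ":cube", False, [(":r1", [])]),
    (":hasShape", ":sphere", True, [(":r4", [])]),
]
for line in expand_forms(":earth", forms):
    print(line)
```

The sketch covers only the expansion step, not parsing. Asserted and unasserted forms differ by a single flag, and a reifier of a reification is handled by the same recursive step as any other reifier.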

But again, we're not here yet; the possible merits cannot be assessed
until we agree on what we need.

> Syntax is precious.

Of course. I'm not suggesting this light-heartedly. And it reasonably
takes some time to be comfortable with a change in syntax, having
looked at a given syntax for years. We're a long way down that road,
but until Last Call it can be changed. It decidedly changed meaning
this year, and has already been provisionally altered because of it.

> Syntax is a value-judgement.

Certainly. We need to quantify, compare and converge on what we value
as a group, catering for a much wider group.

> The risk is that choices will burden Turtle and Turtle implementation
> for the future with little used features. Features need more
> justification than some use case.

Yes, there's a lot of risk here. Adding triple terms requires an
assessment of how they will be understood and used at large. The four
points assuredly differ in expected commonality and possible value,
and different WG members would rank them differently. These
differences may even affect the choice of adding triple terms or
something else.

I am suggesting this since it appeared as a pattern during work to get
a feel for the many-to-many option with lots of real data. Take
Wikidata, where the current syntax has to repeat the same object for
each qualification reifier; and if disputed, low-ranked snaks were to
be represented as unasserted -- which they debatably should be --
those become visually disconnected heaps of unasserted << ... >> forms
repeating the same subject. A terse RDF triple language should
reasonably cater better for that.

I have previously thought of the use of rdf:reifies within triple
terms as a marginal edge case (I've only seen it crop up in my work on
blame views of graph diffs), but then I saw their apparent need in
[3]. Admittedly I was looking for them, and I do understand and
respect e.g. Souri's worries about them (presuming that those come
from the RDFn quin design). I just don't see how to refer to the
connection between a reifier and one of its triples without that
(unless we go back to a rigid, exclusive many-to-one for which we have
no working design).

Which use cases do you deem so marginally expected that they don't
merit syntactic consideration?

> We should leave open the possibility of graphs-in-graphs.

Yes.

Best regards,
Niklas

[1]: <https://urldefense.com/v3/__https://w3c.github.io/rdf-star/cg-spec/2021-12-17.html*trig-star-grammar__;Iw!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGwa2xGpg$ >

[2]: Alternative TriG productions for namable, many-to-many-reifiers:
    [7] predicateObjectList ::= predicateObjectForms (';' predicateObjectForms?)*
    [1r] predicateObjectForms ::= verbObjectList | verbObjectOfTripleTerm
    [2r] verbObjectList ::= verb objectList
    [3r] verbObjectOfTripleTerm ::= '<<' verb object '>>' annotation
    [8] objectList ::= object annotation? (',' object annotation?)*
    [12] object ::= iri | blank | blankNodePropertyList | literal | tripleTerm
    [4r] annotation ::= '^{' reifier annotation? (',' reifier annotation?)* '}'
    [5r] reifier ::= iri | blank | blankNodePropertyList
    [6r] tripleTerm ::= '<<' subject verb object '>>'

[3]: <https://urldefense.com/v3/__https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jun/0052.html__;!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGE6M-I1A$ >






> > The current syntax suffers here. If we ignore inertia for a moment, a
> > syntax along these lines may be a wiser choice, since it handles all
> > four:
> >
> >      :earth << :hasShape :disc >> ^{:r1, :r2 ^{:r3}} ;
> >        << :hasShape :cube >> ^{:r1} ;
> >        :hasShape :sphere ^{:r4} .
> >
> > Which would expand to:
> >
> >      :earth :hasShape :sphere .
> >      :r1 rdf:reifies <<( :earth :hasShape :disc )>> .
> >      :r1 rdf:reifies <<( :earth :hasShape :cube )>> .
> >      :r2 rdf:reifies <<( :earth :hasShape :disc )>> .
> >      :r3 rdf:reifies <<( :r2 rdf:reifies <<( :earth :hasShape :disc )>> )>> .
> >      :r4 rdf:reifies <<( :earth :hasShape :sphere )>> .
> >
> > In fact, if we dared to invalidate the << ... >> form in the subject
> > position syntax-wise (on the grounds that it gives the wrong
> > impression), I might even dare to go back to using that form for the
> > triple terms themselves. And if we disallowed those as subjects in all
> > cases (apart from generalized RDF for the "entailment space"), we
> > could skip the "well-formedness" part and allow "bare" triple terms in
> > the object position. I'm quite *nervous* about that, but not *averse*
> > to it.
> >
> > But we don't need to debate the merits or alternatives more until we
> > agree on the numbered points above, based on use case requirements in
> > relation to a given baseline. Such as [3] in relation to [2].
> >
> > (For anyone actually eager for the syntax debate: Of course I've
> > written the EBNF and implemented this, and some variations; to see
> > what technically works. I added a marker before the braces mostly in
> > case a future RDF adds some form of graph literals to the mix. It
> > isn't strictly necessary. I do think, since "^" is used as the inverse
> > operator on predicates in SPARQL, that it also signals an "arrow" to
> > the previous (or above) statement, signifying that the rdf:reifies
> > goes from the reifier to the triple.)
> >
> > Best regards,
> > Niklas
> >
> > [1]: <https://urldefense.com/v3/__https://github.com/w3c/rdf-star-wg/issues/116__;!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGAp9RQtc$ >
> > [2]: <https://urldefense.com/v3/__https://github.com/w3c/rdf-star-wg/wiki/RDF-star-*22minimal-baseline*22__;JSU!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGgZuB5gA$ >
> > [3]: <https://urldefense.com/v3/__https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jun/0052.html__;!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGE6M-I1A$ >
> >
> >
> >>       Andy
> >>
> >> https://urldefense.com/v3/__https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0095.html__;!!ACWV5N9M2RV99hQ!ModPHiqXu_SgUHA7vFT9E2UCtR22OHgpNNERiWG29zZFNKv1iObrtrFx_xDNmrsNm1fEAvqPGJLcsWnGHE0FODk$

> >>

Received on Wednesday, 3 July 2024 13:41:40 UTC