Re: Apache Jena support for RDF* from Olaf Hartig on 2020-08-09 (public-rdf-star@w3.org from August 2020)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Sun, 09 Aug 2020 10:14:42 +0200
To: public-rdf-star@w3.org
Cc: Andy Seaborne <andy@apache.org>
Message-ID: <3728669.EfhGmrF0Xd@porty3>

Hi Andy,

Thanks a lot for these implementation notes!!

Some remarks and questions inline...

On Wednesday, 5 August 2020, 15:15:07 CEST, Andy Seaborne wrote:
> [...]
> Using BIND for FIND blocks this because "BIND(<<>> AS ?T)" is ambiguous
> in meaning.

This very expression is not permitted by the SPARQL* grammar as defined in
Section 5.1 of http://arxiv.org/pdf/1406.3399

More specifically, the corresponding grammar rules are:

Bind ::= 'BIND' '(' ExpressionOrEmbTP 'AS' Var ')'

ExpressionOrEmbTP ::= Expression | EmbTP

EmbTP ::= '<<' VarOrBlankNodeOrIriOrLitOrEmbTP Verb
          VarOrBlankNodeOrIriOrLitOrEmbTP '>>'

In other words, there must be a triple pattern between the '<<' and the
'>>'. Or are you saying that '<<>>' can also be parsed as an Expression?
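
To make the point about the EmbTP rule concrete, here is a minimal sketch of
a recursive-descent recognizer for that one production. The token set is a
deliberately simplified assumption (it is not the full SPARQL lexer), and the
function names are made up for this illustration. It only shows that an
empty '<<>>' cannot be derived from EmbTP, while nested patterns can.

```python
import re

# Simplified tokens; an assumption for illustration, not the SPARQL lexer.
TOKEN = re.compile(r"""
    \s*(?:
      (?P<open><<)
    | (?P<close>>>)
    | (?P<var>[?$]\w+)
    | (?P<iri><[^<>\s]*>)
    | (?P<bnode>_:\w+)
    | (?P<pname>\w*:\w+)
    | (?P<lit>"[^"]*"|\d+)
    | (?P<a>a\b)
    )""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs or raise ValueError."""
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at {pos}: {text[pos:]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse_term(tokens, i):
    """VarOrBlankNodeOrIriOrLitOrEmbTP; returns the index after the term."""
    if i >= len(tokens):
        raise ValueError("expected a term, found end of input")
    kind = tokens[i][0]
    if kind == "open":
        return parse_emb_tp(tokens, i)  # nested embedded triple pattern
    if kind in {"var", "bnode", "iri", "pname", "lit"}:
        return i + 1
    raise ValueError(f"expected a term, found {tokens[i][1]!r}")


def parse_verb(tokens, i):
    """Verb: a variable, an IRI (or prefixed name), or the keyword 'a'."""
    if i < len(tokens) and tokens[i][0] in {"var", "iri", "pname", "a"}:
        return i + 1
    raise ValueError("expected a verb")


def parse_emb_tp(tokens, i):
    """EmbTP ::= '<<' term Verb term '>>'; returns the index after '>>'."""
    if i >= len(tokens) or tokens[i][0] != "open":
        raise ValueError("expected '<<'")
    i = parse_term(tokens, i + 1)
    i = parse_verb(tokens, i)
    i = parse_term(tokens, i)
    if i >= len(tokens) or tokens[i][0] != "close":
        raise ValueError("expected '>>'")
    return i + 1


def is_emb_tp(text):
    """True if the whole input is derivable from EmbTP."""
    try:
        tokens = tokenize(text)
        return parse_emb_tp(tokens, 0) == len(tokens)
    except ValueError:
        return False
```

With this sketch, is_emb_tp("<<:s :p :o>>") and
is_emb_tp("<< <<?s ?p ?o>> :q 123 >>") are accepted, whereas
is_emb_tp("<<>>") is rejected because no term follows the '<<'.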
 
> [...]
> Writing a grammar that distinguishes "BIND(<<>> AS ?T)" means it can't
> be plain assignment. If <<>> is also to be allowed in expressions, the
> grammar becomes complicated (several extra productions) at this point if
> we stick to the simple requirements of SPARQL (LL(1)) or several steps of
> lookahead which for some parser generators is a burden (not for ARQ
> which uses JavaCC).
> 
> A different keyword removes all these problems.
> 
> The keywords MBIND (M=multiple) or TBIND were also considered.
> TRIPLETERM is a bit too long!

I am not sure I understand the exact problem you want to highlight here. Is
it a problem with the SPARQL* grammar, is it that the BIND clause in SPARQL*
becomes multivalued (i.e., it can result in multiple solution mappings), or
is it both?
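
As an illustration of what "multivalued" means here, the following sketch
assumes a reading in which BIND with an embedded triple pattern is matched
against the triples in the graph. The graph, the helper names, and the
tuple encoding of triples are hypothetical; nested patterns are left out to
keep it short.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for p, v in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if result.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return result


def bind_emb_tp(graph, pattern, var, binding=None):
    """Sketch of BIND(<<pattern>> AS var): one solution per matching triple."""
    solutions = []
    for triple in sorted(graph):
        mu = match(pattern, triple, binding or {})
        if mu is not None:
            mu[var] = triple  # the matched triple itself becomes the value
            solutions.append(mu)
    return solutions


graph = {
    (":alice", ":knows", ":bob"),
    (":alice", ":knows", ":carol"),
    (":bob", ":age", "23"),
}

# One input mapping, two output mappings: the clause is multivalued.
many = bind_emb_tp(graph, (":alice", ":knows", "?o"), "?t")
# Without variables the clause yields at most one mapping.
one = bind_emb_tp(graph, (":bob", ":age", "23"), "?t")
```

A plain SPARQL BIND, by contrast, extends each solution mapping with at most
one value, which is why the two behaviours are hard to fit under one keyword.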

> ----
> 
> The use case for separate annotations means that parsing is SA.
> 
> <<:s :p :o>> :q 123 .
> 
> is one triple.
> 
> This flows in N-triples because "one line - one triple" is natural
> there. "wc -l" works on real world data and database dumps are more
> portable.
> 
> It also means that DELETE does not need special handling.
> [...]
> Looking up termified triples all the time seems expensive, at least
> without some machinery to know when a look up isn't necessary.

Is it fair to summarize these remarks as: an efficient implementation of SA 
mode is more straightforward than an efficient implementation of PG mode?
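
For readers less familiar with the two modes, here is a minimal sketch of
the difference when loading data, assuming that in PG mode every embedded
triple is also asserted in the graph and that in SA mode only the triples
written in the document are asserted. Triples are encoded as Python tuples
and the function names are made up for this illustration; this is not how
Jena stores data.

```python
def is_triple(term):
    """A term is an embedded triple if it is a 3-tuple."""
    return isinstance(term, tuple) and len(term) == 3


def embedded_triples(triple):
    """Yield every triple nested in subject or object position, recursively."""
    s, _, o = triple
    for term in (s, o):
        if is_triple(term):
            yield term
            yield from embedded_triples(term)


def load(triples, mode):
    """Build a graph (a set of triples) in 'SA' or 'PG' mode."""
    graph = set()
    for t in triples:
        graph.add(t)
        if mode == "PG":
            # PG mode: each embedded triple is asserted as well, which
            # requires looking inside every incoming triple.
            graph.update(embedded_triples(t))
    return graph


# <<:s :p :o>> :q 123 .
data = [((":s", ":p", ":o"), ":q", 123)]

sa_graph = load(data, "SA")  # contains only the annotation triple
pg_graph = load(data, "PG")  # also contains (":s", ":p", ":o")
```

In the SA case one input line corresponds to exactly one stored triple; in
the PG case the loader has to inspect every triple for embedded triples,
which is the kind of extra lookup the quoted remarks describe as expensive.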

Speaking of these modes, the documentation page you mentioned earlier does
not indicate which mode your current implementation supports. Which mode is
it?

Thanks,
Olaf


> ---
> 
> These are decisions that seemed natural at the time - I'd expect Jena
> users at the moment to care more about compatibility across implementations.
> 
>      Andy
> 
> On 04/08/2020 11:40, Andy Seaborne wrote:
> > Jena version 3.16.0 completes the support for RDF* and SPARQL*.
> > 
> > This is a "deep integration" - it is available by default in various
> > syntaxes and in Fuseki. The application does not need to enable it.
> > 
> > It is supported in:
> > 
> > text/turtle
> > application/n-triples
> > text/trig
> > application/n-quads
> > 
> > and for storage in-memory, and persistently in TDB (both TDB1 and TDB2).
> > 
> > For SPARQL results, it is available in formats
> > 
> >    JSON, XML, TSV, and RDF Thrift (binary), text.
> >  
> >  
> >      https://jena.apache.org/documentation/rdfstar/
> >  
> >      Andy
Received on Sunday, 9 August 2020 08:15:04 UTC