Re: Apache Jena support for RDF* from Andy Seaborne on 2020-08-09 (public-rdf-star@w3.org from August 2020)

From: Andy Seaborne <andy@apache.org>
Date: Sun, 9 Aug 2020 17:33:10 +0100
To: thomas lörtsch <tl@rat.io>
Cc: public-rdf-star@w3.org
Message-ID: <5b84257a-df5e-becb-9415-cf7671134ecc@apache.org>
On 08/08/2020 11:18, thomas lörtsch wrote:
> Thanks for the detailed description. I was wondering which mode you use and I wasn’t sure from reading the webpage you linked to. Of course I also could have just installed the thing and tried myself ;-)

If the output of parsing/INSERT is fed through an expander, it is simple
to turn SA into PG because, by definition, the base triple in PG is in
the same graph.

Since the reverse, PG to SA, isn't so simple, Jena parsers do SA.
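
As an illustration only, here is a minimal Python sketch of what such an expander could look like. It is not Jena code: the tuple-based data model and the names `embedded_triples` and `expand_sa_to_pg` are assumptions made for this example. A triple is a 3-tuple, and a triple term (<<>>) is simply a nested 3-tuple in subject or object position.

```python
# Hypothetical model, not Jena's API: a term is a string (IRI/literal)
# or a nested 3-tuple standing for a <<s p o>> triple term.

def embedded_triples(term):
    """Yield every triple term nested inside `term`, outermost first."""
    if isinstance(term, tuple):
        yield term
        s, _p, o = term
        yield from embedded_triples(s)
        yield from embedded_triples(o)


def expand_sa_to_pg(graph):
    """Return a PG-style copy of an SA graph.

    Every triple that occurs as a triple term, at any depth, is also
    asserted as a plain triple in the same graph.
    """
    expanded = set(graph)
    for s, _p, o in graph:
        expanded.update(embedded_triples(s))
        expanded.update(embedded_triples(o))
    return expanded


# SA data: one triple, <<:s :p :o>> :q 123 .
sa_graph = {((":s", ":p", ":o"), ":q", "123")}
# The PG view additionally contains the base triple :s :p :o .
pg_graph = expand_sa_to_pg(sa_graph)
```

The expansion only ever adds triples to the same graph, which is why this direction is the easy one.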

The reverse, cascading the delete, is the interesting part.
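
To make the cascade concrete, here is a rough Python sketch under the same kind of simplifying assumptions: triples are 3-tuples, triple terms are nested tuples, and `mentions` and `cascade_delete` are hypothetical names, not anything in Jena. Deleting a base triple in a PG-style graph also removes every triple that uses it as a triple term, at any nesting depth.

```python
# Hypothetical model, not Jena's API: a term is a string (IRI/literal)
# or a nested 3-tuple standing for a <<s p o>> triple term.

def mentions(term, target):
    """True if `term` is `target` or contains it as a nested triple term."""
    if term == target:
        return True
    if isinstance(term, tuple):
        s, _p, o = term
        return mentions(s, target) or mentions(o, target)
    return False


def cascade_delete(graph, target):
    """Return `graph` without `target` and without any triple that
    refers to `target` in subject or object position, at any depth."""
    return {t for t in graph if not mentions(t, target)}


base = (":s", ":p", ":o")
annotation = (base, ":q", "123")          # <<:s :p :o>> :q 123 .
unrelated = (":a", ":b", ":c")
graph = {base, annotation, unrelated}

# Removing the base triple takes its annotation with it.
remaining = cascade_delete(graph, base)
```

Note that this sketch scans the whole graph for every delete; a real store would need an index or some other way to know when the lookup can be skipped.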

> 
> Do you or anybody here happen to keep track of which implementations out there support which mode - SA, PG, both?
> 
> And another question, not only w.r.t. Jena but to RDF* support in SPARQL in general:
> I can query for RDF*-triples in specific graphs by using FROM/FROM NAMED, but how can I annotate an RDF*-triple in a specific graph? E.g. a triple might occur in many graphs but I might want to annotate only one of those occurrences, in a specific graph.

<Insert discussion of whether it is then the "same" triple>

> To make things a little more interesting let’s say the annotating triple that uses an RDF*-term in the subject position has to be located in the default graph or even in another named graph. Is there any way right now to do this?

Not in PG. It has to be in the same graph.

     Andy

> 
> Thomas
> 
> 
>> On 5. Aug 2020, at 16:15, Andy Seaborne <andy@apache.org> wrote:
>>
>> Implementation notes:
>>
>> ARQ supports <<>> in CONSTRUCT, VALUES, expressions and SPARQL Update.
>>
>> ----
>> There is one new production "TripleTerm" and then that is used in DataBlockValue (VALUES), VarOrTerm (which covers BGPs, paths, update templates and expressions).
>> ----
>>
>> <<>> is a new RDFTerm
>>
>> Because in Jena RDFTerms are immutable, you can't create cycles.
>>
>> ----
>>
>> There is one new operator in the algebra (TR in the paper) that is called "(find)" - it matches a <<>> pattern recursively, and assigns the top level match to a variable.
>>
>> Because this is fundamentally different to BIND -- (find) is multivalued and not a function of its arguments -- the syntax calls this FIND. This leaves open the possibility of writing <<>> in SPARQL expressions.
>>
>> Using BIND for FIND blocks this because "BIND(<<>> AS ?T)" is ambiguous in meaning.
>>
>> Jena supports functions on triple terms, so <<>> can appear in expressions, whether indirectly via variables or written directly.
>>
>> e.g. accessors:
>>
>>    afn:subject(<<:s :p :o>>) ==> :s
>>
>> constructor:
>>
>>   afn:triple(?s, ?p, ?o) ==> << ?s ?p ?o>> if ?s ?p ?o are bound.
>>
>> which is what happens in CONSTRUCT.
>>
>> Writing a grammar that distinguishes "BIND(<<>> AS ?T)" means it can't be plain assignment. If <<>> is also to be allowed in expressions, the grammar becomes complicated at this point: either several extra productions, if we stick to the simple requirements of SPARQL (LL(1)), or several steps of lookahead, which for some parser generators is a burden (not for ARQ, which uses JavaCC).
>>
>> A different keyword removes all these problems.
>>
>> The keywords MBIND (M=multiple) or TBIND were also considered.
>> TRIPLETERM is a bit too long!
>>
>> ----
>>
>> The use case for separate annotations means that parsing is SA.
>>
>> <<:s :p :o>> :q 123 .
>>
>> is one triple.
>>
>> This flows in N-Triples because "one line - one triple" is natural there. "wc -l" works on real-world data and database dumps are more portable.
>>
>> It also means that DELETE does not need special handling.
>>
>> DELETE DATA { :s :p :o. }
>>
>> would otherwise (in PG mode) have a conditional side effect of
>>
>> DELETE WHERE { << :s :p :o >> ?p ?o } ;
>> DELETE WHERE { ?s ?p << :s :p :o >> }
>>
>> depending on the whole update operation. Combined with multiple requests in the same update, it effectively blocks streaming.
>>
>> Looking up termified triples all the time seems expensive, at least without some machinery to know when a look up isn't necessary.
>>
>> ---
>>
>> These are decisions that seemed natural at the time - I'd expect Jena users at the moment to care more about compatibility across implementations.
>>
>>     Andy
>>
>> On 04/08/2020 11:40, Andy Seaborne wrote:
>>> Jena version 3.16.0 completes the support for RDF* and SPARQL*.
>>> This is a "deep integration" - it is available by default in various syntaxes and in Fuseki. The application does not need to enable it.
>>> It is supported in:
>>> text/turtle
>>> application/n-triples
>>> text/trig
>>> application/n-quads
>>> and for storage in-memory, and persistently in TDB (both TDB1 and TDB2).
>>> For SPARQL results, it is available in formats
>>>    JSON, XML, TSV, RDF Thrift (binary), and text.
>>>      https://jena.apache.org/documentation/rdfstar/
>>>      Andy
>>
>
Received on Sunday, 9 August 2020 16:33:25 UTC