Re: Extending the SPARQL Query Results JSON format for RDF*

On Thu, 6 Aug 2020 at 00:09, Andy Seaborne <andy@apache.org> wrote:

> On 05/08/2020 00:11, Jeen Broekstra wrote:
>
> > How/where would be a good place to draft and publish something? Is the
> > SPARQL 1.2 CG a reasonable place for this perhaps? Or do we want to
> > keep RDF* separate from that discussion for now?
>
> Neutral.
>

Given the other discussion thread, I suggest we follow Olaf's lead and see
about contributing something to the RDF-DEV CG.

> > It's not so much the performance I worry about, it's more a backward
> > compatibility thing. Imagine an endpoint that starts appending its
> > dataset with RDF* annotations, and multiple existing clients that
> > query that endpoint. If you support the query result response by
> > extending the existing content type, that existing client can suddenly
> > start receiving a response it can't process on an existing query
> > (after all you can get back an RDF* annotation as a result even if
> > your query is just regular SPARQL).
> >
> > RDF4J currently handles this by only sending the extended syntax when
> > a client explicitly accepts the new content-type. If a client asks for
> > "regular" json results, we instead encode any annotated triple in the
> > result as an IRI (basically by base 64-encoding the N-triples
> > representation of the statement and minting a urn out of it on the
> > spot). It may not be able to fully interpret this kind of result
> > value, but at least it won't break the parser.
>
> Related: for the java-typed URI, RDF* triple terms had to go under IRI /
> bnode because they can appear in the subject position. Yet they are
> conceptually literals.
>

We've skirted that in RDF4J by just introducing a new resource type in the
Java API: Triple. Doing so also helped us introduce RDF* support into the
API in a way that would minimize issues with backward compatibility. Though
to be fair we are currently considering some amendments to this setup.
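
For illustration, the fallback encoding described above for clients that ask
for "regular" JSON results (base64-encode the N-Triples text of the statement
and mint a URN from it) could look roughly like the sketch below. This is not
the RDF4J code: the URN prefix is a hypothetical placeholder, the function
names are made up for the example, and the use of the URL-safe base64
alphabet is an assumption.

```python
import base64

# Hypothetical URN prefix, chosen for illustration only; the thread does not
# say which prefix is actually minted.
TRIPLE_URN_PREFIX = "urn:example:triple:"


def encode_triple_as_iri(ntriples_statement: str) -> str:
    """Wrap the serialized text of an embedded triple in an opaque IRI."""
    raw = ntriples_statement.encode("utf-8")
    # URL-safe alphabet (an assumption) keeps '+' and '/' out of the IRI.
    token = base64.urlsafe_b64encode(raw).decode("ascii")
    return TRIPLE_URN_PREFIX + token


def decode_triple_iri(iri: str) -> str:
    """Recover the serialized triple from an IRI made by encode_triple_as_iri."""
    if not iri.startswith(TRIPLE_URN_PREFIX):
        raise ValueError("not an encoded triple IRI")
    token = iri[len(TRIPLE_URN_PREFIX):]
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")
```

A client that knows nothing about RDF* just sees an ordinary IRI in the
binding; a client that knows the convention can decode it again.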

> > Given that client software will need to be updated anyway to
> > /properly/ do useful things with RDF* data in query results, the
> > addition of a MIME-type seems little additional burden.
>
> Right, I don't disagree with that - there are pros and cons for either way.
>

FWIW I'm still on the fence myself.

> I don't think it is always direct application-server, but also app-other
> software/library-server and intermediate software isn't aware of the app
> using/not-understanding RDF*. Even some libraries make application
> access to MIME type control quite difficult because they present a
> simplification and hide the MIME-foo to return a whatever-prog-language
> datastructure.
>
> From experience, MIME types are only patchily understood by users. Some
> users deeply understand and care about the web aspects, some are data
> specialists who see it and a lot of HTTP as just a mechanism they have
> to use, more getting in the way, "just give me the data!". Which is fine
> - we can't expect everyone to know all the details of everything.


Since at metaphacts we feel a certain urgency to be compatible with as many
databases as possible, it's likely we'll push for some improvements in the
RDF4J parser implementations as a short-term solution: it wouldn't be hard
to adapt the parser to also accept the extended format on the regular mime-type,
and also to make it parse the syntax variants (they're small enough changes
that we can support that without a significant performance penalty I
think). When *writing*, the writer could still distinguish between the two
content-types. None of that negates the need for reaching a consensus
though: ideally there'd be a single standard (insert relevant xkcd).
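
As a rough sketch of the writer-side distinction mentioned above (send the
extended syntax only when the client explicitly accepts it, otherwise fall
back to regular JSON results), something like the following would do. The
standard media type is the registered one; the extended media type string is
an assumption used as a placeholder here, the function names are made up, and
q-values in the Accept header are deliberately ignored to keep the example
short.

```python
STANDARD_JSON = "application/sparql-results+json"
# Assumed placeholder for the extended content type; not confirmed in this thread.
EXTENDED_JSON = "application/x-sparqlstar-results+json"


def accepted_media_types(accept_header: str) -> list:
    """Return the media types named in an Accept header, without parameters."""
    types = []
    for part in accept_header.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type:
            types.append(media_type)
    return types


def choose_result_format(accept_header: str) -> str:
    """Pick the extended format only if the client lists it explicitly."""
    if EXTENDED_JSON in accepted_media_types(accept_header):
        return EXTENDED_JSON
    # Wildcards such as */* do not count as an explicit opt-in.
    return STANDARD_JSON
```

The point of the design is that an existing client, which never mentions the
extended type, keeps getting a response its parser can handle.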


> >> One use case that has arisen is wanting to manage the triples annotating
> >> other triples separately from the data it refers to.  This is both to
> >> help in data management and also to help with the modelling issues [1]
> >
> > I don't follow how this relates to the syntax formats to be honest,
> > but isn't that essentially what Separate Assertions (SA) mode gives
> > you? In SA mode you could have the annotations in a separate named
> > graph (or a separate database if you want) from the actual facts being
> > annotated.
>
> Yes. This is what Ontotext GraphDB documentation says as well.
>
> If the app wants assertion as well, feeding the parser outstream through
> a pipeline to assert the triple is easy - the reverse, SA from PG
> parsing, would not be. Ditto API implications.
>

Quite - it's why we (and Ontotext as well) went with SA as the default for
now.
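
To illustrate why that direction is the easy one: given an SA-mode stream,
asserting the embedded triples as well is a simple pass-through step. The
sketch below is only a toy model, not Jena or RDF4J code. It assumes triples
are plain Python tuples, that an embedded triple is a tuple sitting in the
subject or object position, and the function names are made up for the
example.

```python
def is_embedded(term) -> bool:
    """In this toy model an embedded triple is simply a nested tuple."""
    return isinstance(term, tuple)


def assert_embedded(triples):
    """Yield every incoming triple, plus each triple embedded inside it."""
    for triple in triples:
        yield triple
        stack = [triple]
        while stack:
            subject, _predicate, obj = stack.pop()
            for term in (subject, obj):
                if is_embedded(term):
                    # Embedded triples may repeat; a real pipeline would
                    # probably deduplicate at the store.
                    yield term
                    stack.append(term)
```

Going the other way means deciding, for each asserted triple, whether it was
asserted on purpose or only because it was annotated, and that information is
no longer in the stream.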

One thing I'd be particularly interested in, for systems that use the PG
approach, is how they manage retraction, and how well that scales.

> >> Jena can also read Eclipse RDF4J format result sets :-)
> >
> > Showoff :)
> Reading documentation considered harmful?
>

Just slightly jealous that I can't make the same claim in reverse yet.

Cheers,

Jeen

-- 
*Dr Jeen Broekstra*
*principal software engineer*

jb@metaphacts.com
www.metaphacts.com


Received on Thursday, 6 August 2020 12:20:12 UTC