Extending the SPARQL Query Results JSON format for RDF*

Hi folks,

In Eclipse RDF4J's initial implementation for SPARQL* and RDF* (see https://rdf4j.org/documentation/programming/rdfstar/), one of the things we did was to provide extensions to the result formats for SPARQL (SELECT) query results, in particular for the SPARQL Query Results JSON format (see https://www.w3.org/TR/sparql11-results-json/). This is necessary to send query results that involve RDF* data from a SPARQL endpoint to a client. 

I'd like to quickly present our approach, and to hear how others solved these issues. Ideally I'd like to reach some community consensus and get to a state where different tools at least support compatible response formats. It would be nice if I could use RDF4J to query any RDF*-supporting endpoint and get back a response in a format it can work with (btw I'm concentrating on the JSON format here as low-hanging fruit, but we could have a similar discussion about the XML format or even other RDF serialization formats). 

RDF4J extends the Query Results JSON format in the following fashion. First of all, to encode a variable that is bound to an RDF* triple (as opposed to a URI, BNode, or literal) we introduce a new binding type: triple. Its value is a JSON object with the keys s, p and o, each of which holds an ordinary binding term. As an example, the JSON result for the usual SPARQL query about the certainty of Bob's age would look like this:


`{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "triple",
          "value" : {
            "s" : {
              "type" : "uri",
              "value" : "http://example.org/bob"
            },
            "p" : {
              "type" : "uri",
              "value" : "http://xmlns.com/foaf/0.1/age"
            },
            "o" : {
              "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
              "type" : "literal",
              "value" : "23"
            }
          }
        },
        "b": { 
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}`
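
To illustrate how a client might consume the new binding type, here is a minimal Python sketch. It is not RDF4J code; the function name term_to_string and the sample variable are made up for this example, and it assumes only the structure shown above (a "triple" term whose value has the keys s, p and o).

```python
def term_to_string(term):
    """Render one binding term as text; recurses for the "triple" type."""
    kind = term["type"]
    if kind == "uri":
        return "<" + term["value"] + ">"
    if kind == "bnode":
        return "_:" + term["value"]
    if kind == "literal":
        # Note: no escaping of quotes inside the lexical value (sketch only).
        text = '"' + term["value"] + '"'
        if "xml:lang" in term:
            return text + "@" + term["xml:lang"]
        if "datatype" in term:
            return text + "^^<" + term["datatype"] + ">"
        return text
    if kind == "triple":
        # The value of a triple term is an object with s, p and o terms,
        # any of which may itself be a triple.
        inner = term["value"]
        parts = [term_to_string(inner[key]) for key in ("s", "p", "o")]
        return "<< " + " ".join(parts) + " >>"
    raise ValueError("unknown binding type: " + kind)


# The binding for variable "a" from the example result.
sample = {
    "type": "triple",
    "value": {
        "s": {"type": "uri", "value": "http://example.org/bob"},
        "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/age"},
        "o": {
            "datatype": "http://www.w3.org/2001/XMLSchema#integer",
            "type": "literal",
            "value": "23",
        },
    },
}

print(term_to_string(sample))
```

Because the triple case simply recurses, nested RDF* triples need no extra handling on the client side.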

The second thing we did was introduce a separate MIME-type and file extension for the extended format: application/x-sparqlstar-results+json, with file extension .srjs. We had two reasons for doing so:

 1. having a separate content type for the RDF*-enabled format allows clients to explicitly ask for it if they can process it;
 2. conversely, if a client does a SPARQL query on an RDF*-enabled endpoint but has no client-side capability to deal with RDF* data, we want to send them a standard-compliant JSON response that they can still process (somehow compressing the triple into a compatible representation, e.g. a hashed identifier), not a custom extension that will likely crash their parser. 
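
The two points above amount to a content negotiation rule on the server side. As a rough Python sketch (not RDF4J's actual negotiation logic; the function and constant names are hypothetical, and q-value ordering and wildcards are deliberately ignored):

```python
STANDARD = "application/sparql-results+json"
EXTENDED = "application/x-sparqlstar-results+json"


def accepted_types(accept_header):
    """Return the media types listed in an Accept header with q > 0."""
    accepted = set()
    for item in accept_header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if media_type and q > 0:
            accepted.add(media_type)
    return accepted


def choose_results_format(accept_header):
    """Use the extended format only when the client asks for it by name."""
    if EXTENDED in accepted_types(accept_header):
        return EXTENDED
    # Fall back to the standard format; the server then has to encode any
    # triple-valued binding in a standard-compliant way.
    return STANDARD


print(choose_results_format(EXTENDED + ", " + STANDARD + ";q=0.5"))
print(choose_results_format(STANDARD))
```

A client that only sends the standard MIME-type (or a wildcard) never receives the extension, which is the behaviour point 2 is after.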

I am aware of some similar approaches in other tools. For example, Stardog has made very similar extensions in its edge property support. In fact, their JSON format extension is almost identical to RDF4J's approach, except for two things:

 1. they named the new binding type statement instead of triple;
 2. they are reusing the existing SPARQL Query Results JSON MIME-type (application/sparql-results+json). 
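
Until there is consensus, a client could tolerate both spellings of the binding type. A small Python sketch of that idea follows; it assumes the two formats differ only in the type name as described above (the s/p/o key names for the "statement" variant are taken on that assumption, not checked against Stardog's documentation), and normalize_term is a made-up helper name.

```python
EMBEDDED_TRIPLE_TYPES = {"triple", "statement"}


def normalize_term(term):
    """Return a copy of a binding term in which embedded triples always
    use the type name "triple", whichever spelling the server sent."""
    if term.get("type") in EMBEDDED_TRIPLE_TYPES:
        inner = term["value"]
        return {
            "type": "triple",
            # Normalize recursively so nested embedded triples are covered.
            "value": {key: normalize_term(inner[key]) for key in ("s", "p", "o")},
        }
    return dict(term)


stardog_style = {
    "type": "statement",
    "value": {
        "s": {"type": "uri", "value": "http://example.org/bob"},
        "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/age"},
        "o": {"type": "literal", "value": "23"},
    },
}

print(normalize_term(stardog_style)["type"])
```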

I can see advantages to this approach as well: it will mean less custom content type juggling, for one thing, and fewer edge cases that serializer implementations will have to deal with. 

I'd like to hear more examples of other approaches (I imagine other tools have made similar custom extensions of existing formats), compare approaches, and perhaps come up with some sort of draft that we can put up as a best practice for tool implementors.

Kind regards,

Jeen

Received on Tuesday, 4 August 2020 00:24:33 UTC