Re: SPARQL Query Results XML Format

On Thu, 28 Apr 2005 19:23:14 +0200, Jeen Broekstra <jeen@aduna.biz> wrote:

> Dave Beckett wrote:
> > I've updated the XML results format editor's draft
> > 
> >   $Revision: 1.24 $ of $Date: 2005/04/25 16:11:26 $
> >   http://www.w3.org/2001/sw/DataAccess/rf1/
> > 
> > to reflect issues brought up and after earlier discussion.
> 
> Nice job, I think it's a big improvement. I still have a few remarks 
> though :)
> 
> > The changes are primarily as follows:
> > 
> > 1. Switched to a form where the variable name is not used as an
> >    element name. It is now of the form <binding name="var">
> >
> > 2. Added a boolean result form for ASK.
> > 
> > 3. Added sub-elements of binding for the RDF Term types:
> >    <bnode>, <uri>, <literal>
> 
> Regarding points 1 and 3: I wonder why the <binding> element is 
> necessary. An alternative would be to eliminate it and have the 
> <bnode>, <uri> and <literal> elements directly as subelements of 
> <result>, with the var attribute, like so:
> 
>      <result>
>        <bnode var="x">r2</bnode>
>        <uri var="hpage">http://work.example.org/bob/</uri>
>        <literal var="name" xml:lang="en">Bob</literal>
>        <uri var="mbox">mailto:bob@work.example.org</uri>
>        <literal var="age" xsi:type="xs:integer"
>                 datatype="http://www.w3.org/2001/XMLSchema#integer">30</literal>
>        <unbound var="blurb"/>
>        <bnode var="friend">r1</bnode>
>      </result>
> 
> I guess it comes down to taste as much as anything, but this would be 
> more compact and arguably just as easy to parse/validate.

Yes, it's partly a matter of style.  Some reasons for <binding> that
I can think of are:

  * It matches more directly the terms that the query spec uses:
    result set, result and variable bindings.

  * Without <binding>, it is only implicit in <uri var="hpage"> that
    the value of hpage is a URI.  In XML, the value of an element is
    usually the element content, which is what
      <binding name="hpage"><uri>...</uri></binding>
    says to the casual reader.  There's no need to make that harder.

  * The XML element paths to the bindings are all in terms of fixed
    element names: /sparql/results/result/binding
    which fits the style.  Otherwise the last step of the path would
    be uri|literal|bnode.

> As for point 2, the boolean results format: my colleague Arjohn came 
> up with the following alternatives:
> 
> Alternative 1:
> 
> <?xml version="1.0"?>
> <sparql-ask xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result2">
>    <true/>
> </sparql-ask>
> 
> Alternative 2:
> 
> <?xml version="1.0"?>
> <sparql-ask xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result2">
>    <literal xsi:type="xs:boolean"
>       datatype="http://www.w3.org/2001/XMLSchema#boolean">true</literal>
> </sparql-ask>
> 
> Using a different root element makes it immediately clear that this is 
> a fundamentally different kind of result - after all, it is not a 
> variable binding result so it seems a bit awkward to try and fit it in 
> the same Schema. A separate XML Schema for boolean results could be 
> trivially simple.

But it would double the maintenance: multiple code paths,
implementing classes and XML processing paths, and in specification
terms, document management, MIME types, identifying URIs and so on.

Also, if we wanted to record warnings and errors above the protocol
level, which is what <head> is somewhat reserved for, both formats
would need to track such changes.

We might at some point decide to support all SPARQL result forms by
also returning RDF graphs in the XML, dropping into rdf:RDF.  This is
just an idea, and one I don't see the need for at present: RDF/XML is
well defined and separate, so we just need to cover the other cases.

I'd prefer splitting the formats only if absolutely necessary.
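
As a rough sketch of the single-format argument, one Python function can
serve both result forms from one root element.  The <boolean> element
name is a hypothetical placeholder, not taken from the draft; the
namespace is the one from the quoted examples, and parse_sparql_xml is
an illustrative name.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/2001/sw/DataAccess/rf1/result2"  # assumed namespace


def parse_sparql_xml(xml_text):
    """Return a bool for an ASK result, or a list of row dicts otherwise."""
    root = ET.fromstring(xml_text)
    # Hypothetical: the ASK answer is a <boolean> child of the shared root.
    boolean = root.find(f"{{{NS}}}boolean")
    if boolean is not None:
        return (boolean.text or "").strip() == "true"
    # Otherwise fall through to the ordinary variable-binding results.
    rows = []
    for result in root.iterfind(f"{{{NS}}}results/{{{NS}}}result"):
        rows.append({b.get("name"): b[0].text
                     for b in result.iterfind(f"{{{NS}}}binding")})
    return rows


ask_doc = f'<sparql xmlns="{NS}"><boolean>true</boolean></sparql>'
select_doc = (f'<sparql xmlns="{NS}"><results><result>'
              '<binding name="name"><literal>Bob</literal></binding>'
              '</result></results></sparql>')

ask_answer = parse_sparql_xml(ask_doc)
select_rows = parse_sparql_xml(select_doc)
```

With a second root element and schema, the entry point, namespace
handling and error reporting would each need a parallel copy.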

> Regarding the use of an index attribute for denoting order: I was a 
> bit surprised to see that since I thought the idea was to just add a 
> 'ordered="true"' attribute to the header somewhere and leave ordering 
> up to XML element order. This automatically makes it clear that the 
> ordered result can be processed in a streaming fashion as well.

Wasn't the point that it could be a partial order (1 1 2 3 4 4 5)?  I
don't think we have made a final decision yet on what sorting gives.
If it's a partial ordering, but a consistent ordering, then I agree it
just needs a boolean.  It could either go in the header, or maybe it
would make more sense as <results ordered="true">?
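
To make the partial-order example concrete, here is a small Python
sketch of one way such index values could arise: rows that tie on the
sort key share an index.  This is an assumed reading of the index
attribute, not something the draft defines; index_values and the sample
keys are illustrative only.

```python
def index_values(sort_keys):
    """Assign 1-based index values to already-sorted keys; ties share an index."""
    indexes = []
    current = 0
    previous = object()  # sentinel that compares unequal to any real key
    for key in sort_keys:
        if key != previous:
            current += 1
            previous = key
        indexes.append(current)
    return indexes


# Made-up sort keys (e.g. ages) with two ties.
ages = [20, 20, 25, 30, 41, 41, 57]
indexes = index_values(ages)  # [1, 1, 2, 3, 4, 4, 5]
```

A plain ordered="true" flag relies on document order alone, so it
carries no information about which adjacent rows tied.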

Dave

Received on Tuesday, 3 May 2005 10:25:14 UTC