Comments on SPARQL Query Results XML Format

Dear all,

Here's another round of feedback, this time on the query results format.
The feedback is based on version 1.20 (2005/02/28) of the editor's
draft.

* The current specification only lends itself to encoding bindings of
   values to variables. This is due to the fact that variable names are
   used as tag names. If, at some point in time (but hopefully soon:-)),
   the SPARQL language is changed to allow functions and/or operators in
   the projection, the format as it is defined now is no longer usable.
   To be a little more specific: if SPARQL were to allow functions and/or
   (aggregation) operators in the SELECT-clause, then the query result
   can no longer be seen as a variable binding. Some (pseudo-) examples
   to clarify this:

   Q1: SELECT COUNT(X), Y WHERE ... GROUP BY ...

   Q2: SELECT concat(X,Y), sum(A,B) WHERE ...

   IMHO, it would be wise to create a format that is more flexible with
   respect to such future extensions.

* I think the choice to use variable names for tags is a poor one. Not
   only because of the things that were said above, but also because it
   will probably make validation harder. At least, the current format
   does not allow one to define a (fixed) DTD for it.

* The use of a non-fixed set of tags doesn't make it easier to parse
   back a query result the SAX-way. With a fixed set of tags, the SAX
   listener could have responded to method calls for specific tags. As it
   is now, a SAX-based parser will have to keep track of whether the
   reported element is a child of a <result> element, or it has to check
   if any of the element's attributes indicates that it is a value.

   Note that this procedure also has to be applied to the set of tags
   that are defined for the format, as this set is not disjoint from the
   set of variable names. A hopefully clarifying example:

   Query:
     SELECT ?sparql ?results ?result ?boolean-result
     WHERE { ... }

   Example output:
     <?xml version="1.0"?>
     <sparql xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result">
       <head>
         <variable name="sparql"/>
         <variable name="results"/>
         <variable name="result"/>
         <variable name="boolean-result"/>
       </head>
       <results>
         <result>
           <sparql bnodeid="r2"/>
           <results uri="http://work.example.org/bob/"/>
           <result xml:lang="en">Bob</result>
           <boolean-result uri="mailto:bob@work"/>
         </result>
         ...
       </results>
     </sparql>

   Not a pretty sight, if you ask me...

* All unqualified tags are part of the namespace
     "http://www.w3.org/2001/sw/DataAccess/rf1/result",
   effectively making this a namespace with a dynamic set of elements.

* Conceptually, the boolean result doesn't have anything in common with
   the variable binding result, so why use the same format? IMHO, it
   doesn't make much sense; this is the most complex format for
   communicating a simple yes/no that I've ever seen.

   Also, because the same document structure is used, one has to analyze
   the complete document before one can decide whether it represents a
   variable-binding result or a boolean result.

   I would suggest using a dedicated format for the boolean result.
   Something like the following would be a good alternative:

   <?xml version="1.0"?>
   <sparql-ask xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result">
     <true/>
   </sparql-ask>
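
To illustrate why a dedicated root element helps, here is a minimal
Python sketch of a client that decides what kind of result it has by
looking at the root element only. It assumes the <sparql-ask> format
proposed above; the <false/> element and the function name
read_ask_result are my own assumptions and are not part of any draft.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the examples in this message.
NS = "http://www.w3.org/2001/sw/DataAccess/rf1/result"


def read_ask_result(xml_text):
    """Return True/False for a <sparql-ask> document, None for any other root."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS}}}sparql-ask":
        # Not a boolean result, e.g. a variable-binding document.
        return None
    if root.find(f"{{{NS}}}true") is not None:
        return True
    # <false/> is an assumed counterpart of <true/>; the proposal only shows <true/>.
    if root.find(f"{{{NS}}}false") is not None:
        return False
    raise ValueError("sparql-ask document without <true/> or <false/>")


ASK_DOC = """<?xml version="1.0"?>
<sparql-ask xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result">
  <true/>
</sparql-ask>"""

print(read_ask_result(ASK_DOC))
```

A streaming parser could make the same decision at the very first
start-element event, without reading the rest of the document.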


All in all, based on the above comments, I would like to ask the DAWG to
_please_ consider dropping the idea of using variable names for tags and
instead adopt a format with a fixed DTD. For example something like:

<?xml version="1.0"?>
<sparql-select xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result">
   ...  head ...
   <results>
     <result>
       <bnode var="x" id="r2"/>
       <uri var="hpage">http://work.example.org/bob/</uri>
       <literal var="name" xml:lang="en">Bob</literal>
       <literal var="mbox">bob@work</literal>
     </result>
     ...
   </results>
</sparql-select>
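
As a rough sketch of how a fixed tag set simplifies SAX-style parsing,
the following Python handler reads the format suggested above. It only
needs to react to the known element names <result>, <bnode>, <uri> and
<literal>; variable names arrive as attribute values and never collide
with format tags. The class name FixedTagHandler and the (kind, value)
tuples are illustrative assumptions, and the sample document omits the
elided head section.

```python
import xml.sax

# The fixed set of value elements from the suggested format.
VALUE_TAGS = {"bnode", "uri", "literal"}


class FixedTagHandler(xml.sax.ContentHandler):
    """Collects each <result> as a dict: variable name -> (kind, value)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # bindings of the <result> currently being read
        self._var = None   # variable name of the value element being read
        self._kind = None  # "bnode", "uri" or "literal"
        self._text = []

    def startElement(self, name, attrs):
        if name == "result":
            self._row = {}
        elif name in VALUE_TAGS and self._row is not None:
            self._var = attrs.get("var")
            self._kind = name
            # A bnode carries its label in the id attribute, not in text.
            self._text = [attrs.get("id", "")] if name == "bnode" else []

    def characters(self, content):
        if self._var is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "result":
            self.rows.append(self._row)
            self._row = None
        elif name in VALUE_TAGS and self._var is not None:
            self._row[self._var] = (self._kind, "".join(self._text))
            self._var = None


SELECT_DOC = """<?xml version="1.0"?>
<sparql-select xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result">
  <results>
    <result>
      <bnode var="x" id="r2"/>
      <uri var="hpage">http://work.example.org/bob/</uri>
      <literal var="name" xml:lang="en">Bob</literal>
      <literal var="mbox">bob@work</literal>
    </result>
  </results>
</sparql-select>"""

handler = FixedTagHandler()
xml.sax.parseString(SELECT_DOC.encode("utf-8"), handler)
print(handler.rows)
```

Note that the handler never has to track whether an element name is a
variable or a format tag, which is the bookkeeping the current draft
forces on SAX-based parsers.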


Regards,

Arjohn Kampman

-- 
arjohn.kampman@aduna.biz
Aduna BV - http://aduna.biz/
Prinses Julianaplein 14-b, 3817 CS Amersfoort, The Netherlands
tel. +31-(0)33-4659987  fax. +31-(0)33-4659987

Received on Wednesday, 23 March 2005 10:07:57 UTC