SPARQL 1.1 Protocol: Format of fault messages from Richard Cyganiak on 2010-09-29 (public-rdf-dawg-comments@w3.org from September 2010)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 29 Sep 2010 14:42:09 +0100
To: public-rdf-dawg-comments@w3.org
Message-Id: <24435C5C-D1E1-438A-967E-60FD75A0B9EF@cyganiak.de>

This is a comment on the latest SPARQL 1.1 Protocol draft [1]. It  
concerns an issue that was already present in the SPARQL 1.0 version  
of the Protocol.

I write this as the maintainer of two popular open source SPARQL  
protocol clients: Snorql (in use at DBpedia [2], DBLP-in-RDF [3],  
data.semanticweb.org [4], and elsewhere) and Pubby [5] (in use at  
GeoLinkedData [6], for the TCM dataset [7] and elsewhere). I would  
like to highlight a limitation of the SPARQL protocol that causes  
quite a bit of pain for users of my tools and increased support costs.  
I seek the help of the WG in addressing these issues.

Background: The SPARQL ecosystem has matured significantly in the last  
few years, and the work of this WG will bring SPARQL another major  
step forward. In the early days, a lot of SPARQL requests were sent  
simply by a human entering a SPARQL query into a query form, and the  
results were inspected manually. Software clients were built ad-hoc  
and for a specific endpoint. Now we increasingly have software clients  
that are generic and should work with any conformant SPARQL service.  
This generally works, except in one area: error messages.

Services often have to reject SPARQL queries. That is a fact of life.  
It can be because of query authoring errors, because of resource  
limitations on the service side, because of dialects and unsupported  
extensions, and for many other reasons.

In such cases, most services deliver more or less helpful error  
messages as part of the response. Thus, if a user or software  
developer interacts directly with the endpoint, then they typically  
see the error messages and can use them to resolve the issue.

But users and developers increasingly interact with SPARQL services  
indirectly, through generic software libraries or generic SPARQL query  
clients that have been built without a specific vendor's service  
implementation in mind. These generic clients should pass error  
messages from the server onwards to the user. But doing that in a  
reliable way is not possible with the current state of the SPARQL  
protocol.

The consequence is that my tools often can only tell the user that  
“The SPARQL service reported an error”. That makes it very hard for  
users to resolve the issue, or for me, as the developer, to help them  
when I get support requests.

According to the spec, there appear to be at least three different  
formats for indicating an error:

1. XML. I don't know anything about WSDL, but my reading of 2.1.1.3 is  
that I could indicate an error like so (but can I really, and should I  
use some XML namespace, and what would the media type be?):

    <malformed-query>
      <fault-details>Parse error at [5:17], unexpected ' '</fault-details>
    </malformed-query>

2. Plain text. According to the example in 2.2.1.3.9, I could just  
send plain text:

    Parse error at [5:17], unexpected ' '

3. HTML. According to the example in 2.2.1.3.10, I could just send an  
HTML page containing the message:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head><title>SPARQL Processing Service: Query Request Refused</title></head>
    <body>
    <h1>D'oh!</h1>
    <p>Parse error at [5:17], unexpected ' '</p>
    </body>
    </html>

There is no guidance for service implementers to choose one of these  
options.

There is no guidance for client implementers on how to express a  
preference.

There is no guidance for client implementers on how to recognize the  
format of a response.
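
To make the problem concrete, here is a minimal Python sketch of the guesswork a generic client is left with today. The function name `guess_fault_message` and the helper class are hypothetical, and the mapping from media types to the three formats above is an assumption, since the spec does not name a media type for the XML form or say which element carries the message:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _ParagraphText(HTMLParser):
    """Collects the text found inside <p> elements of an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and data.strip():
            self.chunks.append(data.strip())


def guess_fault_message(content_type, body):
    """Best-effort extraction of a fault message; returns None when unsure."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/plain":
        return body.strip() or None
    if media_type == "text/html":
        # Assumption: the message sits in a <p>, as in the spec's example.
        parser = _ParagraphText()
        parser.feed(body)
        return " ".join(parser.chunks) or None
    if media_type in ("application/xml", "text/xml"):
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            return None
        for element in root.iter():
            # Compare local names only, since the namespace (if any) is unknown.
            if element.tag.rsplit("}", 1)[-1] == "fault-details":
                return (element.text or "").strip() or None
    return None
```

Every branch rests on a convention that the spec does not state, so a service that puts its message in a `<div>`, or that uses another media type, silently falls through to `None`.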

In summary, the specification makes no effort towards establishing an  
interoperable means of delivering error messages to the client. In  
fact, in Section 2.2.1 “HTTP Bindings for SPARQL Query”, I find:

> The fault serialization of queryHttpGet and queryHttpPost is also  
> intentionally under constrained.

This intentional failure should be re-thought.

My proposal would be:

1. To state in the HTTP binding that services SHOULD use the XML fault  
message format when reporting faults.

2. To provide a human-readable account of the XML format for fault  
messages (It is currently only specified as snippets of XML Schema and  
WSDL. What's the XML namespace, if any? What's the root element?  
What's the media type?). This could be done in the Protocol spec, or  
possibly in the SPARQL Query Results XML Format spec.

3. To provide a full example of the XML fault message format among the  
HTTP binding examples.

4. To remove the HTML example in 2.2.1.3.10. An error message embedded  
in an HTML page is not machine-readable, therefore the use of HTML as  
a way of reporting errors in a machine-oriented protocol is likely to  
be bad practice in most cases and should not be encouraged in the  
specification.
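
As an illustration of what a generic client could look like if the XML fault format were the recommended one, here is a minimal Python sketch. It is not taken from the spec: the function names are hypothetical, `application/xml` is only an assumed media type for faults (the open question raised above), and the root element name is simply reported as the fault kind:

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the spec names no media type for XML fault messages.
ASSUMED_FAULT_MEDIA_TYPE = "application/xml"


def build_query_request(endpoint, query):
    """Builds a GET request that asks for SPARQL XML results or an XML fault."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    accept = "application/sparql-results+xml, " + ASSUMED_FAULT_MEDIA_TYPE + ";q=0.9"
    return urllib.request.Request(url, headers={"Accept": accept})


def describe_fault(status, body):
    """Turns an XML fault body into a message that can be shown to the user."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not XML: fall back to the unhelpful generic message.
        return "The SPARQL service reported an error (HTTP %d)" % status
    kind = root.tag.rsplit("}", 1)[-1]  # local name, ignoring any namespace
    details = "".join(root.itertext()).strip()
    return "%s (HTTP %d): %s" % (kind, status, details)
```

With a single machine-readable format, the fallback branch becomes the exception rather than the rule, and the client can pass the service's own message on to the user.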

Best,
Richard


[1] http://www.w3.org/TR/2010/WD-sparql11-protocol-20100126/
[2] http://dbpedia.org/snorql/
[3] http://dblp.l3s.de/d2r/snorql/
[4] http://data.semanticweb.org/snorql/
[5] http://www4.wiwiss.fu-berlin.de/pubby/
[6] http://geo.linkeddata.es/
[7] http://www.open-biomed.org.uk/rdf-tcm/
Received on Wednesday, 29 September 2010 13:42:47 UTC