Re: SPARQL 1.1 Protocol: Format of fault messages

* Richard Cyganiak <richard@cyganiak.de> [2010-09-29 14:42+0100]
> This is a comment on the latest SPARQL 1.1 Protocol draft [1]. It
> concerns an issue that was already present in the SPARQL 1.0 version
> of the Protocol.
> 
> I write this as the maintainer of two popular open source SPARQL
> protocol clients: Snorql (in use at DBpedia [2], DBLP-in-RDF [3],
> data.semanticweb.org [4], and elsewhere) and Pubby [5] (in use at
> GeoLinkedData [6], for the TCM dataset [7] and elsewhere). I would
> like to highlight a limitation of the SPARQL protocol that causes
> quite a bit of pain for users of my tools and increased support
> costs. I seek the help of the WG in addressing these issues.
> 
> Background: The SPARQL ecosystem has matured significantly in the
> last few years, and the work of this WG will bring SPARQL another
> major step forward. In the early days, a lot of SPARQL requests were
> sent simply by a human entering a SPARQL query into a query form,
> and the results were inspected manually. Software clients were built
> ad-hoc and for a specific endpoint. Now we increasingly have
> software clients that are generic and should work with any
> conformant SPARQL service. This generally works, except in one area:
> error messages.
> 
> Services often have to reject SPARQL queries. That is a fact of
> life. It can be because of query authoring errors, because of
> resource limitations on the service side, because of dialects and
> unsupported extensions, and for many other reasons.
> 
> In such cases, most services deliver more or less helpful error
> messages as part of the response. Thus, if a user or software
> developer interacts directly with the endpoint, then they typically
> see the error messages and can use them to resolve the issue.
> 
> But users and developers increasingly interact with SPARQL services
> indirectly, through generic software libraries or generic SPARQL
> query clients that have been built without a specific vendor's
> service implementation in mind. These generic clients should pass
> error messages from the server onwards to the user. But doing that
> in a reliable way is not possible with the current state of the
> SPARQL protocol.
> 
> The consequence is that my tools often can only tell the user that
> “The SPARQL service reported an error”. That makes it very hard for
> users to resolve the issue, or for me the developer to help them
> when I get support requests.
> 
> According to the spec, there appear to be at least three different
> formats for indicating an error:
> 
> 1. XML. I don't know anything about WSDL, but my reading of 2.1.1.3
> is that I could indicate an error like so (but can I really, and
> should I use some XML namespace, and what would the media type be?):

I think that:
  <wsdl:portType name="SparqlQueryInterface">
    <wsdl:operation name="query">
      ...
      <wsdl:fault  message="tns:malformedQueryFault"  name="malformedQueryFault" />
      <wsdl:fault  message="tns:queryRequestRefusedFault" name="queryRequestRefusedFault" />
    </wsdl:operation>
  </wsdl:portType>

  <wsdl:binding name="QuerySoapBinding" type="tns:SparqlQueryInterface">
    <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="query">
      <soap:operation style="document" />
      ...
      <wsdl:fault name="malformedQueryFault">
        <soap:fault use="literal"/>
      </wsdl:fault>
      <wsdl:fault name="queryRequestRefusedFault">
        <soap:fault use="literal"/>
      </wsdl:fault>
    </wsdl:operation>
  </wsdl:binding>

says that you must use:

>    <malformed-query>
>      <fault-details>Parse error at [5:17], unexpected ' '</fault-details>
>    </malformed-query>

for failures encountered when invoking the SOAP interface. For GET and
POST, I think that:

  <!-- the HTTP GET binding for query operation -->
  <binding name="queryHttpGet" interface="tns:SparqlQuery"
     type="http://www.w3.org/2006/01/wsdl/http"
     whttp:version="1.1">

    <fault ref="tns:MalformedQuery" whttp:code="400"/>
    <fault ref="tns:QueryRequestRefused" whttp:code="500"/>

    <operation ref="tns:query"
        wsdlx:safe="true"
    whttp:method="GET"
    whttp:faultSerialization="*/*"
    whttp:inputSerialization="application/x-www-form-urlencoded"
    whttp:outputSerialization="application/sparql-results+xml, application/rdf+xml, */*" />
  </binding>

says you can do whatever the hell you want, so long as you send the
right HTTP status code, as you describe below.
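
A rough sketch of what that leaves a generic client able to do today,
assuming only the two status codes in the binding above. The helper
name classify_fault and the "UnknownFault" fallback are made up for
illustration; nothing in the draft defines them.

```python
# Fault names and status codes are taken from the WSDL fragment above.
# classify_fault and "UnknownFault" are hypothetical, for illustration only.
FAULTS_BY_STATUS = {
    400: "MalformedQuery",
    500: "QueryRequestRefused",
}


def classify_fault(status, body=""):
    """Return (fault_name, message) for an error response, or None on success."""
    if 200 <= status < 300:
        return None
    name = FAULTS_BY_STATUS.get(status, "UnknownFault")
    # The body is unconstrained (faultSerialization="*/*"), so the client
    # can only pass it along as opaque text.
    return name, body.strip()
```

The status code identifies the kind of fault, but the message itself
stays an uninterpreted blob, which is the gap you are pointing at.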

> 2. Plain text. According to the example in 2.2.1.3.9, I could just
> send plain text:
> 
>    Parse error at [5:17], unexpected ' '
> 
> 3. HTML. According to the example in 2.2.1.3.10, I could just send
> an HTML page containing the message:
> 
>    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
>    <html>
>    <head><title>SPARQL Processing Service: Query Request Refused</title></head>
>    <body>
>    <h1>D'oh!</h1>
>    <p>Parse error at [5:17], unexpected ' '</p>
>    </body>
>    </html>
> 
> There is no guidance for service implementers to choose one of these
> options.
> 
> There is no guidance for client implementers on how to express a
> preference.
> 
> There is no guidance for client implementers on how to recognize the
> format of a response.
> 
> In summary, the specification makes no effort towards establishing
> an interoperable means of delivering error messages to the client.
> In fact, in Section 2.2.1 “HTTP Bindings for SPARQL Query”, I find:
> 
> >The fault serialization of queryHttpGet and queryHttpPost is also
> >intentionally under constrained.
> 
> This intentional failure should be re-thought.
> 
> My proposal would be:
> 
> 1. To state in the HTTP binding that clients SHOULD use the XML
> fault message format when reporting faults.

It may be polite to still have two formats, one of them for SOAP
interfaces, which have some standard tooling and contracts, etc. I
expect your principal issue is with the free-form beat poetry style of
the REST bindings.

Let's say we told the world to use XML (we know they have an XML
parser standing by to parse the results of successful queries). We'd
kind of like to leave room for implementors to innovate and reply with
our baseline error response plus some structured stuff to say what
line and character caused what sort of error. We'd also like that to
show up in browsers. What if we say that it must appear in a <pre>
element with a particular class?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>D'oh!</title>
  </head>
  <body>
    <pre class="malformedQueryFault" style="display:none">Parse error at [5:17], unexpected ' '</pre>
  </body>
</html>

Folks could extend that with (X)HTML to make it prettier:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>D'oh!</title>
  </head>
  <body>
    <pre class="malformedQueryFault" style="display:none">Parse error at [5:17], unexpected ' '</pre>
    <p><span class="line">5</span>:<span class="column">17</span>: unexpected ' '</p>
  </body>
</html>

or microformats, or RDFa:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
   xmlns:sp="http://www.w3.org/2010/10/sparql-protocol#" 
>
  <head profile="http://www.w3.org/ns/rdfa/ http://www.w3.org/2005/11/profile">
    <title>D'oh!</title>
  </head>
  <body>
    <h1>D'oh!</h1>
    <pre class="malformedQueryFault" style="display:none">Parse error at [5:17], unexpected ' '</pre>
    <p><span class="line" property="sp:line">5</span>:<span class="column" property="sp:column">17</span>: unexpected ' '</p>
  </body>
</html>

and still be conformant.
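
For what it's worth, here is a minimal sketch of how a generic client
might pull the baseline message out of any of the three documents
above. It assumes the response is well-formed XHTML and that the fault
name appears as a class on the pre element. Using
"queryRequestRefusedFault" as a second class name is my assumption by
analogy with the WSDL fault names; extract_fault and SAMPLE are
hypothetical names.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# The second class name is an assumption, mirroring the WSDL fault names.
FAULT_CLASSES = {"malformedQueryFault", "queryRequestRefusedFault"}


def extract_fault(xhtml_text):
    """Return (fault_class, message) from an XHTML error page, or None."""
    root = ET.fromstring(xhtml_text)
    for pre in root.iter("{%s}pre" % XHTML_NS):
        # class is a space-separated list, so extra classes are tolerated.
        matched = set(pre.get("class", "").split()) & FAULT_CLASSES
        if matched:
            return sorted(matched)[0], "".join(pre.itertext())
    return None


# Trimmed version of the second example above (DOCTYPE omitted for brevity).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>D'oh!</title></head>
  <body>
    <pre class="malformedQueryFault" style="display:none">Parse error at [5:17], unexpected ' '</pre>
    <p><span class="line">5</span>:<span class="column">17</span>: unexpected ' '</p>
  </body>
</html>"""

if __name__ == "__main__":
    print(extract_fault(SAMPLE))
```

The client ignores everything except the classed pre, so any extra
(X)HTML, microformats or RDFa decoration does not affect it.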

> 2. To provide a human-readable account of the XML format for fault
> messages (It is currently only specified as snippets of XML Schema
> and WSDL. What's the XML namespace, if any? What's the root element?
> What's the media type?). This could be done in the Protocol spec, or
> possibly in the SPARQL Query Results XML Format spec.
> 
> 3. To provide a full example of the XML fault message format among
> the HTTP binding examples.
> 
> 4. To remove the HTML example in 2.2.1.3.10. An error message
> embedded in an HTML page is not machine-readable, therefore the use
> of HTML as a way of reporting errors in a machine-oriented protocol
> is likely to be bad practice in most cases and should not be
> encouraged in the specification.
> 
> Best,
> Richard
> 
> 
> [1] http://www.w3.org/TR/2010/WD-sparql11-protocol-20100126/
> [2] http://dbpedia.org/snorql/
> [3] http://dblp.l3s.de/d2r/snorql/
> [4] http://data.semanticweb.org/snorql/
> [5] http://www4.wiwiss.fu-berlin.de/pubby/
> [6] http://geo.linkeddata.es/
> [7] http://www.open-biomed.org.uk/rdf-tcm/

-- 
-ericP

Received on Friday, 1 October 2010 20:55:33 UTC