Proposal for an HTTP ERR method from Henry Story on 2004-06-23 (ietf-http-wg@w3.org from April to June 2004)

From: Henry Story <henry.story@bblfish.net>
Date: Wed, 23 Jun 2004 14:05:02 +0200
To: ietf-http-wg@w3.org
Message-Id: <8ECB0AFE-C50D-11D8-8703-000A95D9FA7A@bblfish.net>
SUMMARY:
  There is currently an asymmetry in error reporting between the client
and the server. The server can return an error status to the client in
response to its request, but the client cannot tell the server that it
has returned an invalid response. This proposal addresses the asymmetry
with a RESTful, easily implementable, and backward-compatible solution:
a new ERR (ERROR) HTTP method to complement GET, POST, PUT, ...

BACKGROUND:
   The W3C is requiring strict adherence to many new standards. XML, for
example, has to be well-formed, and should be rejected if it is not. The
well-formedness of an XML response depends on the XML payload as well as
the HTTP headers (such as MIME types) that accompany the response. If
these are broken, as can happen all too easily when a web server is
improperly configured, the client has no simple and automatic way of
notifying the resource that it is broken. For B2B applications this is
not much of an issue, as many resources and channels are available
between the consumer of a resource and its producer, and B2B has until
now been the main consumer of XML. In the consumer world the dynamics
are very different, and will lead to a widening gap between
specification and implementation. This is why the issue has appeared
on the Atom mailing list[2]. But I believe the proposed solution to
that problem can be generalised in such a way as to help the forces of
standardisation across the whole web.

PROPOSAL:
Note this is a fledgling proposal, and will clearly need some growing  
up.

When a client receives a malformed server response it MAY (SHOULD?)
notify the resource that it is broken by sending an ERR request,
identical in all ways except for the ERR method to the original
request, plus several extra ERR-specific headers:

    - Error-Message: a human-readable standard error message
    - Error-Code: one of a set of error codes, to be defined, that
      categorise the type of error
    - Error-Spec: a pointer to the RFC document sections that explain
      the error
    - Error-Date: the date the request was initially sent
    - Error-Method: the method (GET, POST, ...) of the original request
    - Error-ContentLength: the length of the human-readable error text
      that could be the body of this message
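
As a rough illustration of the rule above, here is a minimal Python sketch that builds the text of an ERR request from the details of the original request. The function name build_err_request, the use of HTTP/1.1, the HTTP-date format for Error-Date, and the example host are all assumptions made for this sketch; the proposal itself does not fix any of them.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_err_request(method, path, headers, message, code, spec, sent_at, body=""):
    """Return the raw text of an ERR request mirroring the original request.

    Hypothetical helper: the header names come from the proposal, but the
    value formats (e.g. an HTTP-date for Error-Date) are assumptions.
    """
    body_bytes = body.encode("utf-8")
    err_headers = dict(headers)  # original request headers, copied unchanged
    err_headers.update({
        "Error-Message": message,
        "Error-Code": code,
        "Error-Spec": spec,
        "Error-Date": format_datetime(sent_at, usegmt=True),
        "Error-Method": method,  # the method of the original request
        "Error-ContentLength": str(len(body_bytes)),
    })
    lines = [f"ERR {path} HTTP/1.1"]  # only the method differs from the original
    lines += [f"{name}: {value}" for name, value in err_headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body


request = build_err_request(
    method="GET",
    path="/index.xml",
    headers={"Host": "example.org", "Accept": "*/*"},
    message="XML is of incorrect content type",
    code="XXXX",  # placeholder, error codes are still to be defined
    spec="RFCXYZ, sec 3",
    sent_at=datetime(2004, 6, 19, 18, 5, 30, tzinfo=timezone.utc),
    body="The declared charset does not match the content.",
)
print(request)
```

The original headers are copied first so that the ERR request stays identical to the original apart from the method and the added Error-* headers.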

ERR should probably be limited to certain specific types of errors,
including things like broken XML, an XML encoding incorrectly specified
in the header, or other errors relating to well-known RFCs or specs.
This is to be fleshed out...

EXAMPLE:

-------8<-------
GET /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
...

<?xml version="1.0" encoding="iso-8859-1" ?>
<pløtz/>
------->8-------

The response is broken, though clearly interpretable. Clients (in the
wider world of Consumer2C or B2C) will therefore attempt to accommodate
such responses, due to market pressure. Market pressures are close to
physical laws in their ferocity. We cannot change them. As a result
more and more such breakages will occur, and the standards will be left
in the dust of this vicious whirlwind.[1] In any case, fighting against
it is going to be very tiresome.

It is much easier to require clients to at least send an ERR request to
the resource if they are going to bypass the standards. If we imagine a
future where resources are intelligent enough to fix themselves, we can
see how this can help the web heal itself automatically.

Here is an example of the client's message:

-------8<-------
ERR /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
Error-Message: XML is of incorrect content type
Error-Code: XXXX
Error-Spec: RFCXYZ,sec 3; RFCXXX, sec54
Error-Date:  Saturday 19 June 2004, 18:05:30 GMT (whatever encoding)
Error-Method: GET
Error-ContentLength: 63

The Mime type of the content was text/xml. This requires the content to  
be in ASCII format, but we found some UTF-8 characters in the message.
We could interpret the message at present but will not necessarily be  
able to do so in the future. Please refer to RFCXYZ, sec 3 and RFCXXX,  
sec54 for more information. These can be found at http://ietf.org/
------->8-------
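
On the receiving side, a resource (or an intermediary) would need to pick the Error-* headers and the explanatory body out of such a message. The following Python sketch shows one way this might look; parse_err_request is a hypothetical helper, and the choice to collect every header whose name starts with "Error-" is an assumption, not part of the proposal.

```python
def parse_err_request(raw):
    """Extract the path, the Error-* headers and the body from a raw ERR request.

    Hypothetical helper for illustration only; a real server would rely on
    its own HTTP parser.
    """
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ", 2)
    if method != "ERR":
        raise ValueError("not an ERR request")
    report = {"path": path, "body": body}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Keep only the ERR-specific headers; the rest mirror the original request.
        if name.lower().startswith("error-"):
            report[name.strip()] = value.strip()
    return report


raw = (
    "ERR /index.xml HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Error-Message: XML is of incorrect content type\r\n"
    "Error-Method: GET\r\n"
    "\r\n"
    "Details go here."
)
report = parse_err_request(raw)
print(report["Error-Message"])
```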


ADVANTAGES:

1. RESTfulness

Proxies and other intermediaries can join in to make the Web a more  
standard place.

2. Backward compatible
This proposal could very well already work with the current web  
architecture, without any problem. I have tried it myself:

-------8<-------
hjs@bblfish:0$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
ERR /index.html HTTP/1.1
Host: bblfish.localhost
Message: invalid XML

HTTP/1.1 501 Method Not Implemented
Date: Sat, 19 Jun 2004 10:10:37 GMT
Server: Apache/1.3.29 (Darwin)
Vary: accept-language,accept-charset
Allow: GET, HEAD, OPTIONS, TRACE
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

14c
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>501 Method Not Implemented</TITLE>
</HEAD><BODY>
<H1>Method Not Implemented</H1>
ERROR to /index.html not supported.<P>
Invalid method in request ERROR /index.html HTTP/1.1<P>
<HR>
<ADDRESS>Apache/1.3.29 Server at bblfish.local Port 80</ADDRESS>
</BODY></HTML>
------->8-------

Clearly this is not the response we want in a web that has adopted this
proposal, but it already has the correct side effect: namely, it adds an
error message to my Apache error log:

-------8<-------
[Sat Jun 19 12:10:45 2004] [error] [client 127.0.0.1] Invalid method in  
request ERROR /index.html HTTP/1.1
------->8-------

Apart from that, of course, it correctly informs the client that the ERR
method is not implemented, so that sending further such requests is
pointless.
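
A client could act on that signal by remembering which hosts have rejected ERR and skipping them afterwards. Here is a minimal Python sketch of that bookkeeping; the class name ErrSupportCache is made up for this illustration, and treating 405 the same way as the 501 shown above is an assumption.

```python
class ErrSupportCache:
    """Remember which hosts have signalled that they do not handle ERR."""

    # 501 is what Apache returned above; 405 is an assumed equivalent signal.
    REJECTION_STATUSES = (501, 405)

    def __init__(self):
        self._unsupported = set()

    def record_response(self, host, status):
        """Note the status code a host returned for an ERR request."""
        if status in self.REJECTION_STATUSES:
            self._unsupported.add(host)

    def should_send_err(self, host):
        """Return False once a host has rejected the ERR method."""
        return host not in self._unsupported


cache = ErrSupportCache()
first_attempt = cache.should_send_err("bblfish.localhost")
cache.record_response("bblfish.localhost", 501)
second_attempt = cache.should_send_err("bblfish.localhost")
print(first_attempt, second_attempt)
```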

3. This proposal avoids the vicious circle that other workarounds  
require: namely a file somewhere that specifies where to send error  
reports, this file itself perhaps being malformed.

4. The resource to which ERR is sent is known to be alive, since it
just responded to the request. The ERR can furthermore be sent over
the same TCP connection.

REFERENCES

This came out of a discussion on the Atom mailing list.

[1] originally proposed here:
        http://www.imc.org/atom-syntax/mail-archive/msg05112.html
[2] a concise explanation for the need for the ERR method:
        http://www.imc.org/atom-syntax/mail-archive/msg05146.html
[3] a long discussion on #rdfig where I try to respond to all the
questions thrown at me by Danny Ayers:
        http://www.ilrt.bris.ac.uk/discovery/chatlogs/rdfig/2004-06-19.html#T14-48-59
[4] a page on the Atom wiki that may be kept up to date on this issue:
        http://www.intertwingly.net/wiki/pie/PaceErrVerb
Received on Wednesday, 23 June 2004 08:05:10 UTC