Re: Proposal for an HTTP ERR method

Henry Story wrote:
> >If something is smart enough to fix itself, it is most likely smart
> >enough to test itself, instead of relying on a client to do the
> >testing. It is also probably smart enough not to trust test results
> >from untrusted clients.
> 
> I am smart enough to test my code. I still find errors in it. Recently 
> my nice OSX machine crashed. Apple tests its code as far as I know.
> :-)

I have to agree with Henry that the _principle_ is a good one: server
errors which are detected by good clients should automatically be
logged at the server.

However, there's a danger that server logs will then fill up because of
well-meaning but broken clients which _misreport_ things which aren't,
in fact, bugs.  Care is needed.
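
One possible form of that care, as a rough sketch only: the server
could hold back a client-reported problem until several distinct
clients have reported the same thing.  The class name, the threshold
and the (path, complaint) key are all my own assumptions for
illustration; nothing like this is part of the ERR proposal.

```python
from collections import defaultdict


class ReportLog:
    """Hypothetical filter: log a complaint only once enough distinct
    clients have reported the same (path, complaint) pair."""

    def __init__(self, min_clients=3):
        self.min_clients = min_clients          # assumed threshold
        self.reporters = defaultdict(set)       # (path, complaint) -> client ids
        self.logged = []                        # complaints that reached the log

    def report(self, client_id, path, complaint):
        """Record one report; return True if it caused a new log entry."""
        key = (path, complaint)
        self.reporters[key].add(client_id)
        # Log exactly once, when the threshold is first reached.
        if len(self.reporters[key]) >= self.min_clients and key not in self.logged:
            self.logged.append(key)
            return True
        return False
```

A single broken client repeating itself never reaches the log this
way, though a widely deployed broken client still would.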

> If you are saying that your first request may have gone down one route, 
> which returned you the bad xml, whereas your ERR went down another 
> route which would have returned you good xml, then that is an 
> interesting point.

That situation _definitely_ exists.  Think of HTTP load balancers and
such.  POSTs for a URI may be handled by different servers than GETs
for the same URI.

I would think _most_ servers would route ERRs for a URI to the same
place that served that URI recently, _most_ of the time.
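
To make that concrete, here is a toy model of a balancer which routes
by method and falls back to "whoever served this URI last" for methods
it has no rule for.  The backend names, the routing table and the
function are hypothetical; real load balancers are configured in their
own ways and need not behave like this.

```python
# Hypothetical routing table: (method, path) -> backend name.
ROUTES = {
    ("GET", "/index.xml"): "backend-a",
    ("POST", "/index.xml"): "backend-b",
}
DEFAULT_BACKEND = "backend-a"   # assumed fallback


def pick_backend(method, path, last_served):
    """Choose a backend for a request.

    last_served maps path -> backend which most recently served it.
    """
    key = (method, path)
    if key in ROUTES:
        return ROUTES[key]
    # No explicit rule (e.g. an ERR request): prefer the backend which
    # served this path most recently, else the default.
    return last_served.get(path, DEFAULT_BACKEND)
```

If the POST backend was the last to serve /index.xml, an ERR prompted
by an earlier GET would land on the wrong machine, which is exactly
the misattribution case.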

That is enough to alert humans to a problem on their servers, which is
being detected by clients.

> How could one deal with that situation, if that situation indeed exists?

You let the server administrators figure it out.

If server A is receiving error notifications, but server B is the
cause, then it's up to the admins of server A to figure that out and
fix server B, or pass the information on to the owners of B.

> >I am afraid there is no good technical solution here. It is a cultural
> >problem :-(.

I agree.  However, there is presently a trend towards making both
clients and servers as tolerant as possible of bad input, while making
their output as good as necessary but not much better.  There's a
mechanism for alerting people about faulty client requests, and about
faults which are detected at the server: the server error log.  (The
same problem of misattribution applies here, as it could be an error
introduced by a faulty intermediary, but those kinds of logs are still
very useful).

The culture which arises is a natural result of who communicates error
information to whom.

Problems which are logged and visible are fixed.  Hence, clients
work around broken servers and servers work around broken clients.  OK,
that makes for loose coding in parsers and such.  However, clients
which send the wrong thing are more likely to get fixed, because the
result of gross client transmission errors is seen at the client,
soon, by people who will fix it or hassle the people responsible for
the client.

Servers which send the wrong thing are less likely to get fixed,
because the result of gross transmission errors is less likely to be
communicated to the servers.  The clients see it, but the effort of
reporting bugs is such that the people responsible for the server are
not so likely to be informed of it.

This means that important bugs like grossly malformed XHTML will get
fixed at the server: the cost of reporting that kind of bug is low
compared with the benefit of fixing it - and XHTML is fairly well
defined, so that's a benchmark to meet.

But for subtler protocol bugs, like the text/xml example and things
like bad protocol syntax and bad HTML (which is not always easy to
define), it's easier to ignore the problem and implement a workaround
at the client than it is to report problems to the people responsible
for the server.

It's really about ensuring that faults cause some kind of visible
problem -- either a real failure (usually that's not appropriate), or
something logged.  The better errors are detected and logged (provided
they are _real_ errors, not the product of faulty detectors), the more
quickly they get fixed.

It's true that server output errors should be detected with testing by
those responsible for the server.  But you cannot realistically devote
as many resources to that kind of testing as your client base can.

It's not perfect, but it would be an avenue for catching more errors
than the resources devoted to server-side self checking can.  (And we
know, in reality, that such checking is rarely done thoroughly, if the
server "seems to work".  Yet, responding to errors in logs is more
likely to be done, imho).

It's a bit like compiler warnings.  Compiling with gcc -Wall and then
fixing the code to eliminate the warnings is often helpful.

Btw, that raises a point: What about using the Warning header or one
like it?

For non-fatal client errors, a server could send the client a
"Warning: your client sent badly or inconsistently labelled XML, I
have guessed its encoding (utf-8), and may have guessed wrong; fix
your client".

Similarly, a client could send a warning with its _next_ request to
the same server.  "Warning: the previous response from this server
(foo.org [1.2.3.4] for GET /index.xml) sent badly or inconsistently
labelled XML; I have guessed its encoding (utf-8), and may have
guessed wrong; fix your server".
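
As a sketch of what such a header might look like on the wire,
assuming the HTTP/1.1 Warning syntax (warn-code, warn-agent, quoted
warn-text) and the "miscellaneous warning" code 199.  The function
names and the "example-client" pseudonym are made up for illustration,
and whether any server would log a Warning sent on a request is an
open question.

```python
def format_warning(agent, text, code=199):
    """Build a Warning header value: warn-code SP warn-agent SP "warn-text"."""
    if not 100 <= code <= 999:
        raise ValueError("warn-code must be three digits")
    # warn-text is a quoted-string: escape backslashes and double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{code} {agent} "{escaped}"'


def previous_response_warning(agent, method, path):
    """Hypothetical client-side helper: complain about the previous response."""
    text = (f"previous response for {method} {path} sent badly or "
            "inconsistently labelled XML; guessed encoding utf-8")
    return format_warning(agent, text)


# Headers a client might attach to its _next_ request to the same server.
headers = {"Warning": previous_response_warning("example-client", "GET", "/index.xml")}
```

The same format_warning() would serve for the server-to-client
direction, with the server's host name as the agent.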

As Alex points out, that's not guaranteed to reach the same resource
as caused the error.  In fact it's less likely -- as the next request
is likely to be for a different resource.  But it will still cause
(potentially, if they're interested) the error to be logged at the
site, and humans will see it and can figure out what it means.

The ERR proposal has several disadvantages compared with a Warning
header: it causes more round trips; it will break some servers, which
aren't expecting ERR (and don't support OPTIONS either, even though
they advertise as HTTP/1.1); and it's asymmetric too.  Servers should
have a channel for sending non-fatal messages of this kind to clients,
just as clients can send these non-fatal messages to servers (they're
non-fatal by definition in the latter case, due to the stateless
nature of HTTP).

The Warning header, or another header chosen for this, does not suffer
from those problems. :)

-- Jamie

Received on Wednesday, 23 June 2004 13:44:13 UTC