Re: HTTP 503 Error

From: Ted Guild <ted@w3.org>
Date: Tue, 16 Jun 2009 13:46:00 -0400
To: Tymon Wiedemair <tymon.wiedemair@gmail.com>
Cc: site-comments@w3.org
Message-ID: <nnbvdmwhy1z.fsf@dev-null.guilds.net>

Tymon Wiedemair <tymon.wiedemair@gmail.com> writes:

> I have a Java XML parser in place to parse xml/xhtml documents. The
> xml documents reference a DTD hosted by you.
> For the past few days I have always gotten back an HTTP 503 error
> code from your site. The document loads, but there is still this 503
> error, which causes trouble in the parser.

We are sending HTTP 503, and the body of the response includes a link
to an article giving more background on this issue:

http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic

In the 16 months since that article was written, this traffic has only
increased, and recently we have seen surges that neither our automated
defenses nor manual intervention can keep up with.  Adding server
capacity just results in the added capacity being consumed as well.
This leaves our site overwhelmed and unresponsive for our working
groups and the rest of the web community.

We temporarily firewall some IP addresses because of their request
volume.  These blocks are applied automatically and are cleared after
a few days.

About a quarter of our DTD traffic (in the hundreds of millions of
requests per day) is from Java, so when we were trying to keep our site
available yesterday, responding with 503 to this traffic was
low-hanging fruit.  We will monitor this traffic and see when we can be
less dramatic in our defenses.  We have relaxed our blocking of
datatypes.dtd, but note that, depending on volume, access may still be
blocked, so use a cache or catalog.

We have also identified another widely distributed application
responsible for a substantial portion of this traffic.  The vendor has
acknowledged the issue and is working on a resolution, which we hope
will be released soon.

Many libraries have catalog or caching options; lacking those, you can
put a caching proxy in front of the application making repeated DTD
requests.  You should also see a pronounced performance improvement
from using a catalog or cache instead of repeatedly going over the
internet for these DTD resources.

For Java, Glassfish is an option:

http://norman.walsh.name/2007/09/07/treadLightly

and apparently the Apache libraries include a catalog solution as
well, as mentioned in this article:

http://nwalsh.com/docs/articles/xml2003/
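
As a rough illustration of the catalog idea for a Java parser, the
sketch below uses the standard SAX EntityResolver hook to hand the
parser a locally stored DTD instead of letting it fetch one over the
network.  The class name LocalDtdResolver, its register and parse
helpers, and the example.invalid URL are hypothetical and do not come
from the articles above; a real setup would more likely load the DTD
text from files on disk or use a full catalog resolver.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: answers DTD lookups from an in-memory table.
public class LocalDtdResolver implements EntityResolver {
    // Maps a DTD system identifier (its URL) to a local copy of its text.
    private final Map<String, String> localCopies = new HashMap<>();

    // Record a local copy for the given system identifier.
    public void register(String systemId, String dtdText) {
        localCopies.put(systemId, dtdText);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = localCopies.get(systemId);
        if (dtd == null) {
            // Unknown identifier: returning null lets the parser fall
            // back to its default behaviour (normally a network fetch).
            return null;
        }
        InputSource source = new InputSource(new StringReader(dtd));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    // Parse an XML string with the resolver installed on the parser.
    public static Document parse(String xml, LocalDtdResolver resolver)
            throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

To use it, register each DTD URL your documents reference together
with a copy you downloaded once, then parse through the helper.
Identifiers that were not registered still go to the network, so a
stricter variant might throw an exception there instead of returning
null.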

Without touching the code you can set up a caching proxy (e.g. Squid)
and request the DTD resources through it with a user-agent other than
one that vaguely identifies itself as Java.  Your application[s] will
then reference the local resource from the proxy.
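
For a Java application, one way to route HTTP requests through such a
proxy is the standard http.proxyHost and http.proxyPort system
properties.  The sketch below sets them programmatically; the class
name ProxySettings and the host and port values are assumptions for
illustration (3128 is Squid's default port, yours may differ).  The
same properties can be passed with -D on the java command line, which
avoids touching the code at all.

```java
// Hypothetical helper: points the JVM's default HTTP handler at a proxy.
public class ProxySettings {
    // Call this before the application makes its first HTTP request.
    public static void apply(String proxyHost, int proxyPort) {
        System.setProperty("http.proxyHost", proxyHost);
        System.setProperty("http.proxyPort", Integer.toString(proxyPort));
    }

    public static void main(String[] args) {
        // Assumed values: a Squid instance on the same machine.
        apply("localhost", 3128);
        System.out.println("HTTP proxy: "
                + System.getProperty("http.proxyHost") + ":"
                + System.getProperty("http.proxyPort"));
    }
}
```

The command-line equivalent under the same assumptions would be
-Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128.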

-- 
Ted Guild <ted@w3.org>
W3C Systems Team
http://www.w3.org
Received on Tuesday, 16 June 2009 17:46:07 GMT