Re: http status code for site blocked from Barry Caplan on 2002-12-16 (www-international@w3.org from October to December 2002)

From: Barry Caplan <bcaplan@i18n.com>
Date: Mon, 16 Dec 2002 11:18:51 -0800
To: "L. Turetsky" <lenny@aya.yale.edu> (by way of Martin Duerst <duerst@w3.org>), www-international@w3.org
Message-Id: <5.1.0.14.2.20021216104903.08de2558@mail.i18n.com>
At 11:58 PM 12/16/2002 +0900, L. Turetsky wrote:
>So I heartily agree with Tex's call for an HTTP code to be returned by the blocking firewall/router/whatnot. And that code should be followed by alternate content, such as what Toby described happens as his UAE ISP.

Go ahead and add it into HTTP 1.2 - but don't hold your breath getting it implemented - HTTP 1.1 is ossified already.

And even if it is in such a protocol, it is still a higher level protocol that can easily be ignored by anyone.

>Barry is right that having a proxy return a status code violates the HTTP specification that the response code come from the server in the URL, but that's an academic point. 


It is not an academic point. It is a fundamental precept of how HTTP works. I was not present, but I am sure this matter was discussed and rejected during the creation of HTTP in the first place. If it were possible to do on top of TCP/IP, it would have been done, but it wasn't, so it wasn't.

Since TCP/IP has not changed, it doesn't seem worth going back to revisit this.

>Perhaps the return code should include a URL identifying the blocker (e.g., the IP address of the router/firewall, the name of the blocking software, the censor's name and address, what-have-you...) 


Just because an HTTP request did not go through does not mean it was blocked.

There exist network analysis tools, including but not limited to traceroute, that can be used to narrow down the reason a packet (any packet!) failed to make it to its destination.

There may already be Apache modules that would run these tools on any packet from a request that timed out in order to report on the reason. But if not, it might be better to build one instead of being in denial about what HTTP can and can't do.
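
As a rough illustration of why "did not go through" is not the same as "blocked", here is a minimal Python sketch that sorts a failed TCP connection attempt into a few broad causes. It is not an Apache module and not a substitute for traceroute; the function names `classify_failure` and `probe` and the wording of the causes are assumptions made up for this example.

```python
import socket


def classify_failure(exc):
    """Map an exception from a connection attempt to a rough cause.

    The categories and their wording are illustrative assumptions,
    not anything defined by HTTP or TCP/IP.
    """
    # Check the specific exception types first: all of them are
    # subclasses of OSError, which is the catch-all at the end.
    if isinstance(exc, socket.gaierror):
        return "name lookup failed (DNS problem, or no such host name)"
    if isinstance(exc, socket.timeout):
        return "no answer before the timeout (dropped packets, host down, or congestion)"
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused (host reachable, nothing accepting on that port)"
    if isinstance(exc, ConnectionResetError):
        return "connection reset (the peer or something in between cut it off)"
    if isinstance(exc, OSError):
        return "other network error: %s" % exc
    return "not a network error"


def probe(host, port=80, timeout=5.0):
    """Try one TCP connection and report what happened, in words."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "TCP connection succeeded"
    except OSError as exc:
        return classify_failure(exc)
```

Calling `probe("www.example.com")` returns one of the strings above. Note that a timeout looks the same whether a firewall silently dropped the packets or the server was simply down, which is exactly why a failed request alone does not prove blocking.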

>In practical terms this would permit children to complain to their parents about inappropriately blocked sites, or employees to inform their network admins of such situations. 


Which brings up the reason why people block, including China or public libraries in Mississippi - they *do not want to be held accountable*. They probably don't care if you know that they block - they just don't care - their only goal is to stop the traffic. Since it is not blocked by mistake, the decisions are not open to reconsideration.

As for "parental" blocking such as Net Nanny, that happens at the client side and is something very different. The Net is full of descriptions about how much effort the block list providers put towards protecting the list from decryption or other analysis. They don't care either if a site is blocked - they probably already allow individual users to create a white list if they know of a site they can't reach that they want to. 

>Perhaps the corporate firewall should block access to www.playboy.com, but it would be good to allow employees to protest the blocking of a particular page which has an interview with a competitor's CEO or whatnot.

What stops them from protesting now? They can surely send an email to someone. The truth is they don't understand the way the network transport protocols work. If they did, then they wouldn't need a new return code, and their email would be very useful in determining what the real reason for not being able to reach a site is. As I said, it is not always blocking.

If you want everyone to be able to do that, then write the Apache modules I described above, and make sure every request either terminates at such a server or originates at a proxy server so equipped. Then the error page could be "Cut and paste this and send it to your IT guy...."
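
A minimal sketch of what the text of such an error page might contain, in Python. The function name `build_report`, the field labels, and the layout are all hypothetical; a real module would fill in the cause from whatever diagnostics it actually ran.

```python
from datetime import datetime, timezone


def build_report(url, cause, when=None):
    """Build a plain-text block a user can paste into an email.

    `cause` is whatever diagnosis the server or proxy came up with;
    here it is just passed in as a string (an assumption of this sketch).
    """
    if when is None:
        when = datetime.now(timezone.utc)
    lines = [
        "Cut and paste this and send it to your IT guy:",
        "",
        "URL requested : %s" % url,
        "Time (UTC)    : %s" % when.strftime("%Y-%m-%d %H:%M:%S"),
        "Observed cause: %s" % cause,
    ]
    return "\n".join(lines)
```

The point of keeping it plain text is that it survives being pasted into any mail client, and the person receiving it gets the URL, the time, and the observed failure without having to ask.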


>The internet was designed with the idea of free flow of information from end-to-end as Lessig says, but that's when it was designed by DARPA for the use of the US only. The fact that intermediate point can block a connection is proof that the end-to-end philosophy doesn't really hold.

The philosophy has held just fine as the Net has expanded from 3 nodes in California to untold trillions around the world. There are ample other types of networks that have different philosophies of design and operation. 

That an intermediate point can block is not proof of a failure of philosophy. That an intermediate point *does* block is a failure to participate according to both the spirit and the rules. If this becomes widespread, then the resulting network will no longer be "The Internet" but instead just another network that is of less general utility than the Internet is.


>Barry also wrote:


>True enough. Unfortunately, the People's Republic of China considers itself to be the operator of an "autonomous network", as do AOL,


AOL is upfront about running non-standard, proprietary software. It is still an overgrown BBS system with an internet gateway. For instance, there is not even SMTP mail available to AOL users. No one at AOL would deny that.

If the Chinese govt or anyone else running a blocking firewall is willing to say loudly and clearly to all, both in front of and behind the firewall, that what is behind the firewall is not part of the internet proper, but is merely a different type of network gateway'ed to the internet, well, I would love to see that.

Technically, this is what Tex is asking for, but the politics are still such that it is not in anyone's interest to do so - even AOL obfuscates to its newbie users who in large part think AOL *is* the internet, even though they rarely use even the limited browsing capabilities available to them.

Can someone explain why China would use a protocol that explains the existence of, let alone the reason for, blocking?

Barry Caplan
Received on Monday, 16 December 2002 14:17:08 UTC