
Re: http status code for site blocked

From: Barry Caplan <bcaplan@i18n.com>
Date: Mon, 09 Dec 2002 17:40:24 -0800
Message-Id: <>
To: Tex Texin <tex@XenCraft.com>
Cc: www-international@w3.org

At 07:21 PM 12/9/2002 -0500, Tex Texin wrote:


>1) The cause may be at a level below the web, but nonetheless is a
>problem for the web. It is a problem for other layers and technologies
>as well, but I am addressing a concern for the web here.

My point was that I believe this is a limitation of HTTP and the other web protocols riding on top of TCP/IP. By the principle of information hiding, higher-level protocols simply do not have the information they would need to do what you ask.

However, wherever there is HTTP, there is TCP/IP, but not vice versa. The solution lies below HTTP, and web servers simply do not have that level of information available to them.

>2) The responsibility for emitting a blocked code clearly does not fall
>to the server,

That's not so clear to me. I haven't read the HTTP protocol RFCs, but I would be surprised if they did not say exactly that all response codes must come from the server handling the request. After all, if the responsibility didn't fall there, where would it fall?
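To make the point concrete, here is a minimal sketch (not from the original thread, and the handler and port choices are invented for illustration) showing that in HTTP the status code is simply written by whatever endpoint answers the connection - the server handling the request chooses it:

```python
# Minimal sketch: the endpoint that answers the request chooses the
# status code. Here a local server deliberately answers 403.
import threading
import urllib.request
import urllib.error
from http.server import BaseHTTPRequestHandler, HTTPServer

class PolicyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server handling the request picks the response code.
        self.send_response(403, "Forbidden")
        self.end_headers()
        self.wfile.write(b"blocked by policy\n")

    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), PolicyHandler)  # ephemeral port
threading.Thread(target=server.serve_forever, daemon=True).start()

def fetch_status(url):
    try:
        return urllib.request.urlopen(url).status
    except urllib.error.HTTPError as e:
        return e.code

status = fetch_status("http://127.0.0.1:%d/" % server.server_address[1])
server.shutdown()
```

The client sees whatever code the answering endpoint emitted - which is exactly why an intercepting node that answers in the server's place can emit any code it likes.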

> but to the node that is opting not to forward the request
>to the server and instead reject it. Somewhere along the way a node
>decided to return "not found" instead of forwarding the request.

That is because the node is acting as a de facto (and presumably transparent) HTTP proxy server - it sees HTTP traffic on port 80 and acts on it instead of passing it on as it should.

BTW, if you allow a middleman to handle a request unbeknownst to you, then what is to prevent the middleman from returning any information it wants, not just the response code you suggest?

If any router could do that, and did, then there would be no way, short of encrypting every single transaction, to know that what you got back came from where you think it did. Which might not be such a bad idea anyway...
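A toy sketch of this point (hostnames and function names are invented for illustration): an intercepting node can fabricate a syntactically perfect response, and absent cryptographic authentication the client cannot distinguish it from a genuine reply from the origin:

```python
# Toy model of a middlebox on the path: for blocked hosts it fabricates
# a well-formed 404 instead of forwarding. The fabricated bytes are
# indistinguishable, to the client, from a real origin response.
BLOCKLIST = {"blocked.example"}

def origin_response(host: str) -> bytes:
    # Stand-in for the real server's reply.
    return b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi"

def middlebox(host: str) -> bytes:
    if host in BLOCKLIST:
        body = b"not found"
        return (b"HTTP/1.1 404 Not Found\r\n"
                b"Content-Length: %d\r\n\r\n" % len(body)) + body
    return origin_response(host)

def status_of(raw: bytes) -> int:
    # Parse the status code out of the status line.
    return int(raw.split(b"\r\n", 1)[0].split(b" ")[1])
```

Nothing in the wire format tells the client which node actually produced the 404 - that assurance has to come from a layer the middlebox cannot forge.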

>I presume that however that node was made to reject that particular
>request, it could be made to return a different status.

True, but why would it? Unless it is in the interests of the blocker, why would it advertise itself? In general, I don't see how advertising is in the interest of the blocker, although it would be in the interest of the blockee. The UAE example was interesting, but only in how it chose to respond by twisting the knife even further.

>3) Thanks for the reading material. I am pursuing it. I am aware of the
>analogies to other media and technologies. There are also important
>differences. There is a web impact and it is pertinent to discuss here.
>I hope some others will pipe up.

I hope so too!

For the record, for all the readers: all the examples I gave deal with the Internet and associated digital technologies and policies.

>4) You commented on my search engine example. It was intended only as an
>example of how blocking should perhaps be taken into account into web
>design. It is not useful to have a search engine return results in which
>many of the links and even cached pages are not accessible. 

I disagree. 

That might be available as a user preference ("don't show me currently blocked sites/don't show me currently unblocked sites/show all sites"), but the search engine need not presume what is best for the searcher. In fact, there are enough other intellectual property issues for a search engine to deal with that they probably don't want to play censor at all if they can help it.

>optimizing the results to show accessible contents and primarily
>contents which in turns references other accessible contents is

As an option on your search, if that is what you want to do, I can live with that - but the default should always be "show all". After all, the search engine does not check to see if a web server is up, or if the domain even still exists, before displaying results, and no one has a problem with that AFAIK.

>5) Tunneling- fine if there are workarounds although I expect censors
>will work to close the loopholes. It doesn't sound like a scheme which
>the web or web services can come to depend on as a scalable solution.

Actually, my understanding is that if there is a single port in/out of a country, for any service, it can be used to tunnel data, encrypted and unseen, transparently to the outside and back. So unless there is no outside access at all, tunnelling works. I am not sure of the exact details, but I think this is the general idea.
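The general idea can be sketched as encapsulation: arbitrary protocol bytes are wrapped in an envelope that looks like ordinary traffic on the one permitted channel, carried across, and unwrapped on the far side. Real tunnels (SSH, TLS-based VPNs) also encrypt the payload; the base64 wrapping below is only a stand-in for that opaque encoding, not security, and the envelope format is invented for illustration:

```python
# Toy tunnelling sketch: a blocked request is encapsulated in an
# innocuous-looking envelope on the permitted channel, then recovered
# on the far side. base64 stands in for real encryption (SSH/TLS).
import base64

def wrap(payload: bytes) -> bytes:
    # Envelope resembling an ordinary HTTP POST body.
    return b"POST /sync HTTP/1.1\r\n\r\n" + base64.b64encode(payload)

def unwrap(envelope: bytes) -> bytes:
    # Strip the envelope and decode the carried payload.
    return base64.b64decode(envelope.split(b"\r\n\r\n", 1)[1])

blocked_request = b"GET http://blocked.example/ HTTP/1.1\r\n\r\n"
recovered = unwrap(wrap(blocked_request))
```

A middlebox inspecting the permitted channel sees only the envelope; the carried request is opaque to it, which is why closing such loopholes requires blocking the channel entirely.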

>6) Yes your analysis was inconclusive as far as it went. Others
>confirmed the site is blocked.

What additional information was there that ruled out the other two options? (even if someone somewhere in China couldn't reach the site, that doesn't mean it was blocked). Off-list is fine for this one...

> In any event, my site is irrelevant. I am
>not raising the issue for that reason. The only way that it is relevant,
>is that it woke me up to the fact this is a problem.

I understand that. Sometimes politics has to get personal first :)

>7) Guidelines- I am not sure why you remarked about self-censorship.
>Guidelines are not enforced restrictions- they are just suggestions you
>can follow or not. I already frequently modify content in such a way to
>improve their rankings among search engines even though it is not the
>way I would write the page otherwise. 

These are not modifications made to mollify governments though.

>If I can make harmless changes
>that remove objectionable pieces and allows the content to have greater
>access and exposure it is a good thing. 

Objectionable and harmful to whom is the issue. Yahoo is currently fighting a battle over French courts disallowing Nazi memorabilia for sale on its French site. The principle at stake is that if laws, or even "strong suggestions", are enforceable across borders, then it will be the most restrictive that carries the day. This simply must not be allowed to happen.

Similarly, the US Congress has been stopped multiple times by the US Supreme Court from implementing laws that would effectively ban content on the net (at least in the US) that is not appropriate for children. The idea is that if the old pre-net principle of "neighborhood standards" were applied to the Net, then the most restrictive neighborhood would set the limits. Arguably, the whole world is the neighborhood on the net, and there is no consensus on what is appropriate. Hence it should be up to the recipient of the data to figure out how to deal with it.

>8) Unblocking- You presumed graft-laden corrupt officials... Right now
>we don't even have someone to bribe.
>Seriously Barry, let's first suggest there be a vehicle for discussing
>blockages and then we can try to make it fair as best we can.

I suppose if you knew the right people you could get things unblocked, in the same way that Mexican traffic tickets are taken care of on the spot in cash.

I am being serious. The issues you raise are much bigger than the Web - they affect every protocol above IP.

This is a good example of the limits of sovereignty. This is not an issue for sovereigns to decide. As Lessig describes in "The Future of Ideas", the Internet's architecture is end-to-end transport, with the smarts happening at the ends (usually server and client) and the middle serving merely as the way to get information there and back.

Now, that is not to say you couldn't build a network on other principles, just that such a network would not be the INTERNET. This is probably so technically obvious to the readers on this list that its profundity is easy to miss.

The examples I listed - artificial bandwidth limitations, blocking by government agents at borders, wireless limitations, etc. - are all examples of putting smarts in the middle instead of at the ends. If individuals at the ends choose to impose those limits on themselves, hey, go ahead. This is what a lot of spam blocking, "parental blocking", and antivirus software does. But the effects should not extend to the middle and affect other "ends" on the network unless specifically requested by all of the ends in question. To do otherwise is to redefine what the INTERNET is, and I don't recall anyone having that discussion and saying "OK by me if you change all the principles of the early RFCs that the whole net is built upon."

>OK, you don't see it as a web issue and don't like the imagined

It is an IP/routing issue that affects all higher protocols, not just the web.

These politics are not imagined. 

The battles are clearly joined on the policy and technical sides world-wide. You will find there are a great many fine minds who have been working on many aspects of this issue for a long time - I think the first time it reached the public consciousness was in the days after Tiananmen, among Chinese needing to send email in and out of China.

There are criminal trials right now in San Jose of a Russian, and in Norway of a 19-year-old local who was 15 at the time of the acts he is charged with. These are people who simply used what they already had, and who have run afoul of those in whose interest it is to change the underlying end-to-end structure of the net (digital rights management in both cases). These cases are very real and very political.

>I think that the web could perhaps be improved if designed to
>proactively take blocking into account and at least had some
>recommendations as to how it should/could work. I am not arguing for or
>against the politics, or trying to change them, just accepting that it

To do that would require either routing around a block, tunneling through it, or trusting that the block will tell you the truth. In the first case, there may be no other way in. In the second, it works today with proper software configuration. In the third, I don't think you can trust it - and even if you could, you are still left with either of the other two choices, or giving up.

Barry Caplan
Received on Monday, 9 December 2002 20:38:52 UTC
