
Re: http status code for site blocked

From: Tex Texin <tex@i18nguy.com>
Date: Mon, 09 Dec 2002 19:21:04 -0500
Message-ID: <3DF53370.D9E9251B@i18nguy.com>
To: Barry Caplan <bcaplan@i18n.com>
CC: www-international@w3.org


I disagree on almost all counts and would like to hear from others.

1) The cause may be at a level below the web, but nonetheless is a
problem for the web. It is a problem for other layers and technologies
as well, but I am addressing a concern for the web here.

2) The responsibility for emitting a blocked code clearly does not fall
to the server, but to the node that is opting not to forward the request
to the server and instead reject it. Somewhere along the way a node
decided to return "not found" instead of forwarding the request.
I presume that however that node was made to reject that particular
request, it could be made to return a different status.
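
As a rough illustration of point 2, here is a minimal sketch of the decision
an intermediate node would make. It is not how any real proxy or firewall is
configured: the function name, the blocklist, and the reason phrases are all
hypothetical, and the numeric code 451 is only a stand-in, since no "blocked"
status existed at the time of this thread (451 was defined much later, in
RFC 7725).

```python
# Hypothetical sketch: every name here is an illustrative assumption.
BLOCKED_STATUS = 451  # stand-in value; not defined in 2002 (see RFC 7725)


def intermediary_response(host, blocked_hosts, report_blocking=True):
    """Return (status, reason) if the node rejects the request,
    or None if the request should be forwarded to the origin server."""
    if host.lower() not in blocked_hosts:
        return None  # not blocked: forward the request unchanged
    if report_blocking:
        # The node admits that it, not the server, rejected the request.
        return (BLOCKED_STATUS, "Blocked by intermediary")
    # The behavior complained about above: looks like a missing page.
    return (404, "Not Found")
```

The only difference between the two rejection branches is the status that is
returned, which is the point: the node already knows it is rejecting the
request.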

3) Thanks for the reading material. I am pursuing it. I am aware of the
analogies to other media and technologies. There are also important
differences. There is a web impact and it is pertinent to discuss here.
I hope some others will pipe up.

4) You commented on my search engine example. It was intended only as an
example of how blocking should perhaps be taken into account in web
design. It is not useful to have a search engine return results in which
many of the links and even cached pages are not accessible. So
optimizing the results to show accessible contents and primarily
contents which in turn reference other accessible contents is desirable.
Your comment is that it is difficult for search engines to do this with
up-to-date information. I am not sure if that is a realistic requirement,
since engines don't return up-to-date pages today anyway. But let's not
dwell too much on the example's implementation and keep to the more
strategic issues.

5) Tunneling- fine if there are workarounds although I expect censors
will work to close the loopholes. It doesn't sound like a scheme which
the web or web services can come to depend on as a scalable solution.

6) Yes your analysis was inconclusive as far as it went. Others
confirmed the site is blocked. In any event, my site is irrelevant. I am
not raising the issue for that reason. It is relevant only in that it
woke me up to the fact that this is a problem.

7) Guidelines- I am not sure why you remarked about self-censorship.
Guidelines are not enforced restrictions- they are just suggestions you
can follow or not. I already frequently modify content in such a way as
to improve its ranking among search engines even though it is not the
way I would write the page otherwise. If I can make harmless changes
that remove objectionable pieces and allow the content to have greater
access and exposure it is a good thing. If I think the change is not
harmless then I can choose not to follow the guide. I prefer to be
informed.  (Well ok. I sometimes have twinges of guilt when I
intentionally put a typo in a web page so the search engines will also
find the page when a common misspelling is sought. ;-) )

8) Unblocking- You presumed graft-laden corrupt officials... Right now
we don't even have someone to bribe.
Seriously Barry, let's first suggest there be a vehicle for discussing
blockages and then we can try to make it fair as best we can.

OK, you don't see it as a web issue and don't like the imagined
politics. I think that the web could perhaps be improved if designed to
proactively take blocking into account and at least had some
recommendations as to how it should/could work. I am not arguing for or
against the politics, or trying to change them, just accepting that it


Barry Caplan wrote:
> Tex,
> This is not a "web" problem per se. It happens at the IP level or even below that in manipulation of routing protocols. In a technical sense, there is nothing "broken", just standards being abused.
> Because it happens at a lower level, there is nothing the HTTP protocol can do to sense this, and hence nothing a server can do to respond with an "I am blocked" message. As others pointed out, there is little if any incentive for the blocker to tell you they are blocking or why. I think by the "web is designed to be a single app" you mean HTTP but of course the web is in practice more than that these days.
> There is an enormous amount of discussion of this in technical/policy related lists. Others have pointed you towards the Berkman Center in your neighborhood, also Declan McCullagh's Politech List, virtually anywhere else Open Source sw and its policy implications are discussed, EFF.org, probably Lawrence Lessig's blog, RISKS Digest, and so on.
> I highly recommend Lessig's books, Code, and the Future of Ideas for anyone interested in these matters.
> I am not sure w3c should get into it because:
> - their policies are already open
> - they are concerned primarily with the web and this is not a web-specific issue
> - I think they take a pretty neutral position wrt actual implementations (I could be wrong) and that seems like a good idea.
> BTW, if you track some of the info above, you will discover this issue is closely related to others much closer to home. Not the least of which is the emerging monopolies in broadband access and the lack of regulation. If you have cable or dsl, check to see if your ISP has prohibitive or open policies wrt use of their pipes. Like if you are artificially limited in bandwidth, (YES !) contractual limitations prohibiting server applications, contractual obligations regarding reselling or even sharing of your service, by wireless or otherwise.
> While you have them on the horn, ask them if they would even remotely consider giving priority at the IP level to a class of applications, or destinations, or customers. The implications of that are blocking, artificial delays, and paying for access to a full implementation of current standards.
> All of this is implemented in exactly the same way, at the same level as the country firewall blocking. Very locally, not just in exotic locales.
> At 04:40 PM 12/9/2002 -0500, you wrote:
> >For example, I can imagine that several web behaviors should be changed
> >to maximize accessible links depending on the location of the viewer.
> >Search engines could rank pages taking into account pages containing the
> >fewest blocked links depending on the user's vantage point.
> This presumes that search engines or anyone else has up-to-date information, which is a big stretch.
> >Clearly, the single web is preferable in design and practice. But for
> >some parts of the world if this cannot be achieved we should be
> >considering alternatives that behave better for them.
> Variations on tunnelling in and out are what people do in countries where they know blocking happens or for other anonymity reasons. Your Chinese colleagues were configured for this probably, but it was risky for them to tell you how directly. Now that you are back, if you think it might be a problem again, you can research and configure it now. Google can probably point you towards the sw you need, it is probably going to be found on sourceforce.com somewhere.
> >2) Publishers and (web) service providers don't know if they are
> >reaching their intended audiences. I can invest millions in a Chinese
> >web site and not know if it will be viewable in China. Further, people
> >in China may not be able to tell the investor of the problem. If your
> >domain is blocked, e-mail cannot be sent to you.
> That is why there are sites that will tell you if a site is blocked (I think Berkman has one), and technical techniques for the savvy to get around blocks. This is a very active area both in software R&D and policy.
> BTW, web site blocking at the domain level is a different issue from blocking email at the domain level. One requires filtering http, the other various email protocols. So just because one is happening does not mean the other is.
> >3) You may be accessible one day and blocked the next. You will not be
> >informed of the change in status.
> You may be blocked at a page level. And it could be minute to minute.
> >4) There is no process for reversing a blockage or even discovering why
> >you may have been blocked.
> One can usually guess. We did partial network analysis on your issue, and results were inconclusive. We couldn't tell if 1) you were blocked, or if 2) there was a router configuration issue, or if 3) the router you were passing through to exit China was blocked at the next hop because it was a well-known exit point for spam.
> Given the apparent innocuous nature of your site, I would guess it was 3-2-1 in order of likelihood the reason you couldn't reach your site, with 1 being very small.
> >5) Just to be clear, sites being blocked are not necessarily political
> >or controversial in any way. Some authorities block until they have a
> >reason to unblock.
> Evidence?
> >I would like:
> >
> >a) a group in the W3C to address this (to).
> I would prefer w3c not become politicized like this - there are already groups in a better position to influence policy.
> >b) an http code for blocked. (I will take it up with IETF).
> May have already been proposed and rejected. It doesn't make sense to me because the message would originate at an IP other than the web server. I doubt HTTP allows for that. At a lower level it is arguably an abuse of IP itself. Anyway, firewalls of all sorts routinely block IP packets by sending them to the bit bucket and make it look like they went through by not responding at all.
> Perhaps there is a way to tell if the IP packets originated at the requested server, but even if so, that is surely spoofable.
> I recommend Lessig's discussion of end-to-end applications in "The Future of Ideas" for a non-technical overview of this problem.
> >c) W3C recommendation(s) for working with blocked URIs- e.g. ways to
> >distinguish being blocked from other kinds of failure and
> >appropriate behaviors such as an image to be displayed if images are
> >blocked, etc. .
> I am no expert on routing protocols, but I don't see how you could distinguish a routing level block from a routinely misconfigured router.
> And to throw another monkey wrench in the works, there are companies like Gator who aim to prevent what you send from reaching its destination on the basis that it isn't really what the requester wanted and they (Gator) can decide to replace the content with something else, presumably on your behalf.
> >d) Guidelines for not offending censors. I suspect there are non-obvious
> >offenses to warn about.
> This would be the worst overall policy I can imagine. This would amount to self-censorship imposed by technical standards committees.
> >e) Guidelines for getting unblocked. Perhaps soliciting/promoting the
> >creation of organizations within censoring bodies for the purpose of
> >accepting appeals to be unblocked.
> Sorry to be blunt, but this is bad policy too. Kissing ass of corrupt and even friendly governments to let data through is nothing short of supplication. Can you imagine the graft involved, with no guarantees of results since the policies are arbitrary in the first place.
> Much better to continue to build and propagate easy to use and free (as in speech and probably as in beer) software that works around the problem.
> >I did want to promote the idea that blocking is not simply a fait
> >accompli that should just be accepted as-is.
> You should be outraged enough to look into some of the sources I listed above - and I am serious about the list of questions for your own broadband provider - you may be quite surprised at the answers you discover.
> >If we don't examine it more closely then there isn't one world wide web,
> >there are multiple regional webs.
> >
> >Is this worth pursuing? If so, where is the best place to take this up?
> Lessig. Lessig. Lessig. Familiarize yourself with his books first and the rest will become clear :)
> The more I read and learn about this, the more I am reminded of the 1960's and the ways individuals became aware enough was enough. The framework for everything you ask about already exists - it is up to each of us to learn about it and join in and continue to enable it.
> Barry Caplan
> www.i18n.com
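
Barry's points above about silently dropped packets, and about a routing-level
block looking just like a misconfigured router, can be illustrated from the
client side. The following is only a sketch under stated assumptions: the
function name and the category labels are hypothetical, and the exception
types are the ones Python's standard library raises for these conditions. It
shows what a client can actually observe, and that none of these observations
by itself proves deliberate blocking.

```python
import socket


def classify_failure(exc):
    """Map a connection error to what the client can actually observe.

    The labels are hypothetical. Every category is also consistent with
    an ordinary outage or misconfiguration, not only with blocking.
    """
    if isinstance(exc, socket.gaierror):
        return "name lookup failed"
    if isinstance(exc, socket.timeout):
        return "no response"  # packets dropped silently, or host down
    if isinstance(exc, ConnectionResetError):
        return "connection reset"
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"
    return "other failure"
```

A client could call this from the `except` clause around a connection attempt
and report the result to the user, but it would still need an explicit signal
from the blocking node, such as a dedicated status, to say "blocked" with any
confidence.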

Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
Received on Monday, 9 December 2002 19:21:12 UTC
