Re: Computer Misuse Act breaks WebArch (was Re: Section 5.4.2 of RFC 3986 not actually 'legal' syntax_)

I strongly feel that, except in special cases, doing GETs on any resource 
that interests you should be allowed and even encouraged.  I thought the 
Web architecture was that URIs were to a significant degree opaque, and 
that the way to find out about a resource was to ask it using GET.  In 
some sense, the way you find out whether you're supposed to do the GET is 
by trying.  So, I think doing GETs on any resource should be OK, both 
technically and in terms of social convention, except if your intent is 
malicious.

Dan Connolly writes:

> He could have known about common bugs 
> in servers, and he could have been 
> trying to exploit that bug, or at 
> least test for its presence.

> There was another case of some students that 
> applied to get into a business school 
> (MIT sloan, I think) and they found a
> way to get the web server to get the 
> results of their application before 
> they were supposed to.

Let's consider some non-electronic analogies.  If I go around trying all 
the doors in my neighborhood, social convention dictates that my behavior 
is at best suspicious and probably illegal.  People don't expect you to 
try their doors.  The presumption is that it's in most cases bad behavior.

By contrast, if I walk around the neighborhood reading publicly visible 
signs, or even seeking out such signs and postings to read, my behavior is 
presumed to be benign and legal, even if some of the signs were posted 
inadvertently.  Let's say I stop by a store and read the sign with their 
hours of business.  That's presumed to be an innocent and legal thing to 
do >unless< you can show that I had malicious intent, e.g. I was trying to 
plan a burglary.

The rough analogs on the Web seem to me to be:

* Reading signs is analogous to doing HTTP GETs:  both are encouraged, 
appropriate and presumed benign unless there's bad intent.

* Doing an HTTP GET on a resource you know you aren't supposed to see, 
intentionally exploiting bugs, or retrieving a Web resource to aid in 
planning a crime is analogous to reading the business hours to plan a 
break-in.  It's inappropriate because of the bad intent, not because of 
the Web mechanism used. 

* Running a web crawler that manages its peak load on any particular 
resource, site or network is generally OK.  You're just doing a lot of 
walking in the neighborhood, reading lots and lots of signs, but mostly 
not bothering anybody.

* Consciously orchestrating a denial of service attack on a resource is 
roughly analogous to trying to smash someone's door or tearing down their 
signs.  You know you're doing it to break things. 
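
To make the crawler case concrete, here is a minimal sketch of what 
"managing peak load" on a site could look like: a per-host throttle that 
enforces a minimum gap between successive requests to the same host.  The 
class name PerHostThrottle, the min_interval parameter and the example 
URLs are all hypothetical, chosen for illustration; this is one simple 
approach under those assumptions, not a description of how any particular 
crawler works.

```python
import time
from urllib.parse import urlsplit


class PerHostThrottle:
    """Hypothetical helper: space out requests to any single host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        # min_interval is an assumed politeness gap in seconds, not a standard value.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def acquire(self, url):
        """Block until a request to url's host is allowed; return the seconds waited."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        delay = max(0.0, self._next_allowed.get(host, now) - now)
        if delay > 0:
            self._sleep(delay)
        # Record when the next request to this host may begin.
        self._next_allowed[host] = self._clock() + self.min_interval
        return delay
```

A crawler would call acquire(url) immediately before each GET.  Requests 
to different hosts are not delayed by one another, so the crawler can still 
read "lots and lots of signs" without leaning on any one of them.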

I think this is a very important point, architecturally and socially.  It's 
worth at least some serious discussion.  Whether the TAG and/or the W3C 
should actually do anything formally, and if so what, is a more difficult 
question.  I'm curious to hear what others think before forming an opinion 
on that.

BTW: I believe the business school case involved a number of schools, and 
I don't care to offer an opinion on the hotly debated question of whether 
the students made excusable use of information that was inadvertently 
posted semi-publicly, or showed malicious intent in going beyond an 
implied private property marker.  Certainly the site hosting the grades 
was derelict in not enforcing access controls, which is roughly analogous 
to a business leaving its door wide open after hours.  Browsing the grades 
may or may not have been comparable to walking through the open door and 
looking at the business's books; maybe it was more like taking a peek to 
satisfy understandable curiosity.

Noah
 
--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Tuesday, 18 October 2005 02:31:40 UTC