- From: Dan Connolly <connolly@w3.org>
- Date: 01 May 2002 16:43:01 -0500
- To: www-tag@w3.org
OK, I've taken a stab at integrating feedback
received since 15 Feb:
DRAFT Findings on when to use GET to make resources addressable
DRAFT by Dan Connolly, for the TAG
$Revision: 1.11 $ of $Date: 2002/05/01 21:30:59 $
by $Author: connolly $
http://www.w3.org/2001/tag/doc/get7
plaintext copy follows, for convenience...
[1]W3C [2]TAG [3]findings
[1] http://www.w3.org/
[2] http://www.w3.org/2001/tag/
[3] http://www.w3.org/2001/tag/findings
DRAFT Findings on when to use GET to make resources addressable
ref. issue [4]whenToUseGet-7
[4] http://www.w3.org/2001/tag/ilist#whenToUseGet-7
DRAFT by Dan Connolly, for the TAG
$Revision: 1.11 $ of $Date: 2002/05/01 21:30:59 $ by $Author: connolly $
Two principles are central to the design of Web sites and
applications:
* All important resources should be identifiable by URI.
* Following references in the Web is safe; i.e., agents do not incur
obligations by following links.
It's possible to share information using Web technologies without
giving the information a URI, but it's not optimal. For example, a
product catalog can be built using an HTML form where the client
provides a product number to the server in an HTTP POST request, and
information about the product comes back in the response. But that
design does not allow the client to make a link to the information
about the product, bookmark it, or use it with any of the many Web
technologies (e.g., XSLT's document() function, RDF assertions, XLink,
...) that depend on info being URI addressable.
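As a rough illustration of the difference, the sketch below builds the request for a catalog lookup both ways. The host catalog.example.org, the /product path, and the "number" field are hypothetical names chosen for this example; they do not come from any real service.

```python
from urllib.parse import urlencode

# Hypothetical catalog endpoint and form field, for illustration only.
BASE = "http://catalog.example.org/product"


def get_request(product_number):
    """GET: the form input becomes part of the URI, so each product is addressable."""
    return ("GET", BASE + "?" + urlencode({"number": product_number}), None)


def post_request(product_number):
    """POST: the input travels in the request body; every product shares one URI."""
    return ("POST", BASE, urlencode({"number": product_number}))
```

With the GET variant, two different product numbers yield two different URIs that can be linked or bookmarked; with the POST variant, both lookups are sent to the same URI and differ only in the body.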
HTML forms that use the GET method provide a URI for each combination
of inputs. The relevant section of the HTML specification is:
The "get" method should be used when the form is idempotent (i.e.,
causes no side-effects). Many database searches have no visible
side-effects and make ideal applications for the "get" method.
[5]17.13.1 Form submission method of HTML 4.01 (text has been in
HTML spec back to [6]HTML 2.0)
[5]
http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.1
[6] http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.2
Unfortunately, the term [7]idempotent is misused there, and the term
[8]side-effects is stretched from its use in the design of programming
languages. The HTTP 1.1 specification is more precise on the matter:
[7] http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?idempotent
[8] http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?query=side+effect
Implementors should be aware that the software represents the user
in their interactions over the Internet, and should be careful to
allow the user to be aware of any actions they might take which may
have an unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST,
PUT and DELETE, in a special way, so that the user is made aware of
the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects,
so therefore cannot be held accountable for them.
[9]9.1.1 Safe Methods, HTTP 1.1
[9] http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1
To elaborate on the principle of following links being safe, consider
the following two designs for mailing list subscription confirmation:
In the first case:
1. The user sends a subscribe message to an administrative mailbox
(mylist-request@example.org).
2. The list processing software requests confirmation by email,
including a link to a confirmation page.
3. The user visits the confirmation page, and finds a "[Confirm] your
subscription" form, with method="POST".
4. The user activates the [Confirm] form control.
5. The list processing software confirms the subscription.
In the second case:
1. as above
2. as above
3. The user visits the confirmation page and sees "your subscription
is confirmed". The list processing software confirms the
subscription.
The latter design performs an unsafe operation (list subscription) in
response to a request with a safe method (following the link from the
mail message with GET). If the user's mail agent pre-fetched pages to
speed up browsing, the subscription would be confirmed without the
knowledge and consent of the user. The HTTP specification makes it
clear that the fault lies with the server in this case; the user's mail
agent is free to follow links without incurring obligations.
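The first design can be sketched as a request handler in which only POST changes state. This is a minimal sketch, not the behaviour of any particular list software; the handle function, the token scheme, and the in-memory pending and confirmed stores are all assumptions made for the example.

```python
def handle(method, token, pending, confirmed):
    """Return (status, body) for a request to the confirmation page.

    pending maps a hypothetical confirmation token to an address;
    confirmed is the set of subscribed addresses.
    """
    address = pending.get(token)
    if address is None:
        return 404, "unknown confirmation token"
    if method in ("GET", "HEAD"):
        # Safe methods only retrieve: show the form and change nothing.
        return 200, '<form method="POST">[Confirm] your subscription</form>'
    if method == "POST":
        # The user explicitly asked for the action, so perform it here.
        confirmed.add(address)
        del pending[token]
        return 200, "your subscription is confirmed"
    return 405, "method not allowed"
```

Under this arrangement a mail agent that pre-fetches the link with GET receives only the form, and the subscription is confirmed only when the user submits it.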
Obligations of confidentiality, payment, and licensing terms
This is not to say that there are never any obligations related to
following links; only that the obligations must be accepted some other
way than requesting to follow a link.
For confidential materials, a straightforward design is:
1. The client requests access to the materials
2. The server declines, with an "authorization required" notice, and
a link to an account application form
3. The client follows the link to the form, and applies for an
account, agreeing to the terms and conditions in a POST request
(or by fax or postal mail, for that matter)
4. The server provides credentials in response
5. The client re-requests the materials, providing credentials
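The steps above can be sketched as two server-side functions. This is only an illustration under stated assumptions: the function names, the /apply link, the use of status 401 for the "authorization required" notice, and the random token standing in for credentials are all hypothetical.

```python
import secrets


def request_materials(credential, issued):
    """GET handler: retrieval succeeds only with previously issued credentials."""
    if credential in issued:
        return 200, "confidential materials"
    # Declining is safe: the client has incurred no obligation by asking.
    return 401, "authorization required; apply at /apply"


def apply_for_account(accepts_terms, issued):
    """POST handler: the obligation is accepted explicitly in this request."""
    if not accepts_terms:
        return 400, None
    credential = secrets.token_hex(8)  # stand-in for real credentials
    issued.add(credential)
    return 200, credential
```

The point of the split is that agreement to the terms happens in the POST request, never as a side effect of following a link.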
Web sites that say "by following the link to ABC, you agree to XYZ
terms and conditions" do not account for the fact that anyone (in
particular, a search service) can make another link to ABC, and anyone
who follows this other link to ABC may never have seen the terms and
conditions.
Limitations
Web application design should be informed by not only the principles
above, but also the relevant limitations.
The [10]W3C HTML validation service provides an example: the norm is
that validation requests are done by reference; the form uses GET,
which gives the results a URI for bookmarks, links, etc; but the
service also allows clients to upload a document for validation. In
that case, the form uses POST, since
* the document to be validated might be confidential; any link to
the results of validating it would divulge its contents
* a URI that encoded the entire document would be at least as large
as the document, and there's little or no use in linking to it,
since the results will always be the same
[10] http://validator.w3.org/
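The choice between the two form methods might be sketched as follows. The endpoint validator.example.org/check and the "uri" field name are placeholders invented for this example, not a description of the actual W3C service interface.

```python
from urllib.parse import urlencode

# Placeholder endpoint, for illustration only.
ENDPOINT = "http://validator.example.org/check"


def validation_request(uri=None, document=None):
    """Return (method, request_uri, body) for a validation request."""
    if uri is not None:
        # By reference: the result gets its own URI for bookmarks and links.
        return "GET", ENDPOINT + "?" + urlencode({"uri": uri}), None
    if document is not None:
        # Upload: the (possibly confidential, possibly large) document
        # stays in the body rather than being encoded into a URI.
        return "POST", ENDPOINT, document
    raise ValueError("either uri or document is required")
```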
Whether or not GET with HTTP is used for the initial access, supplying
a URI for subsequent access to the same information, e.g., using
Content-Location, is useful.
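One way to picture this is a server that answers a POST and, in the same response, names a URI where the same result can later be retrieved with GET. The sketch below is an assumption-laden toy: the /results/ path scheme, the in-memory store, and both handler names are invented for the example.

```python
import itertools

_ids = itertools.count(1)
stored = {}  # hypothetical in-memory store: path -> result


def handle_post(form_body):
    """Process the submission and advertise a URI for the result."""
    result = "report for " + form_body
    location = "/results/%d" % next(_ids)
    stored[location] = result
    return 200, {"Content-Location": location}, result


def handle_get(path):
    """Safe retrieval of a previously stored result."""
    if path in stored:
        return 200, {}, stored[path]
    return 404, {}, ""
```

A client can bookmark or link to the path given in Content-Location even though the first access was a POST.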
Myths and transitional limitations
Myth: search services won't index anything with a ? in the URI anyway
This was a heuristic to avoid infinite loops in some search
service crawlers, but it was not an architectural constraint,
and modern search services use more sophisticated heuristics to
avoid loops.
Myth: URIs cannot be longer than 256 characters
This was a limitation in some server implementations, and while
servers continue to have limitations to prevent
denial-of-service attacks, they are generally at least 4000
characters, and they evolve as the legitimate uses of
application developers evolve.
Designers of HTML forms that accept non-Western characters have been
challenged by various implementation limitations and gaps in
specifications. For example:
The content type "application/x-www-form-urlencoded" is inefficient
for sending large quantities of binary data or text containing
non-ASCII characters.
[11]multipart/form-data in [12]HTML 4.01
[11] http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2
[12] http://www.w3.org/TR/html401/
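The inefficiency noted in the quotation can be seen by comparing the size of a text in UTF-8 with its size once percent-encoded. This is a small sketch; the helper name encoded_overhead and the field name "q" are made up for the example, and UTF-8 is assumed as the character encoding.

```python
from urllib.parse import urlencode


def encoded_overhead(text):
    """Return (raw_bytes, encoded_chars) for text sent as one form field."""
    raw = len(text.encode("utf-8"))
    # Subtract the "q=" prefix so only the encoded value is counted.
    encoded = len(urlencode({"q": text}, encoding="utf-8")) - len("q=")
    return raw, encoded
```

ASCII letters pass through unchanged, while each non-ASCII byte becomes a three-character %XX escape, so text that is mostly non-ASCII roughly triples in size.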
We expect these limitations to be addressed in future specifications
(@@e.g. XForms?) and deployed in due course.
Acknowledgements
Thanks to David Orchard, Larry Masinter, Paul Prescod, Roy Fielding,
and others for their feedback in response to the [13]15Apr call for
review.
[13] http://lists.w3.org/Archives/Public/www-tag/2002Apr/0150.html
Related work
* Nielsen's [14]1997 rant:
[14] http://www.useit.com/alertbox/9708a.html
There is not much you can do to get users to bookmark your site,
except making it possible to do so: no URL-eating frames, and no
weird one-time-only links that do not work for subsequent visits.
* [15]The Power of the URL-Line, by Jon Udell, August 20, 2001
* (@@cite stats about the popularity of the back button)
*
[15]
http://www.byte.com/documents/s=1113/byt20010816s0002/0820_udell.html
Safety here is regarded as a relative term. Although safety has
been defined as "freedom from those conditions that can cause
death, injury, occupational illness, or damage to or loss of
equipment or property" [MIL-STD-882B 1984], it is generally
recognized that this is unrealistic; by this definition any system
that presents an element of risk is unsafe. ... Unfortunately, the
question of "How safe is safe enough?" has no simple answer.
Leveson, Nancy G. [16]Software safety: why, what and how, ACM
Computing Surveys, June 1986, pages 125-163.
[16] http://doi.acm.org/10.1145/7474.7528
--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 1 May 2002 17:42:39 UTC