- From: Dan Connolly <connolly@w3.org>
- Date: 01 May 2002 16:43:01 -0500
- To: www-tag@w3.org
OK, I've taken a stab at integrating feedback received since 15 Feb:

  DRAFT Findings on when to use GET to make resources addressable
  DRAFT by Dan Connolly, for the TAG
  $Revision: 1.11 $ of $Date: 2002/05/01 21:30:59 $ by $Author: connolly $
  http://www.w3.org/2001/tag/doc/get7

plaintext copy follows, for convenience...

[1]W3C [2]TAG [3]findings

  [1] http://www.w3.org/
  [2] http://www.w3.org/2001/tag/
  [3] http://www.w3.org/2001/tag/findings

DRAFT Findings on when to use GET to make resources addressable

ref. issue [4]whenToUseGet-7

  [4] http://www.w3.org/2001/tag/ilist#whenToUseGet-7

DRAFT by Dan Connolly, for the TAG
$Revision: 1.11 $ of $Date: 2002/05/01 21:30:59 $ by $Author: connolly $

Two principles are central to the design of Web sites and
applications:

  * All important resources should be identifiable by URI.
  * Following references in the web is safe; i.e. agents do not incur
    obligations by following links

It's possible to share information using Web technologies without
giving the information a URI, but it's not optimal. For example, a
product catalog can be built using an HTML form where the client
provides a product number to the server in an HTTP POST request, and
information about the product comes back in the response. But that
design does not allow the client to make a link to the information
about the product, bookmark it, or use it with any of the many Web
technologies (e.g., XSLT's document() function, RDF assertions,
XLink, ...) that depend on info being URI addressable.

HTML forms that use the GET method provide a URI for each combination
of inputs. The relevant section of the HTML specification is:

    The "get" method should be used when the form is idempotent
    (i.e., causes no side-effects). Many database searches have no
    visible side-effects and make ideal applications for the "get"
    method.
    [5]17.13.1 Form submission method of HTML 4.01 (text has been in
    HTML spec back to [6]HTML 2.0)

  [5] http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.1
  [6] http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.2

Unfortunately, the term [7]idempotent is misused there, and the term
[8]side-effects is stretched from its use in the design of
programming languages. The HTTP 1.1 specification is more precise on
the matter:

  [7] http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?idempotent
  [8] http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?query=side+effect

    Implementors should be aware that the software represents the
    user in their interactions over the Internet, and should be
    careful to allow the user to be aware of any actions they might
    take which may have an unexpected significance to themselves or
    others.

    In particular, the convention has been established that the GET
    and HEAD methods SHOULD NOT have the significance of taking an
    action other than retrieval. These methods ought to be considered
    "safe". This allows user agents to represent other methods, such
    as POST, PUT and DELETE, in a special way, so that the user is
    made aware of the fact that a possibly unsafe action is being
    requested.

    Naturally, it is not possible to ensure that the server does not
    generate side-effects as a result of performing a GET request; in
    fact, some dynamic resources consider that a feature. The
    important distinction here is that the user did not request the
    side-effects, so therefore cannot be held accountable for them.

    [9]9.1.1 Safe Methods, HTTP 1.1

  [9] http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1

To elaborate on the principle of following links being safe, consider
the following two designs for mailing list subscription confirmation:

In the first case:

  1. The user sends a subscribe message to an administrative mailbox
     (mylist-request@example.org).
  2. The list processing software requests confirmation by email,
     including a link to a confirmation page.
  3. The user visits the confirmation page, and finds a "[Confirm]
     your subscription" form, with method="POST".
  4. The user activates the [Confirm] form control.
  5. The list processing software confirms the subscription.

In the second case:

  1. as above
  2. as above
  3. The user visits the confirmation page and sees "your
     subscription is confirmed". The list processing software
     confirms the subscription.

The latter design performed an unsafe operation (list subscription)
in response to a request with a safe method (following the link from
the mail message with GET). If the user's mail agent pre-fetched
pages to speed up browsing, the subscription would be confirmed
without the knowledge and consent of the user. The HTTP specification
makes it clear that the fault is with the server in this case; the
user's mail agent is free to follow links without incurring
obligations.

Obligations of confidentiality, payment, and licensing terms

This is not to say that there are never any obligations related to
following links; only that the obligations must be accepted some
other way than requesting to follow a link. For confidential
materials, a straightforward design is:

  1. The client requests access to the materials.
  2. The server declines, with an "authorization required" notice,
     and a link to an account application form.
  3. The client follows the link to the form, and applies for an
     account, agreeing to the terms and conditions in a POST request
     (or by fax or postal mail, for that matter).
  4. The server provides credentials in response.
  5. The client re-requests the materials, providing credentials.

Web sites that say "by following the link to ABC, you agree to XYZ
terms and conditions" do not account for the fact that anyone (in
particular, a search service) can make another link to ABC, and
anyone who follows this other link to ABC may never have seen the
terms and conditions.
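
As a rough illustration of the first mailing list design above (not
part of the finding itself), here is a minimal Python sketch of a
confirmation page that treats GET as retrieval only and performs the
subscription only on POST. The names handle, pending, and subscribed,
the /confirm path, and the token parameter are all hypothetical,
chosen for the example.

```python
from html import escape
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory state: confirmation token -> address awaiting
# confirmation, plus the set of confirmed subscribers.
pending = {"abc123": "user@example.org"}
subscribed = set()


def handle(method, url):
    """Return (status, body) for a request to the confirmation page."""
    parts = urlsplit(url)
    token = parse_qs(parts.query).get("token", [""])[0]
    if parts.path != "/confirm" or token not in pending:
        return 404, "unknown confirmation link"
    if method in ("GET", "HEAD"):
        # Safe method: retrieval only. A mail agent that pre-fetches
        # this link changes nothing on the server.
        form = ('<form method="POST" action="%s">'
                '<input type="submit" value="Confirm"></form>')
        return 200, form % escape(url)
    if method == "POST":
        # Unsafe method: the user explicitly asked for the action.
        subscribed.add(pending.pop(token))
        return 200, "your subscription is confirmed"
    return 405, "method not allowed"
```

Under these assumptions, handle("GET", "/confirm?token=abc123") only
returns the form and leaves subscribed empty; the address is added
only when the same URL is requested with POST.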

Limitations

Web application design should be informed by not only the principles
above, but also the relevant limitations. The [10]W3C HTML validation
service provides an example: the norm is that validation requests are
done by reference; the form uses GET, which gives the results a URI
for bookmarks, links, etc; but the service also allows clients to
upload a document for validation. In that case, the form uses POST,
since

  * the document to be validated might be confidential; any link to
    the results of validating it would divulge its contents
  * a URI that encoded the entire document would be at least as large
    as the document, and there's little or no use in linking to it,
    since the results will always be the same

  [10] http://validator.w3.org/

Whether or not GET with HTTP is used for the initial access,
supplying a URI for subsequent access to the same information, e.g.,
using Content-Location, is useful.

Myths and transitional limitations

Myth: search services won't index anything with a ? in the URI anyway

    This was a heuristic to avoid infinite loops in some search
    service crawlers, but it was not an architectural constraint, and
    modern search services use more sophisticated heuristics to avoid
    loops.

Myth: URIs cannot be longer than 256 characters

    This was a limitation in some server implementations, and while
    servers continue to have limitations to prevent denial-of-service
    attacks, they are generally at least 4000 characters, and they
    evolve as the legitimate uses of application developers evolve.

Designers of HTML forms that accept non-western characters have been
challenged by various implementation limitations and gaps in
specifications. For example:

    The content type "application/x-www-form-urlencoded" is
    inefficient for sending large quantities of binary data or text
    containing non-ASCII characters.
    [11]multipart/form-data in [12]HTML 4.01

  [11] http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2
  [12] http://www.w3.org/TR/html401/

We expect these limitations to be addressed in future specifications
(@@e.g. XForms?) and deployed in due course.

Acknowledgements

Thanks to David Orchard, Larry Masinter, Paul Prescod, Roy Fielding,
and others for their feedback in response to the [13]15Apr call for
review.

  [13] http://lists.w3.org/Archives/Public/www-tag/2002Apr/0150.html

Related work

  * Nielsen's [14]1997 rant:

      [14] http://www.useit.com/alertbox/9708a.html

        There is not much you can do to get users to bookmark your
        site, except making it possible to do so: no URL-eating
        frames, and no weird one-time-only links that do not work for
        subsequent visits.

  * [15]The Power of the URL-Line
    By Jon Udell
    August 20, 2001

      [15] http://www.byte.com/documents/s=1113/byt20010816s0002/0820_udell.html

  * (@@cite stats about the popularity of the back button)

    Safety here is regarded as a relative term. Although safety has
    been defined as "freedom from those conditions that can cause
    death, injury, occupational illness, or damage to or loss of
    equipment or property" [MIL-STD-882B 1984], it is generally
    recognized that this is unrealistic; by this definition any
    system that presents an element of risk is unsafe. ...
    Unfortunately, the question of "How safe is safe enough?" has no
    simple answer.

    Leveson, Nancy G. [16]Software safety: why, what and how, ACM
    Computing Surveys, June 1986, pages 125-163.

  [16] http://doi.acm.org/10.1145/7474.7528

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 1 May 2002 17:42:39 UTC