- From: Dave Bridger <dbridger@inlink.com>
- Date: Sun, 9 Apr 2000 21:45:48 -0500
- To: <www-html@w3.org>
>Dave J Woolley wrote:
[snip]
> "Space characters are replaced by `+', and
> then reserved characters are escaped as described in [RFC1738], section
       ^^^^^^^^ ^^^^^^^^^^
> 2.2:
> Non-alphanumeric characters are replaced by `%HH', a percent sign and
> two hexadecimal digits representing the ASCII code of the character.
> Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
>
> -- 17.13.4 Form content types
> http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.4.1
>
> That's clear enough, no?
>   0. convert mac/unix/whatever linebreak conventions to internet CRLF
>      if necessary
>   1. replace all ' ' by +
>   2. replace everything but alphanumerics [a-zA-Z0-9] by %HH

Ahh...But therein lies the confusion...

Set aside the question of whether or not you % escape the + with which
you replaced the spaces (first point of confusion).

If you read the referenced RFC1738, you discover that the term "reserved
characters" has a special meaning. Incidentally, the RFC1738 reference
is out of date, since it has been superseded by RFC2396.

From RFC2396, section 2.2. Reserved Characters:

   ...
   reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                 "$" | ","
   ...

From section 2.3. Unreserved Characters:

   ...
   unreserved  = alphanum | mark

   mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

   Unreserved characters can be escaped without changing the semantics
   of the URI, but this should not be done unless the URI is being used
                   ^^^^^^^^^^^^^^^^^^^^^^^
   in a context that does not allow the unescaped character to appear.
   ...

So now we have the second point of confusion--should the "mark"
characters be escaped or not?

If yes, eliminate the reference to the RFC and simply escape all
non-alphanumeric characters.

If no, reword the text to make clear what is meant by reserved
characters.

Since the RFC is permissive about escaping unreserved characters, there
is no problem if the first choice is made, but in either case section
17.13.4 should be clarified.

--Dave Bridger
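
To make the second point of confusion concrete, here is a rough Python sketch of the two readings of section 17.13.4 discussed in this message. It is only an illustration, not text from the HTML 4.01 spec or either RFC. The function name `form_escape` and the `keep_mark` flag are hypothetical. The sketch also assumes ASCII-only input, and it assumes one particular answer to the first point of confusion: spaces become `+` and a literal `+` in the data becomes `%2B`, all in a single pass.

```python
import string

# Byte values of the alphanumerics [a-zA-Z0-9].
ALNUM = set((string.ascii_letters + string.digits).encode("ascii"))
# Byte values of the RFC2396 section 2.3 "mark" characters.
RFC2396_MARK = set(b"-_.!~*'()")


def form_escape(value: str, keep_mark: bool) -> str:
    """Escape one form value (hypothetical helper, ASCII input assumed).

    keep_mark=False: escape every non-alphanumeric character.
    keep_mark=True:  leave the RFC2396 "mark" characters unescaped.
    """
    safe = (ALNUM | RFC2396_MARK) if keep_mark else ALNUM

    # Step 0: normalize CRLF, bare CR, and bare LF line breaks to CRLF.
    text = value.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")

    out = []
    # encode("ascii") raises UnicodeEncodeError on non-ASCII input,
    # which this sketch deliberately does not try to handle.
    for byte in text.encode("ascii"):
        if byte == 0x20:
            out.append("+")                    # Step 1: space -> '+'
        elif byte in safe:
            out.append(chr(byte))              # left as-is
        else:
            out.append("%{:02X}".format(byte))  # Step 2: %HH escape
    return "".join(out)


if __name__ == "__main__":
    sample = "it's (ok)!"
    print(form_escape(sample, keep_mark=False))  # escape all non-alphanumerics
    print(form_escape(sample, keep_mark=True))   # keep "mark" characters
```

Under these assumptions the two readings differ only in how the nine "mark" characters are treated. Since RFC2396 allows unreserved characters to be escaped without changing the meaning of the URI, output from either reading should decode to the same value.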
Received on Sunday, 9 April 2000 22:45:42 UTC