- From: Mark Nottingham <mnot@mnot.net>
- Date: Tue, 26 Dec 2006 23:32:47 +1100
- To: Joe Gregorio <joe@bitworking.org>
- Cc: uri@w3.org
On 2006/12/23, at 5:49 AM, Joe Gregorio wrote:

> For both URIs and IRIs the 'reserved' set of characters are the
> ones that are going to cause trouble and need to be escaped.

Subcomponent delimiters are potentially scheme-specific, so you can't say that they're *always* encoded. For example, HTTP query strings allow '&' and '='; it's only HTML form encoding that adds the additional semantics to those delimiters.

From 3986, 2.2:

> URIs include components and subcomponents that are delimited by
> characters in the "reserved" set. These characters are called
> "reserved" because they may (or may not) be defined as delimiters by
> the generic syntax, by each scheme-specific syntax, or by the
> implementation-specific syntax of a URI's dereferencing algorithm.

Note the "or may not."

The problem here is that there are many places that information about whether to encode can come from, and it's very difficult to come up with a generic rule that does the right thing *and* doesn't effectively profile URIs by taking options away.

I'd much rather either leave encoding up to the user*, and/or give them some tools that make it easy, but leave the choices in their hands.

Cheers,

* Here, the user might be the party putting the template together, or the party filling it out.

--
Mark Nottingham http://www.mnot.net/
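
To illustrate the point about leaving the choice in the user's hands, here is a minimal Python sketch. The helper name `encode_value` and its `keep` parameter are hypothetical, made up for this example; only `urllib.parse.quote` is a real standard-library call. It is one possible shape for such a tool, not a proposal for how templates should behave.

```python
from urllib.parse import quote


def encode_value(value: str, keep: str = "") -> str:
    """Percent-encode `value`, leaving the characters in `keep` untouched.

    Hypothetical helper: the caller decides which reserved characters
    act as delimiters in their context and passes them in `keep`.
    """
    return quote(value, safe=keep)


raw = "a=1&b=2"

# Caller treats '&' and '=' as delimiters, so they are left alone.
as_delimited_query = encode_value(raw, keep="&=")

# Caller treats the whole string as opaque data, so they are escaped.
as_opaque_data = encode_value(raw)

print(as_delimited_query)
print(as_opaque_data)
```

The same input comes out differently depending on what the caller says about '&' and '=', which is the information a generic rule would not have.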
Received on Tuesday, 26 December 2006 12:32:43 UTC