Re: URI and IRI Templating (What did I get myself into?)

On 2006/12/23, at 5:49 AM, Joe Gregorio wrote:
> For both URIs and IRIs the 'reserved' set of characters are the  
> ones that are going to cause trouble and need to be escaped.

Subcomponent delimiters are potentially scheme-specific, so you
can't say that they're *always* encoded. For example, HTTP query  
strings allow '&' and '='; it's only HTML form encoding that adds the  
additional semantics to those delimiters.
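
To make the point concrete, here is a minimal Python sketch using the standard library's `urllib.parse.quote`. The sample string and the variable names are assumptions chosen for illustration. The same input is legitimately encoded two different ways, depending on whether the caller treats '&' and '=' as data or as delimiters:

```python
from urllib.parse import quote

raw = "a=1&b=2"

# The caller treats '&' and '=' as plain data, e.g. a single form field value,
# so both must be percent-encoded.
as_data = quote(raw, safe="")

# The caller treats '&' and '=' as delimiters that already carry meaning,
# so both are left alone.
as_delims = quote(raw, safe="&=")

print(as_data)    # a%3D1%26b%3D2
print(as_delims)  # a=1&b=2
```

Neither result is wrong in itself; only the party that knows the intended semantics can pick between them.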

From RFC 3986, Section 2.2:

>    URIs include components and subcomponents that are delimited by
>    characters in the "reserved" set.  These characters are called
>    "reserved" because they may (or may not) be defined as delimiters by
>    the generic syntax, by each scheme-specific syntax, or by the
>    implementation-specific syntax of a URI's dereferencing algorithm.

Note the "or may not." The problem here is that there are many places  
that information about whether to encode can come from, and it's very  
difficult to come up with a generic rule that does the right thing  
*and* doesn't effectively profile URIs by taking options away.

I'd much rather leave encoding up to the user*, and/or give them
some tools that make it easy, but leave the choices in their hands.
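
As a rough illustration of what such a tool might look like, the Python sketch below expands `{name}` placeholders and lets the caller decide which reserved characters to leave unescaped. The function name `expand`, its `safe` parameter, and the `{name}` placeholder syntax are all hypothetical and are not taken from any templating draft:

```python
import re
from urllib.parse import quote

def expand(template, values, safe=""):
    """Replace each {name} with values[name], percent-encoding everything
    except unreserved characters and the characters listed in `safe`.

    Hypothetical helper: the caller, not the tool, decides what is safe.
    """
    def substitute(match):
        return quote(str(values[match.group(1)]), safe=safe)
    return re.sub(r"\{(\w+)\}", substitute, template)

template = "http://example.org/{path}?q={term}"
values = {"path": "a/b", "term": "x&y"}

strict = expand(template, values)             # encode every reserved character
lenient = expand(template, values, safe="/")  # caller chooses to keep '/'

print(strict)   # http://example.org/a%2Fb?q=x%26y
print(lenient)  # http://example.org/a/b?q=x%26y
```

The default encodes everything reserved, but nothing is taken away: whoever writes or fills in the template can widen `safe` when the scheme or application gives a character delimiter meaning.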

Cheers,

* Here, the user might be the party putting the template together, or
the party filling it out.

--
Mark Nottingham     http://www.mnot.net/

Received on Tuesday, 26 December 2006 12:32:43 UTC