
Re: URI Templates: done or dead?

From: Mark Nottingham <mnot@mnot.net>
Date: Wed, 17 Sep 2008 10:03:28 +1000
Cc: URI <uri@w3.org>, Joe Gregorio <joe@bitworking.org>, David Orchard <orchard@pacificspirit.com>, Marc Hadley <Marc.Hadley@Sun.COM>
Message-Id: <C3465E84-DDA9-4BDA-BCF0-8DC1C10A3CD5@mnot.net>
To: Roy T. Fielding <fielding@gbiv.com>

On 16/09/2008, at 12:28 PM, Roy T. Fielding wrote:

> I won't use it in its current state because it isn't finished yet.
> The prose is, at best, an outline.  The operators aren't even defined
> in words -- the reader has to guess why they exist.  The examples seem
> to be obsessed with the most irrelevant corner cases instead of teaching
> the common cases first.  And it is far too focused on python language
> as a means of definition.  None of these are technical issues.

Yes, there are a number of editorial issues in -03, but some technical  
decisions need to be made (or the current design agreed to) before  
they can be addressed.

>> I believe there are a few things we can do to make URI Template  
>> more broadly useful and useable, without sacrificing too much  
>> functionality (at least in the 80% case).
>> 1. Reduce or drop operators.
>> As mentioned above, they don't read well; they're obviously  
>> intended for machines, not people. The expansion for a template  
>> should be blindingly obvious, but the operator syntax seems to want  
>> to get in the way rather than help. Furthermore, the vast majority  
>> of use cases for templates are for simple template substitution,  
>> not operations like 'neg' and 'opt'.
> Actually, the vast majority use case is unordered form key=value
> substitution.  Complete path segment replacement is second, followed
> by URI inserts ("insert this value without further encoding").

After simple string substitution, yes.

>> 2. Drop list values.
>> Again, the majority of use cases out there have no need for list  
>> values in template variables, and including them in the spec  
>> significantly complicates things.
> I think it is complicated because the introduction of list-only
> operators (typed functions) is unnecessary.  Complex values can be
> addressed in an orthogonal manner when the value is substituted,
> mainly by defaulting to the most common form, and more complex
> behavior can be defined only when applicable (i.e., a prefix on
> the variable name can indicate how to translate a list into
> numbered parameters or even associative array key=value sets).
> The important thing to note is that compound values are only
> interesting when templates are embedded within computer language
> processing, so we could easily allow such things to be language
> specific by reserving non-alphanumeric prefixes on variable names
> for that purpose.

I agree that typed functions are unnecessary. I can live with  
accommodating lists in the 'standard' operators as long as their  
impact is minimal.

>> 3. Make percent-encoding context-sensitive.
>> There are just too many cases where the 'escape everything but  
>> unreserved' rule gets in the way; for example, if my template is
>> "http://example.com/user/{email}", I'm going to have percent-encoded @ signs in my URIs  
>> whether I like it or not -- even though they're not required to be  
>> percent-encoded there. This is a relatively simple thing to do, as  
>> long as we also...
> URI inserts could do that.  E.g., use {+email} instead of {email}.

The area that I'm concerned about here is the case where a template is  
supplied to someone who inserts a variable into it, and the variable  
can appear anywhere.

For example, if the value "email" is an e-mail address,
"nottingham/mark@example.org", and the following template is supplied;
which would expand to

whereas the template
would expand to
which isn't what's desired;

Now, a perfectly legitimate answer is that people should just decode  
the %40 in the first expansion
<http://example.net/nottingham%2Fmark%40example.org/> and move on.
I'm not yet comfortable doing that, however, for two reasons:

a) from what I've seen, by far the most common case for the use of  
templates is when there is loose coordination between the parties  
generating the templates and doing the variable interpolation, and  
therefore it's difficult to arrange proper encoding out-of-band.

b) while the spurious percent-encoding doesn't matter in some use  
cases (usually, on the server when they're processing the URI), it  
does in others (when the URI is being used as an identifier, and  
therefore compared character-for-character).
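
To make the concern concrete, here is a minimal Python sketch of the three possible encodings of the e-mail value in a path segment. The template shape http://example.net/{email}/ is an assumption inferred from the expansion quoted above, and expand_email is a hypothetical helper, not anything from the draft.

```python
from urllib.parse import quote

email = "nottingham/mark@example.org"

def expand_email(value, safe=""):
    """Substitute value into the assumed template http://example.net/{email}/.

    `safe` lists the characters left unencoded in addition to unreserved ones.
    """
    return "http://example.net/" + quote(value, safe=safe) + "/"

# "Escape everything but unreserved": both "/" and "@" are percent-encoded.
strict = expand_email(email)

# Context-sensitive: "@" is legal in a path segment, "/" would split it.
path_aware = expand_email(email, safe="@")

# Raw insertion: nothing reserved is encoded, so the value adds a path segment.
raw = expand_email(email, safe="/@")

print(strict)      # http://example.net/nottingham%2Fmark%40example.org/
print(path_aware)  # http://example.net/nottingham%2Fmark@example.org/
print(raw)         # http://example.net/nottingham/mark@example.org/

# Point (b): compared character-for-character, these are different identifiers.
print(strict == path_aware)  # False
```

The three results identify the same resource only if every consumer normalises them the same way, which is exactly what loose coordination makes hard to guarantee.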

>> 4. Allow exceptions to percent-encoding.
>> We need a syntax that allows characters to be excepted from  
>> encoding, even in context. As a straw-man, I suggest preceding the  
>> expression with the characters that are excepted, like:
>>   http://example.com/{/path}
>>   http://example.com/thing{?&=query_args}
>> and so forth.
> That is much more complex.  Dynamically changing the transcoding
> algorithm is far more expensive than just using a different operator
> for non-encoded insertion.

It's more complex, yes, but I wouldn't say it's "far more expensive."  
I'm not going to lie down in the road on this one, I just think it's  
going to limit the use cases for templates pretty substantially if we  
don't address what's above.
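
As a rough illustration of the straw-man in point 4, the sketch below treats any non-alphanumeric characters at the start of an expression as the set excepted from percent-encoding. The regular expression, the function name expand_with_exceptions, and the sample values are all assumptions made for this example; the straw-man itself only gives the two templates.

```python
import re
from urllib.parse import quote

# {<excepted characters><variable name>}, e.g. {/path} or {?&=query_args}
EXPRESSION = re.compile(r"\{([^A-Za-z0-9_{}]*)([A-Za-z0-9_]+)\}")

def expand_with_exceptions(template, variables):
    """Expand each expression, leaving the listed characters unencoded."""
    def replace(match):
        excepted, name = match.groups()
        # Undefined variables expand to nothing in this sketch.
        return quote(variables.get(name, ""), safe=excepted)
    return EXPRESSION.sub(replace, template)

print(expand_with_exceptions("http://example.com/{/path}",
                             {"path": "a/b c"}))
# http://example.com/a/b%20c

print(expand_with_exceptions("http://example.com/thing{?&=query_args}",
                             {"query_args": "?a=1&b=x y"}))
# http://example.com/thing?a=1&b=x%20y
```

The only per-expression state is the `safe` string handed to the encoder, which is one way to read the claim that this is more complex but not far more expensive.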

>> 5. If we keep operators at all, mint special ones for the common  
>> cases.
>> E.g., something to handle encoded form query values "out of the box":
>>  http://example.com/thing{-?a=foo&b=bar&c=baz}
>> and likewise with matrix parameters.
> Something like
>    var   = "value";
>    undef = null;
>    empty = "";
>    list  = [ "val1", "val2", "val3" ];
>    keys  = [ "key1", "val1", "key2", "val2", "key3", "val3" ];
>    path  = "/foo/bar";
>    x     = "1024";
>    y     = "768";
> {var}                     value
> {var=default}             value
> {undef=default}           default
> {var:3}                   val
> {x,y}                     1024,768
> {?x,y}                    ?x=1024&y=768
> {?x,y,empty}              ?x=1024&y=768&empty=
> {?x,y,undef}              ?x=1024&y=768
> {;x,y}                    ;x=1024;y=768
> {;x,y,empty}              ;x=1024;y=768;empty
> {;x,y,undef}              ;x=1024;y=768
> {/list,x}                 /val1/val2/val3/1024
> {+path}/here              /foo/bar/here
> {+path,x}/here            /foo/bar,1024/here
> {+path}{x}/here           /foo/bar1024/here
> {+empty}/here             /here
> I think the above covers all of the common cases without making
> the uncommon cases impossible.  The common case is that the delimiters
> (";", "?", and "/") are omitted when none of the listed variables are
> defined, which matches good URI practice.  Likewise, the substitution
> handler for ";" (path parameters) will omit the "=" when its value is empty,
> whereas the handler for "?" (form queries) will not omit the "=".
> Multiple variables and list values have their values joined with ","
> if there is no predefined joining mechanism for the operator.
> I think this mechanism is simple and readable when used with simple
> examples because the single-character operators match the URI generic
> syntax delimiters.  Only one operator inserts unencoded values; all
> of the others encode any characters other than unreserved.

+1, and I'd specifically like to see Joe's response to this proposal.
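
For readers who want to check the table in Roy's proposal mechanically, here is a minimal Python sketch that reproduces its expansions. It is one reading of the quoted rules, not an implementation of any draft: the OPERATORS table, the function names, and the choice to spread list items across the operator's own separator for "/" are assumptions drawn from the examples. The "@" and "%" variable prefixes are not handled.

```python
import re
from urllib.parse import quote

RESERVED = ":/?#[]@!$&'()*+,;="

# operator -> (prefix, separator, named?, keep "=" when empty?, unencoded chars)
OPERATORS = {
    "":  ("",  ",", False, False, ""),
    "+": ("",  ",", False, False, RESERVED),  # the one unencoded insert
    "/": ("/", "/", False, False, ""),
    ";": (";", ";", True,  False, ""),
    "?": ("?", "&", True,  True,  ""),
}

def expand_expression(expr, variables):
    """Expand the inside of one {...} expression."""
    op = expr[0] if expr[0] in "+/;?" else ""
    prefix, sep, named, keep_eq, safe = OPERATORS[op]
    parts = []
    for spec in expr[len(op):].split(","):
        spec, _, default = spec.partition("=")   # {var=default}
        name, _, length = spec.partition(":")    # {var:3}
        value = variables.get(name)
        if value is None:
            value = default or None
        if value is None:
            continue  # undefined variables contribute nothing
        if isinstance(value, list):
            items = [quote(v, safe=safe) for v in value]
        else:
            items = [quote(value[:int(length)] if length else value, safe=safe)]
        if named:
            joined = ",".join(items)
            parts.append(f"{name}={joined}" if joined or keep_eq else name)
        else:
            parts.extend(items)
    # The delimiter is omitted when none of the listed variables are defined.
    return prefix + sep.join(parts) if parts else ""

def expand(template, variables):
    """Expand every {...} expression in a template."""
    return re.sub(r"\{([^{}]+)\}",
                  lambda m: expand_expression(m.group(1), variables),
                  template)

variables = {
    "var": "value", "undef": None, "empty": "",
    "list": ["val1", "val2", "val3"],
    "path": "/foo/bar", "x": "1024", "y": "768",
}

print(expand("{?x,y,empty}", variables))   # ?x=1024&y=768&empty=
print(expand("{;x,y,empty}", variables))   # ;x=1024;y=768;empty
print(expand("{/list,x}", variables))      # /val1/val2/val3/1024
print(expand("{+path,x}/here", variables)) # /foo/bar,1024/here
```

The whole operator set fits in a five-row table plus one loop, which supports the point that single-character operators matching the URI delimiters keep the common cases simple.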

> The mechanism does become harder to read when we do very unusual
> things and add all the bells and whistles, like
> {var,undef,empty,list}    value,,val1,val2,val3
> {/var:3,undef,list,empty} /val/val1/val2/val3/
> {;var,undef,empty,list}   ;var=value;empty;list=val1,val2,val3
> {?var,undef,empty,list}   ?var=value&empty=&list=val1,val2,val3
> {?var,undef,empty,@list}  ?var=value&empty=&list1=val1&list2=val2&list3=val3
> {?var,undef,empty,%keys}  ?var=value&empty=&key1=val1&key2=val2&key3=val3
> but we don't need to care if complex cases are hard to read.
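
The two prefixed forms in the quoted examples can be sketched in Python as follows, assuming "@" means "number each list item after the variable name" and "%" means "read a flat list as alternating keys and values". Both readings are inferred from the example output only, and the helper names are hypothetical.

```python
from urllib.parse import quote

def explode_numbered(name, values):
    """@list: one name<N>=value pair per item, numbered from 1."""
    return [f"{name}{i}={quote(v, safe='')}"
            for i, v in enumerate(values, start=1)]

def explode_pairs(flat):
    """%keys: treat [k1, v1, k2, v2, ...] as key=value pairs."""
    pairs = zip(flat[0::2], flat[1::2])
    return [f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in pairs]

items = ["val1", "val2", "val3"]
keys = ["key1", "val1", "key2", "val2", "key3", "val3"]

print("?" + "&".join(["var=value", "empty="] + explode_numbered("list", items)))
# ?var=value&empty=&list1=val1&list2=val2&list3=val3

print("?" + "&".join(["var=value", "empty="] + explode_pairs(keys)))
# ?var=value&empty=&key1=val1&key2=val2&key3=val3
```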


Mark Nottingham     http://www.mnot.net/
Received on Wednesday, 17 September 2008 00:04:09 UTC
