Re: URI Templates, percent-encoding, bnfs and working code

Joe Gregorio wrote:
> [snip] 
> This is unsatisfactory for a lot of reasons, mostly related to how functional
> the spec actually is. There is a large set of cases I think URI Templates can
> be used for and I don't think the very simple templating mechanism defined
> covers nearly enough cases. I also think that the story around
> percent-encoding is hopelessly mired. For example, here are some examples
> that I hope show that the current {var} system is inadequate:

I'm rather sad that I have to agree.

> [snip] 
> We can keep our simple {var} expansion, but let's add in
> a default value:
> 
>     {var=default}
>          Simple substitution
> 
>     Example:
> 
>     URI Template
>        http://example.org/{fruit=orange}/
>     Template Var
>        fruit = "apple"
>     URI
>        http://example.org/apple
> 

Based on the template, this would actually expand to

  http://example.org/apple/

(note the trailing forward slash)

>     URI Template
>         http://example.org/{fruit=orange}/
>     Template Var
>         fruit is undefined
>     URI
>         http://example.org/orange
> 

Likewise, this would expand to

  http://example.org/orange/
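
A minimal Python sketch of the {var=default} rule shows where the trailing
slash comes from. The helper name expand_simple is hypothetical, the sketch
handles only the {var} and {var=default} forms, and it does no
percent-encoding at all:

```python
import re

# Hypothetical helper: matches {name} or {name=default} only.
_SIMPLE = re.compile(r"\{([A-Za-z0-9_]+)(?:=([^}]*))?\}")


def expand_simple(template, variables):
    """Replace each {var=default} with the variable's value or its default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = variables.get(name)
        if value is None:
            # Undefined variable: fall back to the default, or to "".
            value = default if default is not None else ""
        return value

    # Everything outside the braces, including the final "/", is copied as-is.
    return _SIMPLE.sub(substitute, template)


print(expand_simple("http://example.org/{fruit=orange}/", {"fruit": "apple"}))
print(expand_simple("http://example.org/{fruit=orange}/", {}))
```

Both results keep the slash that follows the closing brace in the template.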


> 
>     {<prefix|var[=default]}
>          Prefix var with prefix, emit empty string if
>          var is empty or undefined.
> 
>     URI Template
>         bar{</|var}/
>     Template Var
>         var := foo
>     URI
>         bar/foo/
> 

Again, we need to watch the trailing forward slashes.  If var is empty
or undefined, we end up with "bar/".
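
To make the empty/undefined case concrete, here is a rough sketch of the
prefix rule as I read it. The function name expand_prefix is made up for
illustration, and None stands in for "undefined":

```python
def expand_prefix(prefix, value):
    """{<prefix|var}: emit prefix + value, or "" if value is empty/undefined."""
    if not value:
        return ""
    return prefix + value


# Template: bar{</|var}/  (the final "/" is literal template text)
print("bar" + expand_prefix("/", "foo") + "/")  # var := foo
print("bar" + expand_prefix("/", None) + "/")   # var undefined
print("bar" + expand_prefix("/", "") + "/")     # var empty
```

The literal slash after the closing brace is emitted in every case, which is
why the undefined and empty cases both give "bar/".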

> 
>     {>postfix|var[=default]}
>          Append var with postfix, emit empty string if
>          var is empty or undefined.
> 
>     URI Template
>         bar/{>#home|var}
>     Template Var
>         var := foo
>     URI
>         bar/foo#home
> 

My first thought on seeing this was to wonder if we'd need some way of
indicating both a prefix and postfix for a single variable.

> 
>     {,sep|var1=def1, var2=def2, ...}
>            Substitute the concatenation of variable name,
>            "=", variable value. Join more than one var by the value
>            of 'sep'.
> 
>     URI Template
>         {,&|name,location,age}
>     Template Var
>         name := joe
>         location := NYC
>     URI
>          name=joe&location=NYC
> 

You mention later that it's not clear how optional and multiple
variables could be handled.  Using ? and * is an option, e.g.

 {,&|?name,location,*age}

which would indicate that name is optional, there must be one location
parameter, and there may be many age parameters.

  name := undefined
  location := NYC
  age := [12,13,14]

expands to location=NYC&age=12&age=13&age=14
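
A sketch of how the ? and * markers could be interpreted, purely as an
illustration of the idea above. The function name expand_pairs, the use of
None for "undefined", and raising an error for a missing required variable
are all my assumptions, and no percent-encoding is done:

```python
def expand_pairs(sep, spec, variables):
    """Expand a {,sep|...} name list whose names may carry '?' or '*'.

    '?name' -> optional: skipped when undefined
    '*name' -> zero or more values (a list): one name=value pair per value
    'name'  -> required: exactly one value
    """
    pairs = []
    for item in spec.split(","):
        item = item.strip()
        marker = item[0] if item[0] in "?*" else ""
        name = item[len(marker):]
        value = variables.get(name)
        if marker == "*":
            values = [] if value is None else list(value)
        elif value is None:
            if marker == "?":
                continue
            raise KeyError(f"required variable {name!r} is undefined")
        else:
            values = [value]
        pairs.extend(f"{name}={v}" for v in values)
    return sep.join(pairs)


print(expand_pairs("&", "?name,location,*age",
                   {"location": "NYC", "age": [12, 13, 14]}))
```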

> 
>      {&sep|var}
>           Treat var as a list and join the values in the list
>           with the given separator. Emit empty string if var
>           is empty or undefined
>      URI Template
>           {&/|segments}
>      Template Var
>           segments := ["a", "b", "c"]
>      URI
>           a/b/c
> 

I like this, but what about a prefix and postfix? For example, I want the URL
template to produce either http://example.org/home or a listing of
segments:

  http://example.org/home
  http://example.org/home/a
  http://example.org/home/a/b
  http://example.org/home/a/b/c

We can use the template...

  http://example.org/home/{&/|segments}

But this would always output the / following home. What if I want to
make that / conditional on the evaluation of the template var?  In
theory I could add an empty string as the first element in the segment
array and omit the slash from the template:

URI Template
  http://example.org/home{&/|segments}

Template Var
  segments := ["","a","b","c"]

URI

  http://example.org/home/a/b/c
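
The empty-first-element trick works because joining a list places the
separator between every pair of elements, including between "" and "a".
A small sketch (expand_list is a hypothetical name for the {&sep|var} rule):

```python
def expand_list(sep, values):
    """{&sep|var}: join list values with sep; "" if empty or undefined."""
    if not values:
        return ""
    return sep.join(values)


base = "http://example.org/home"
print(base + expand_list("/", ["", "a", "b", "c"]))
print(base + expand_list("/", []))
```

The drawback is that the caller has to remember to pad the list, so the
template alone no longer describes the URI.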

> 
> 
>     {?opt|var}
>           Inserts opt if var is a string or non-zero length list.
> 
>    URI Template
>         {?/|segments}
>    Template Var
>        segments := ["a", "b", "c"]
>    URI
>         /
> 

Hmm... OK, I see, this covers the previous case:

URI Template

  http://example.org/home{?/|segments}{&/|segments}

Template Var
  segments := ["a","b","c"]

URI
  http://example.org/home/a/b/c
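
As a sanity check, a sketch of the two expansions used together. The function
names are hypothetical, and treating any string (even an empty one) as
"present" is my literal reading of the {?opt|var} description, not something
the proposal spells out:

```python
def expand_opt(opt, value):
    """{?opt|var}: emit opt if var is a string or a non-empty list."""
    if isinstance(value, str):
        return opt
    if isinstance(value, (list, tuple)) and len(value) > 0:
        return opt
    return ""


def expand_list(sep, values):
    """{&sep|var}: join list values with sep; "" if empty or undefined."""
    return sep.join(values) if values else ""


def home_uri(segments):
    # Template: http://example.org/home{?/|segments}{&/|segments}
    return ("http://example.org/home"
            + expand_opt("/", segments)
            + expand_list("/", segments))


print(home_uri(["a", "b", "c"]))
print(home_uri([]))
```

With an empty list neither expansion emits anything, so the slash after
"home" disappears without padding the list.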

> [snip]
> In the example from Google search, all variable names in the {,}
> expansion are optional, i.e.
> none of those variables need be defined.
> 
>      http://www.google.com/search?q={,&|term,num}
> 

This example doesn't look quite right. If term=foo and num=2, wouldn't
this expand to:

  http://www.google.com/search?q=term=foo&num=2

I would think the correct template would be:

  http://www.google.com/search{??|term,num}{,&|term,num}
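
A sketch of what that template would produce. It assumes {??|term,num} means
"emit ? if any of the listed variables is defined", which the proposal does
not actually define for more than one variable; search_uri is a made-up name
and no percent-encoding is done:

```python
def search_uri(variables):
    # Template: http://www.google.com/search{??|term,num}{,&|term,num}
    names = ["term", "num"]
    defined = [(n, variables[n]) for n in names
               if variables.get(n) is not None]
    query = "&".join(f"{n}={v}" for n, v in defined)
    # Assumption: the "?" is emitted only when at least one variable is set.
    return "http://www.google.com/search" + ("?" if defined else "") + query


print(search_uri({"term": "foo", "num": 2}))
print(search_uri({"term": "foo"}))
print(search_uri({}))
```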

> Internationalization is also covered:
> 
>      http://www.google.com/search?q={term}
>      term := Îñţérñåţîöñåļîžåţîöñ
>       http://www.google.com/search?q=%C3%8E%C3%B1%C5%A3%C3%A9r%C3%B1%C3%A5%C5%A3%C3%AE%C3%B6%C3%B1%C3%A5%C4%BC%C3%AE%C5%BE%C3%A5%C5%A3%C3%AE%C3%B6%C3%B1
> 

While I'm perfectly happy restricting things to UTF-8, I'm wondering if
there isn't a simple means by which we can explicitly establish an
encoding within the template language. For instance, something like:

  {.UTF-8}http://www.google.com/search?q={term}
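
To show why the encoding matters, here is a small sketch using Python's
urllib.parse.quote, which takes an encoding argument. The expand_term helper
and the idea of passing the template's declared encoding into it are
illustrative assumptions only:

```python
from urllib.parse import quote


def expand_term(term, encoding="utf-8"):
    """Percent-encode term using the bytes of the given character encoding."""
    return ("http://www.google.com/search?q="
            + quote(term, safe="", encoding=encoding))


# "caf\u00e9" is "cafe" with an acute accent on the final e.
print(expand_term("caf\u00e9"))             # UTF-8: two bytes for the e-acute
print(expand_term("caf\u00e9", "latin-1"))  # Latin-1: one byte
```

The same variable value yields different URIs depending on the encoding, so
a consumer of the template has to know which one was intended.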

> [snip] 
> So this system is pretty capable without crossing over into
> the realm of Turing Complete. On the other hand, it is not
> without fault:
> 
>    1. Doesn't handle repeated query parameters.
>    2. Doesn't specify if variables are mandatory or optional.
>    3. Doesn't handle encodings besides UTF-8.

I think these can be easily dealt with as demonstrated above.

>    4. Template language is complex, cryptic.

It doesn't look like we're going to be able to get around this.

>    5. No handling of input validation, enums, ranges, etc.

I'm definitely not convinced that this is even going to be necessary.

>    6. Possible to define a self-inconsistent URI Template:
>          1. {&|fred}{<#|fred}

Stupid is as stupid does.

>    7. Prefixes and suffixes are redundant, as
>         they could be handled by using the '?' expansion.

Yes, but doing so does increase complexity somewhat.  I'll have to stew
on this one.

>    8. Comma expansions could have two strings, one to separate
>         name-value pairs (as now), the other to separate names from
>         values (now hard-coded to "=").

This would be simple if we could have a rule that separators had to be a
single character, e.g.

  {,&=|?name,location,*age}

  name=a&location=b&age=c

  {,/;|?name,location,age}

  name;a/location;b/age;c
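
A sketch of the single-character rule: the first character separates pairs
and the second separates a name from its value. The function name and the
choice to skip undefined variables silently are assumptions, and the ? and *
markers are left out to keep it short:

```python
def expand_pairs_two_sep(seps, names, variables):
    """Expand {,XY|names}: X joins pairs, Y joins each name to its value."""
    if len(seps) != 2:
        raise ValueError("expected exactly two separator characters")
    pair_sep, kv_sep = seps
    return pair_sep.join(
        f"{name}{kv_sep}{variables[name]}"
        for name in names
        if name in variables
    )


values = {"name": "a", "location": "b", "age": "c"}
print(expand_pairs_two_sep("&=", ["name", "location", "age"], values))
print(expand_pairs_two_sep("/;", ["name", "location", "age"], values))
```

Fixing both separators at one character each keeps the parse unambiguous,
at the cost of ruling out multi-character separators.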

>    9. Sensible defaults need to be invented to deal with parameter values
>        that are lists when not expected to be (or are not lists when
>        expected to be) (see #6).

I've got no suggestions.

>   10. No specification for how to handle IRIs beyond "Turn an IRI Template
>        into a URI Template and then proceed."

This has been on my todo list for a while.  Let me stew on this for a
few days.

>   11. Need way to say "Insert this if some/none of these variables exist"
>         to strip trailing "?" from URIs with no parameters.
> 

No suggestions.

> [snip] 
> And finally to those of you thinking to yourself, "that would be so
> much better as working code", I present:
> 
>   http://code.google.com/p/uri-templates/
> 
> A Python implementation, with unit tests, requires 'tpg', the Toy
> Parser Generator.
> 

What, no Erlang?

> [snip] 
> So it's clear, I don't believe this is a final or complete solution, but I think
> it's a good start and at least proves that expansions are a viable
> solution to the percent-encoding issue.
> 

Thank you for pushing forward on this.  I'll see if I can get a Java
impl going on this next week.

- James

>     Thanks,
>     -joe
> 

Received on Saturday, 13 October 2007 04:05:01 UTC