URI templates: comments on -04 draft

Assorted comments on http://tools.ietf.org/html/draft-gregorio-uritemplate-04:

1.
The expansion of {var} is not guaranteed to be free of reserved chars, even when that is what the template author wants. If var's value is a list of strings, the expansion will include (reserved) commas.
I think this will be problematic when a template author uses commas as separators in their URI design.
A template such as "/{name},{org},{role}" should be ok, but the current draft allows unexpected results: the party providing the name value can set the org and role fields by providing a list for the name (which the template author didn't anticipate).
We could blame the template author (and force them to use a different URI design), but a better solution would be for {var} never to introduce commas, with an explicit comma-operator {,var} for comma-separated lists.
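
To make the concern concrete, here is a rough Python sketch. The helper names (expand_simple, expand_template) are hypothetical and only model the behaviour described above (each item %-escaped, list items joined with commas); this is not the draft's full algorithm.

```python
from urllib.parse import quote

def expand_simple(value):
    """Hypothetical model of {var}: %-escape each item, join list items with ','."""
    items = value if isinstance(value, list) else [value]
    return ",".join(quote(item, safe="") for item in items)

def expand_template(values):
    """Expand the template "/{name},{org},{role}" using the model above."""
    return "/" + ",".join(expand_simple(values[k]) for k in ("name", "org", "role"))

# What the template author expects: three comma-separated fields.
expected = expand_template({"name": "alice", "org": "acme", "role": "guest"})

# The party supplying 'name' passes a list instead of a string,
# so its second and third items land in the org and role positions.
injected = expand_template({"name": ["alice", "acme", "admin"], "org": "x", "role": "y"})

print(expected)
print(injected)
```

A server that splits the path on commas and reads the first three fields would see role "admin" in the second case, although only the name value was under that party's control.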

2.
It would be good to add the following as example variables:
  dot := "."
  dotdot := ".."
These values have special meaning in URIs, yet use only unreserved chars (so %-escaping them has no effect).
I suggest simply disallowing these values -- it's not ideal, but better than allowing the URI structure to be morphed by unexpected variable values.

  "A template processor SHALL NOT accept a variable value of "." or ".." if an expansion
  uses the value as a path segment. To allow simpler implementations, a template processor
  MAY reject any "." or ".." value, even in expansions where the dots do not have special meanings."
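
A minimal sketch of such a check in Python, assuming a hypothetical helper check_path_segment that a processor would call before using a value as a path segment:

```python
from urllib.parse import quote

def check_path_segment(value):
    """Hypothetical guard: reject values that would act as dot-segments."""
    if value in (".", ".."):
        raise ValueError("'.' and '..' are not allowed as path segment values")
    return quote(value, safe="")

# %-escaping cannot help here: '.' is unreserved, so the value is unchanged.
escaped_dotdot = quote("..", safe="")
print(escaped_dotdot)

# Ordinary values are escaped as usual.
print(check_path_segment("a b"))
```

The first print shows that ".." passes through escaping untouched, which is why an explicit rejection seems to be the only simple defence.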

3.
Parsing a template and performing expansions in a single pass should not be a design consideration. Templates, like URIs, have limited size (eg <~ 4 KB, never MBs), so no implementation needs single-pass processing, yet keeping it as a goal unnecessarily constrains design choices.
I suggest deleting: "Implementations are able to parse the template and perform the expansions in a single pass" from section 1.3.

4.
Limiting defaults to unreserved chars and %-encodings means a template author cannot be certain when a variable was undefined -- the variable could have been present with the same value as the default.
I suspect knowing when a variable was not defined will often be useful.
I think it would be better for defaults to be allowed to use any chars allowed in URIs (ie copy the default straight into the URI being constructed, just like a literal in the template).
This works best with 1 default per expression, which might imply only 1 variable per expression (which I also think would be a good simplification).
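
A small Python sketch of the ambiguity, and of how a literal default would avoid it. The function expand_with_default is a hypothetical model (one variable, one default per expression), and "*" is just an example of a reserved char used as a default:

```python
from urllib.parse import quote

def expand_with_default(variables, name, default):
    """Hypothetical model: %-escape a defined value, copy the default as-is."""
    value = variables.get(name)
    return default if value is None else quote(value, safe="")

# Default restricted to unreserved chars: the two cases look identical.
undefined_en = expand_with_default({}, "lang", "en")
defined_en = expand_with_default({"lang": "en"}, "lang", "en")

# Default allowed to be a reserved char: a supplied value can never match it,
# because the same char in a value would be %-escaped.
undefined_star = expand_with_default({}, "lang", "*")
defined_star = expand_with_default({"lang": "*"}, "lang", "*")

print(undefined_en, defined_en)
print(undefined_star, defined_star)
```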

5.
A default for a variable that holds name=value pairs (eg an "explode" variable such as an associative array) doesn't make much sense if it can only provide a default value but no default name.
Perhaps defaults for variables with the explode modifier are not really meant to be used, but need to be specified to keep the syntax consistent. Yuck.
A default that applied to the whole expression, not just the variable, would work. Redefining the explode modifier is another alternative (see earlier email).

6.
A template such as http://{userid}.example.com/ fails when the userid has non-ASCII chars, even though these are now valid in domain names. The spec always %-escapes, whereas in some situations (such as a host name) it needs to use punycode instead.
It is not sufficient to define the variable value to already be in punycode -- as the whole point of templates is that the user doesn't know where a variable (such as userid) will go in a URI.
One solution is for a template to build an IRI, then use normal IRI-to-URI rules. Chars allowed in IRIs, but not in URIs, would NOT be %-escaped by the template processor, but might be %-escaped in the subsequent IRI-to-URI mapping.
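
For illustration, a short Python comparison of the two encodings for a host label. The userid value is an invented example; the built-in "idna" codec implements the older IDNA 2003 rules, which is enough to show the difference:

```python
from urllib.parse import quote

userid = "bücher"  # example value only

# What always-%-escape produces: not a usable host name.
percent_escaped_host = quote(userid, safe="") + ".example.com"

# What a host name position actually needs: the punycode (ACE) form.
idna_host = userid.encode("idna").decode("ascii") + ".example.com"

print(percent_escaped_host)
print(idna_host)
```

The same variable value needs a different encoding depending on where the template places it, which the template user cannot know in advance.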

7.
[Section 3.4 "Variable and modifier expansion", 2nd paragraph]
Taking a prefix or suffix should be done BEFORE applying any %-escaping.
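
A quick Python sketch of why the order matters, using an invented value and a prefix length of 4:

```python
from urllib.parse import quote

value = "café"  # example value; the last char needs %-escaping

# Prefix first, then escape: 4 whole characters survive.
prefix_then_escape = quote(value[:4], safe="")

# Escape first, then prefix: the cut lands inside a %-triplet.
escape_then_prefix = quote(value, safe="")[:4]

print(prefix_then_escape)
print(escape_then_prefix)
```

The second result ends in a bare "%", which is not valid in a URI.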

8.
Taking a prefix or suffix of a list expansion AFTER creating the combined string does not play well with taking it BEFORE %-escaping (comment 7): the two requirements pull the processing order in opposite directions.
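
A rough Python sketch of the conflict, assuming list items are joined with commas. Both orderings below are my own illustrations, not text from the draft:

```python
from urllib.parse import quote

items = ["a,b", "c"]  # example list; the first item contains a comma

# Order A: escape each item, join, then take a prefix of the combined string.
# The prefix length now counts escaped chars and can split a %-triplet.
escaped_joined = ",".join(quote(item, safe="") for item in items)
cut_after_escaping = escaped_joined[:2]

# Order B: join, take the prefix, then escape the result.
# The separator commas can no longer be told apart from commas in the data.
raw_joined = ",".join(items)
escaped_after_join = quote(raw_joined, safe="")

print(escaped_joined, cut_after_escaping)
print(raw_joined, escaped_after_join)
```

Neither order gives a clean result, which suggests the prefix would have to apply to each item separately, or not to lists at all.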

9.
The description of the slash operator [section 2.2] says it "separates" path segments. I suggest it say that path segments are "prefixed" by a slash. This keeps the wording consistent with the '.' operator and makes it clearer that an initial slash is included.

--
James Manger

Received on Saturday, 10 April 2010 12:32:16 UTC