[Bug 1502] [F&O] escape-uri encompasses & s/b split into 2 distinct functions

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1502

------- Additional Comments From timbl@w3.org  2005-06-17 02:05 -------
Michael (Kay), the two functions are *quite different* as I understand it.  It is not that one operates on 
a part and the other on a whole URI.  You can feed a whole or partial URI to either.

encode-for-uri(s) takes ANY STRING (not necessarily with any relation to a URI) and encodes it as 
something which can be transferred as a path segment.  It is an encoding in that there is a corresponding 
decode.  If you use it twice, then you get something double-encoded.  Example: use when encoding a 
string argument to an HTML-form-style query.

clean(s) takes a URI (or part) and just cleans it up so that any unacceptable characters are encoded in 
ASCII.  It doesn't encode anything which is already encoded.  There is no inverse function, as you can't 
tell which characters were not clean in the original string.  If you use it twice, it's the same as 
using it once.  Example: use when encoding an IRI for transmission in HTTP.
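
To make the contrast concrete, here is a rough Python sketch of the two behaviours.  The names 
encode_for_uri and clean are stand-ins for the functions under discussion, and the set of characters 
that clean leaves alone is my assumption for illustration, not the F&O definition.

```python
from urllib.parse import quote, unquote

# Assumption: clean() leaves URI reserved characters and "%" untouched.
# quote() never escapes ASCII letters, digits, or "_.-~" regardless of `safe`.
_CLEAN_SAFE = "!#$&'()*+,/:;=?@[]%"

def encode_for_uri(s: str) -> str:
    """Encode ANY string so it can travel as one path segment or parameter."""
    return quote(s, safe="")

def clean(s: str) -> str:
    """Escape only unacceptable characters; existing %XX escapes are kept."""
    return quote(s, safe=_CLEAN_SAFE)

raw = "a/b c"
once = encode_for_uri(raw)      # "/" and " " are escaped
twice = encode_for_uri(once)    # the "%" signs are escaped again: double-encoded
decoded = unquote(once)         # the corresponding decode recovers the input

iri = "http://example.org/résumé file?q=a%20b"
cleaned = clean(iri)            # non-ASCII and space escaped, "%20" left alone
cleaned_again = clean(cleaned)  # no further change: clean is idempotent
```

Note that unquote(cleaned) would not give back the original iri in general, since it cannot tell 
the pre-existing "%20" from the space that clean escaped.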

Why would you want to perform both operations?  The result of encode-for-uri will always be clean, so 
performing a clean() on it will have no effect.  The result of cleaning a URI will be a clean URI, which one may 
want to then encode as a URI-encoded parameter within a new query URI being built up.  But that is a 
separate function, and should be programmed as such.
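
A sketch of that last case, again in Python with hypothetical stand-ins for the two functions (the 
safe-character set and the example.com service URL are assumptions made up for illustration): the 
cleaned URI is encoded as a parameter in a separate, explicit step.

```python
from urllib.parse import quote, unquote

def encode_for_uri(s: str) -> str:
    # Stand-in: escape everything except unreserved characters.
    return quote(s, safe="")

def clean(s: str) -> str:
    # Stand-in: escape only unacceptable characters, keep existing escapes.
    return quote(s, safe="!#$&'()*+,/:;=?@[]%")

# Step 1: clean the target URI.
target = clean("http://example.org/résumé file")

# Step 2, a separate operation: encode the clean URI as a query parameter.
query_uri = "http://example.com/fetch?url=" + encode_for_uri(target)

# The receiving side decodes the parameter once to recover the clean URI.
param = query_uri.split("url=", 1)[1]
recovered = unquote(param)
```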

Received on Friday, 17 June 2005 02:05:09 UTC