Re: URI components question

michele vivoda wrote:

> My conclusion is that (at least) query component cannot be
> unescaped. Is this right, does it apply only to query or
> unescaped components should not exist at all ?

Everywhere.  The standard says query = *( pchar / "/" / "?" )

In other words you can use "/", "?", and any unescaped pchar
directly without percent-escapes.  A parser that has found the
query is no longer interested in the "?" that starts a query,
nor in the "/" used in the path before this "?".
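
A small sketch of this point, using Python's urllib.parse (my
choice of tool, and a made-up example.org URI, neither is from
the original question).  Only the first "?" ends the path; any
later "/" or "?" simply stays in the query:

```python
from urllib.parse import urlsplit

# Hypothetical URI, chosen only to show "/" and "?" inside the query.
parts = urlsplit("http://example.org/a/b?x=/c/d?e")

path = parts.path    # everything before the first "?"
query = parts.query  # everything after it, later "/" and "?" included

print(path)   # /a/b
print(query)  # x=/c/d?e
```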

If you check pchar you'll find that it allows ":" and "@"
directly, for similar reasons: the only places where ":" and
"@" are relevant are before the path / query.

But once the parser has reached the query it still has to find
its end, e.g. ">", '"', "#", or white space.  These characters
must be escaped if they are part of the query; pchar doesn't
contain them directly.
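
To make the rule concrete, here is a rough checker for the
query production, assuming the pchar definition of RFC 3986
(unreserved, pct-encoded, sub-delims, ":" and "@").  The names
QUERY_RE and is_valid_query are invented for this sketch:

```python
import re

# query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
QUERY_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)

def is_valid_query(query: str) -> bool:
    """Return True if the whole string matches the query production."""
    return QUERY_RE.fullmatch(query) is not None

print(is_valid_query("p1=R%26D&p2=q"))  # True
print(is_valid_query("a/b?c:d@e"))      # True
print(is_valid_query('say "hi"'))       # False
```

The last call fails on both the space and the '"', which would
have to be written as %20 and %22.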

But you can use "&" and "=" directly in a query, as in your
example (p1=R%26D&p2=q).  The issue is that the standard does
not define an internal structure for queries; it could be
anything, depending on the scheme and / or server.

E.g. for http some servers accept ";" instead of "&" to
delimit parameters (key=value or simply value).  So if you
send query strings to servers where "&", ";", and "=" have a
meaning as delimiters, you must leave them unescaped where
you want them to act as delimiters, and escape them
everywhere else.
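
As an illustration only, a server-side splitter that treats
both "&" and ";" as delimiters might look like the sketch
below.  The function name split_params is hypothetical, and
returning an empty value for a singleton is my own choice,
not something the standard prescribes:

```python
import re
from urllib.parse import unquote

def split_params(query: str) -> list[tuple[str, str]]:
    """Split on unescaped "&" or ";", then decode each key and value."""
    pairs = []
    for piece in re.split(r"[&;]", query):
        if not piece:
            continue  # skip empty pieces such as in "a=1&&b=2"
        # A singleton (no "=") ends up as (piece, "") here.
        key, _, value = piece.partition("=")
        pairs.append((unquote(key), unquote(value)))
    return pairs

print(split_params("a=1;b=2&c=x%3By"))
# [('a', '1'), ('b', '2'), ('c', 'x;y')]
```

The escaped %3B survives the split and only becomes ";" after
the delimiters have been handled.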

In your example you have a value R&D for p1.  Because "&" is
used to delimit parameters in your query, you need p1=R%26D
there.  You would also %-encode "=" or ";" within keys or
values of queries sent to normal http-servers.
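
A minimal sketch of building such a query in Python, assuming
"&" and "=" are the delimiters the server expects.  Passing
safe="" to quote() makes it escape every reserved character
inside keys and values:

```python
from urllib.parse import quote, parse_qsl

params = [("p1", "R&D"), ("p2", "q")]

# Escape each key and value on its own, then join with the delimiters.
query = "&".join(
    quote(key, safe="") + "=" + quote(value, safe="")
    for key, value in params
)

print(query)             # p1=R%26D&p2=q
print(parse_qsl(query))  # [('p1', 'R&D'), ('p2', 'q')]
```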

Actually you could get away with  p1=R&D&p2=q  if "&" is not
allowed in key names, and if singleton values (no key =) are
also not allowed (excl. the special "isindex" query):

"p1=R" plus "D&p2=q" would give an invalid key "D&p2"
"p1=R" plus "D" plus "p2=q" would give an invalid singleton "D"
"p1=R&D" plus "p2=q" would be the last chance to make sense of it.
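
That guesswork could be sketched as follows.  This is a
hypothetical lenient parser (lenient_pairs is an invented
name), not what any particular server does: a piece without
"=" cannot be a parameter of its own, so it is glued back onto
the previous value.

```python
def lenient_pairs(query: str) -> list[tuple[str, str]]:
    """Parse key=value pairs, forbidding "&" in keys and singletons."""
    pairs: list[tuple[str, str]] = []
    for piece in query.split("&"):
        if "=" in piece:
            key, value = piece.split("=", 1)
            pairs.append((key, value))
        elif pairs:
            # No "=": treat the "&" as part of the previous value.
            key, value = pairs[-1]
            pairs[-1] = (key, value + "&" + piece)
        else:
            raise ValueError(f"singleton value not allowed: {piece!r}")
    return pairs

print(lenient_pairs("p1=R&D&p2=q"))  # [('p1', 'R&D'), ('p2', 'q')]
```

It breaks down as soon as a value contains both "&" and "=",
so escaping the "&" as %26 remains the safe choice.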

                       Bye, Frank

Received on Friday, 27 January 2006 11:12:31 UTC