
Re: [QUESTION 5] Are ";" and "=" harmful characters before the "?"

From: Youenn Fablet <youenn.fablet@crf.canon.fr>
Date: Fri, 23 Feb 2007 17:46:34 +0100
To: Jonathan Marsh <jonathan@wso2.com>
Cc: "'keith chapman'" <keithgchapman@gmail.com>, "'www-ws-desc'" <www-ws-desc@w3.org>
Message-id: <45DF1A6A.6030901@crf.canon.fr>

Jonathan Marsh wrote:
> below
> Jonathan Marsh - http://www.wso2.com - http://auburnmarshes.spaces.live.com
>> -----Original Message-----
>> From: Youenn Fablet [mailto:youenn.fablet@crf.canon.fr]
>> Sent: Friday, February 23, 2007 2:09 AM
>> To: Jonathan Marsh
>> Cc: 'keith chapman'; 'www-ws-desc'
>> Subject: Re: [QUESTION 5] Are ";" and "=" harmful characters before the
>> "?"
>> +1 for including '&' in the list.
>> Concerning ';' and '=', leaving them in the list would let the client
>> application decide whether to %-encode them or not.
>> My question is then:  would the following uris be equivalent or not in
>> the HTTP binding context?
>>     1) http://example.org/name;v=1.1
>>     2) http://example.org/name;v=1%2E1
>>     3) http://example.org/name%3Bv=1.1
>> 1 and 2 are clearly equivalent.
>> What about 1 and 3?
>> I would hope that they are also equivalent in the WSDL/HTTP binding
>> context.
>> According to section 2.2 of rfc3986, URIs that differ in the replacement of
>> a reserved character with its corresponding percent-encoded octet are
>> not equivalent. I would conclude that 1 and 3 are not equivalent.
> Yes.  Though if they aren't equivalent, I don't know what "let the client
> application decide" means - seems like an interop hit.  I infer that the
> client SHOULD NOT encode them if they result in a URI that isn't equivalent.
I was mixing two discussions here. 
When ';' and '=' are in the location template, the client SHOULD NOT 
encode them.
When ';' and '=' appear in the parameter value, we end up with potential 
interop issues.
It might then be better to at least suggest (SHOULD) what people should 
do about those.
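
The equivalence question for URIs 1 to 3 can be illustrated with a small sketch. It applies only the percent-encoding normalization from RFC 3986 (decode escapes of unreserved characters, leave escapes of reserved characters alone). The function name normalize_percent_encoding is a hypothetical helper, not something defined by the WSDL HTTP binding.

```python
import re

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def normalize_percent_encoding(uri: str) -> str:
    """Decode escapes of unreserved characters; keep all other escapes.

    Hypothetical helper: it covers percent-encoding normalization only,
    not case or path-segment normalization.
    """
    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Reserved (or other) characters stay encoded, with uppercase hex.
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, uri)


u1 = "http://example.org/name;v=1.1"
u2 = "http://example.org/name;v=1%2E1"
u3 = "http://example.org/name%3Bv=1.1"

# '.' is unreserved, so 2) normalizes to 1).
print(normalize_percent_encoding(u2) == normalize_percent_encoding(u1))  # True
# ';' is reserved, so 3) stays distinct from 1).
print(normalize_percent_encoding(u3) == normalize_percent_encoding(u1))  # False
```

Under this reading, a client that encodes ';' inside a parameter value and one that does not will send non-equivalent request URIs.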

>> I also note that ':' may be left unencoded as per the status quo.
>> What would happen in the following case?
>>     @whttp:location="{value}"
>>     value parameter = 'urn:example.org'.
>>     endpoint/@address="http://example.org/"
>> The templating mechanism will produce an absolute uri "urn:example.org".
>> The final request URI would then be "urn:example.org" while the
>> intention might be to have something like
>> "http://example.org/urn:example.org".
> Yes, this is one of the many ways one could hang themselves.  I suspect
> there are similarly unexpected cases that will arise if we escaped the :.
> I'm not convinced we can really help here.
If we encode ';', we would then have something like 
Or am I missing something?
>> The following case is also interesting:
>>     @whttp:location="{value}"
>>     value parameter = ':8080'.
>>     endpoint/@address="http://example.org"
>> The final request URI would then be "http://example.org:8080" which may
>> not be of practical use.
> Is :8080 a valid relative URI?  Even if so, how does encoding the ":" make
> it any more useful?
Yep, you are right, encoding ':' may not help with this.
Note that many edge cases come up with location templates that 
begin with a parameter value.
I think it would be good practice to begin the location template with 
a fixed character, typically '/', '?' or '#'.
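
The 'urn:example.org' case can be reproduced with Python's standard urljoin, which resolves a reference against a base URI. This is only a sketch: expand_location is a hypothetical stand-in for the templating step, and urljoin stands in for whatever resolution an actual implementation of the binding performs.

```python
from urllib.parse import urljoin


def expand_location(template: str, value: str) -> str:
    """Hypothetical templating step: substitute the raw value for {value}."""
    return template.replace("{value}", value)


address = "http://example.org/"
value = "urn:example.org"

# Template that begins with the parameter value: the expansion looks like
# an absolute URI, so resolution ignores the endpoint address entirely.
raw = urljoin(address, expand_location("{value}", value))
print(raw)  # urn:example.org

# Template that begins with '/': the expansion is a path, so it is
# resolved against the endpoint address.
anchored = urljoin(address, expand_location("/{value}", value))
print(anchored)  # http://example.org/urn:example.org
```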

>> The bad thing is that client applications that escape ':' and the ones
>> that do not escape ':' may come up
>> with very different request URIs. We may also run into edge cases with
>> '@', see section 7.6 of rfc3986.
>> Reading section 2.2 of rfc3986 and with the above edge cases,
>> I am currently thinking that it may be simpler and more interoperable to
>> state that all URI reserved characters that appear in encoded parameters
>> SHOULD be encoded.
>> Users that do not want to encode them should have good reasons not to do
>> that.
> Do you mean they SHOULD be pre-encoded in the XML?  Or that implementations
> SHOULD encode them but are conformant if they don't (which seems to be
> contrary to your point that these are interop issues.)  Why not just say
Your second interpretation is correct.
A MUST may be better in fact.
After all, the wsdl author has the choice between encoded and raw mode 
for parameter values.
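
As a rough sketch of the "encode all reserved characters in parameter values" rule, the standard quote function with an empty safe set escapes everything except the RFC 3986 unreserved characters. The names encode_param and the "/items/{value}" template are assumptions made for illustration; they are not taken from the specification.

```python
from urllib.parse import quote


def encode_param(value: str) -> str:
    """Percent-encode every character that is not unreserved.

    Hypothetical helper: safe="" removes the default exemption for '/',
    so all reserved characters (':', ';', '=', '&', '/', '@', ...) are escaped.
    """
    return quote(value, safe="")


def expand_location(template: str, value: str) -> str:
    """Hypothetical templating step using the encoded value."""
    return template.replace("{value}", encode_param(value))


print(encode_param("urn:example.org"))   # urn%3Aexample.org
print(encode_param("a;b=c&d"))           # a%3Bb%3Dc%26d
print(expand_location("/items/{value}", "v=1.1"))  # /items/v%3D1.1
```

With every client applying the same rule, the request URI for a given parameter value no longer depends on a per-client encoding choice.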

>> I would also recommend that we promote this in our test-suite,
>> especially in our message assertions.
>> Regards,
>>     Youenn
Received on Friday, 23 February 2007 16:46:58 GMT
