- From: Youenn Fablet <youenn.fablet@crf.canon.fr>
- Date: Mon, 22 Jan 2007 16:10:50 +0100
- To: Jonathan Marsh <jonathan@wso2.com>
- Cc: "'www-ws-desc'" <www-ws-desc@w3.org>
Jonathan Marsh wrote:
> I'm returning to this topic based on my AI to look at CR117 further. Youenn
> does a good job below of pointing out more of the potential issues. IMO
> this can be boiled down to two questions:
>
> 1) Do we allow the user the power to create URLs from data that either
> result in malformed URIs, non-reversible data, or both?
> 2) If we do, can we advise the user, or the WSDL processor, on how to bind
> the data safely?
>
> Youenn's suggestion below of providing a "safe mode" in which %-encoding is
> applied to the data before inserting it into a template is interesting. I
> can imagine this being exposed as a feature of the templating language
> directly:
>
> whttp:location="{raw}?more={%encoded}"
>
> Where the % directs the WSDL processor to encode the data (otherwise it's
> stuffed in raw). Actually, the reverse would probably be better - encode
> unless the user makes an effort to ask for the raw mode:
>
> whttp:location="{#raw}?more={encoded}"
>
> or something like that even though it's not as self-explanatory.
>
> We'd have to accompany this with some warnings to users that the raw mode
> must be used carefully, e.g. with appropriate schema types restricting the
> power for malformed URIs and the inability to generate server stubs in
> XML-centric implementations.
>
>
+1.
IMHO, the raw mode is a powerful feature that should be kept to advanced
users/scenarios.
Having a simpler encoded mode makes a lot of sense to me, especially if
it makes SOAP-Response usable.
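
A minimal Python sketch of the encode-by-default idea discussed above. The `{#name}` raw marker is only the tentative syntax floated in this thread, and `expand` is a hypothetical helper written for illustration, not something defined by WSDL.

```python
import re
from urllib.parse import quote

# Matches {name} (encoded) and {#name} (raw). The "#" marker is the
# tentative syntax from this thread, not an agreed notation.
TEMPLATE = re.compile(r"\{(#?)([^{}#]+)\}")

def expand(location: str, values: dict) -> str:
    """Replace each template by its value, %-encoding unless marked raw."""
    def substitute(match):
        raw, name = match.group(1), match.group(2)
        value = str(values[name])
        # safe="" leaves only letters, digits and "-._~" unescaped
        # (behaviour of urllib.parse.quote in Python 3.7 and later).
        return value if raw else quote(value, safe="")
    return TEMPLATE.sub(substitute, location)

print(expand("{#path}?more={text}", {"path": "a/b", "text": "x&y=z?"}))
# prints: a/b?more=x%26y%3Dz%3F
```

The raw value keeps its "/" and so can change the URL structure, while the encoded value cannot introduce a new "&", "=" or "?".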
> All that smells a bit too much like new features at the last minute to me!
>
It might be new features, although if we take the 80/20 bar, the encoded
mode might be more appealing than the raw mode,
at least in the SOAP world and even for simple HTTP services.
> And it only gets us part way, as it doesn't solve the problem of creating
> non-reversible templates like {x}{y}. That one's much harder. Simply
> preventing adjoining templates won't work - how does one deconstruct
> {first}-{last} if first='Jean-Jacques' and last='Moreau'? There needs to be
> a delimiter between each template that cannot appear in the data.
>
Exactly, and if we encode data, these delimiters are easy to select and
non-ambiguous locations are easy to assert.
> One can get pretty fancy and context-sensitive in figuring out which
> characters to escape, but a lowest-common-denominator approach seems
> workable: it allows any data to be encoded without harm, and IMO the
> loss of functionality is smaller than the potential for simple mistakes:
>
> 1) %-encoding each character in the XML except a-z, A-Z, 0-9, "-", ".", "_",
> "~". Per RFC3986 sec 2.4 this escaping is performed prior to insertion into
> the URL in place of the template.
> 2) Force templates to be separated by a character sequence containing at
> least one unescaped character not in the above set. A BNF for this seems
> possible though I failed in my simple attempt to create it...
>
I would not go as far as disallowing these ambiguous templates.
Triggering a warning would be sufficient.
> Thus these would be disallowed:
> {foo}{bar}
> {foo}-{bar}
> {foo}%20{bar}
>
> And these would be allowed:
> {foo}.xml
> {foo}/{bar}
> {foo}?{bar}
> /{foo}+{bar}/baz
> ?{foo},{bar}
> ?{foo}={bar}
> ?foo={foo}&bar={bar}
> ?foo={foo}-and-then-some&bar=more-than-{bar}
>
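
The two rules above can be sketched in Python as follows. This is an illustration under assumptions rather than a proposed normative check: `encode_value`, `separators` and `is_reversible` are hypothetical names, and the sketch ignores any escaping of literal braces in the template.

```python
import re
from urllib.parse import quote

# Characters left unescaped by rule 1.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
TEMPLATE = re.compile(r"\{([^{}]+)\}")
ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def encode_value(value: str) -> str:
    """Rule 1: %-encode everything except a-z, A-Z, 0-9, '-', '.', '_', '~'."""
    return quote(value, safe="")

def separators(location: str) -> list:
    """Return the literal text found between consecutive templates."""
    # With one capture group, split() yields [lit, name, lit, name, ..., lit].
    parts = TEMPLATE.split(location)
    return parts[2:-1:2]

def is_reversible(location: str) -> bool:
    """Rule 2: each separator needs an unescaped character outside the set."""
    for sep in separators(location):
        literal = ESCAPE.sub("", sep)  # %XX escapes do not count
        if not any(ch not in UNRESERVED for ch in literal):
            return False
    return True

for loc in ["{foo}{bar}", "{foo}-{bar}", "{foo}%20{bar}", "{foo}/{bar}",
            "?foo={foo}-and-then-some&bar=more-than-{bar}"]:
    print(loc, is_reversible(loc))
```

Under these assumptions the first three templates are reported as not reversible and the last two as reversible, matching the lists above. A processor could equally use the result to trigger a warning instead of rejecting the template.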
> Jonathan Marsh - http://www.wso2.com - http://auburnmarshes.spaces.live.com
>
>
>
>> -----Original Message-----
>> From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] On
>> Behalf Of Youenn Fablet
>> Sent: Friday, January 05, 2007 3:11 PM
>> To: www-ws-desc
>> Subject: Clarifications on CR117
>>
>>
>> After yesterday's discussion about CR117, I have the following
>> comments/clarifications/questions.
>> I hope this helps clarify the issue(s).
>>
>> 1) Question mark
>> a) Having a '?' in the values of the parameter may lead to issues: the
>> query string may begin earlier than intended:
>> examples:
>> whttp:location="Send/{title}/index?" with two parameters (title and
>> author) may lead to something like: Send/What?/index?author="unknown".
>> There might be applications that will be able to handle that but others
>> may not be able to correctly handle this...
>>
>> b) Note that client applications will need to check at runtime
>> whether the location and parameter values have a '?' in order to
>> correctly build the query string
>> Let's have whttp:location="/Send/{title}"
>> if title is "What" and author is "Unknown&Co", we would have:
>> /Send/What?author=Unknown&Co
>> if title is "What?ok" and author is "Unknown&Co", we would have:
>> /Send/What?ok&author=Unknown&Co
>> This might need to be clarified in the specification (cf. phillipe AI).
>>
>> Please note also the use of the "&" in this example. Other reserved
>> characters (#) may also have some impact. Hence the proposal at the end
>> of this message.
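
The runtime check described in 1b can be sketched in Python as below. `build_raw` is a hypothetical helper that copies values unchanged and appends parameters not cited in the template as query parameters; it is an illustration of the behaviour in the examples above, not the specification's algorithm.

```python
import re

TEMPLATE = re.compile(r"\{([^{}]+)\}")

def build_raw(location: str, params: dict) -> str:
    """Raw serialization: copy values as-is, append uncited ones as a query."""
    cited = set(TEMPLATE.findall(location))
    url = TEMPLATE.sub(lambda m: str(params[m.group(1)]), location)
    for name, value in params.items():
        if name in cited:
            continue
        # A "?" coming from the template or from a value means the
        # query string has already started, so "&" must be used instead.
        if "?" not in url:
            sep = "?"
        elif url.endswith(("?", "&")):
            sep = ""
        else:
            sep = "&"
        url += f"{sep}{name}={value}"
    return url

print(build_raw("/Send/{title}", {"title": "What", "author": "Unknown&Co"}))
# prints: /Send/What?author=Unknown&Co
print(build_raw("/Send/{title}", {"title": "What?ok", "author": "Unknown&Co"}))
# prints: /Send/What?ok&author=Unknown&Co
```

In the second call a receiver cannot tell that "ok" belonged to the title, and in both calls the "&" inside the author value looks like a parameter separator.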
>>
>> 2) URI escaping
>>
>> Characters from @address, @whttp:location or from parameter values may
>> need to be escaped before being put in the HTTP request.
>> Characters from @address and @whttp:location are escaped as there is a
>> mapping defined by their type xs:anyURI.
>> What should be done with characters from parameter values is not
>> specified IIRC.
>> We might want to clarify whether the escaping happens before or after
>> the replacement of the parameter name by its value.
>> If we have @whttp:location="Send%{int}" and int is "20", what we get
>> is either Send%20 or Send%2520.
>> Am I missing something?
>>
>> 3) Reversibility
>>
>> In some cases, the templating mechanism may be ambiguous.
>> This may be due to the templates: whttp:location="{country}{zipcode}"
>> may be ambiguous or not depending on the types of country and zipcode.
>> This may also be due to the use of special characters within parameter
>> values: whttp:location="" may be ambiguous if some parameter values use
>> '&' for instance.
>>
>> It makes perfect sense to allow the description of such non-reversible
>> URI construction.
>> It also makes sense IMHO to ensure the reversibility of the URI
>> construction, especially for SOAP.
>> While it is feasible to do this by constraining the type of the
>> parameters as Arthur suggested, I think it would be better to have a
>> more lightweight and general solution for wsdl users : binding simple
>> IRI-style-compliant structures to either SOAP-response or SOAP
>> request-response would be quite useful.
>>
>> One potential solution is to have two parameter value serialization modes:
>> - one straightforward that simply copies the parameter values
>> - another one that URL-encodes all URL reserved/special characters
>> (/,?,$,&,=,.)
>> In the SOAP case, the second serialization mode might be the preferred
>> one.
>> We could then add within the WSDL component model a property that tells
>> the WSDL processor how parameter values are handled for a particular
>> binding component.
>> The reversibility would then be ensured by the use of both the second
>> serialization mode and simple templating rules like:
>> - have an empty location value: all parameter values are encoded as
>> query parameters
>> - always put a '/' between parameter values
>>
>> What do you think?
>> Youenn
>>
>
>
>
>
Received on Monday, 22 January 2007 15:11:10 UTC