- From: Youenn Fablet <youenn.fablet@crf.canon.fr>
- Date: Mon, 22 Jan 2007 16:10:50 +0100
- To: Jonathan Marsh <jonathan@wso2.com>
- Cc: "'www-ws-desc'" <www-ws-desc@w3.org>
Jonathan Marsh wrote:
> I'm returning to this topic based on my AI to look at CR117 further. Youenn
> does a good job below of pointing out more of the potential issues. IMO
> this can be boiled down to two questions:
>
> 1) Do we allow the user the power to create URLs from data that either
> result in malformed URIs, non-reversible data, or both?
> 2) If we do, can we advise the user, or the WSDL processor, on how to bind
> the data safely?
>
> Youenn's suggestion below of providing a "safe mode" in which %-encoding is
> applied to the data before inserting it into a template is interesting. I
> can imagine this being exposed as a feature of the templating language
> directly:
>
>   whttp:location="{raw}?more={%encoded}"
>
> Where the % directs the WSDL processor to encode the data (otherwise it's
> stuffed in raw). Actually, the reverse would probably be better - encode
> unless the user makes an effort to ask for the raw mode:
>
>   whttp:location="{#raw}?more={encoded}"
>
> or something like that even though it's not as self-explanatory.
>
> We'd have to accompany this with some warnings to users that the raw mode
> must be used carefully, e.g. with appropriate schema types restricting the
> power for malformed URIs and the inability to generate server stubs in
> XML-centric implementations.
>

+1. IMHO, the raw mode is a powerful feature that should be kept to advanced
users/scenarios. Having a simpler encoded mode makes a lot of sense to me,
especially if it makes SOAP-Response usable.

> All that smells a bit too much like new features at the last minute to me!

It might be new features, although if we take the 80/20 bar, the encoded mode
might be more appealing than the raw mode, at least in the SOAP world and even
for simple HTTP services.

> And it only gets us part way, as it doesn't solve the problem of creating
> non-reversible templates like {x}{y}. That one's much harder.
> Simply preventing adjoining templates won't work - how does one deconstruct
> {first}-{last} if first='Jean-Jacques' and last='Moreau'? There needs to be
> a delimiter between each template that cannot appear in the data.

Exactly, and if we encode data, these delimiters are easy to select and
non-ambiguous locations are easy to assert.

> One can get pretty fancy and context-sensitive in figuring out which
> characters to appear, but a lowest common denominator of approach seems
> workable and allows any data to be encoded without harm, and represents
> IMO a loss of functionality lower than the potential for simple mistakes:
>
> 1) %-encoding each character in the XML except a-z, A-Z, 0-9, "-", ".", "_",
> "~". Per RFC3986 sec 2.4 this escaping is performed prior to insertion into
> the URL in place of the template.
> 2) Force templates to be separated by a character sequence containing at
> least one unescaped character not in the above set. A BNF for this seems
> possible though I failed in my simple attempt to create it...

I would not go as far as disallowing these ambiguous templates. Triggering a
warning would be sufficient.

> Thus these would be disallowed:
>   {foo}{bar}
>   {foo}-{bar}
>   {foo}%20{bar}
>
> And these would be allowed:
>   {foo}.xml
>   {foo}/{bar}
>   {foo}?{bar}
>   /{foo}+{bar}/baz
>   ?{foo},{bar}
>   ?{foo}={bar}
>   ?foo={foo}&bar={bar}
>   ?foo={foo}-and-then-some&bar=more-than-{bar}
>
> Jonathan Marsh - http://www.wso2.com - http://auburnmarshes.spaces.live.com
>
>> -----Original Message-----
>> From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] On
>> Behalf Of Youenn Fablet
>> Sent: Friday, January 05, 2007 3:11 PM
>> To: www-ws-desc
>> Subject: Clarifications on CR117
>>
>>
>> After yesterday's discussion about CR117, I have the following
>> comments/precisions/questions.
>> I hope this helps clarify the issue(s).
>>
>> 1) Question mark
>> a) Having a '?'
>> in the values of the parameter may lead to issues: the
>> query string may begin earlier than intended.
>> Example:
>> whttp:location="Send/{title}/index?" with two parameters (title and
>> author) may lead to something like: Send/What?/index?author="unknown".
>> There might be applications that will be able to handle that but others
>> may not be able to correctly handle this...
>>
>> b) To be noted that client applications will need to check at runtime
>> whether the location and parameter values have a '?' in order to
>> correctly build the query string.
>> Let's have whttp:location="/Send/{title}"
>> if title is "What" and author is "Unknown&Co", we would have:
>>   /Send/What?author=Unknown&Co
>> if title is "What?ok" and author is "Unknown&Co", we would have:
>>   /Send/What?ok&author=Unknown&Co
>> This might need to be clarified in the specification (cf. Phillipe's AI).
>>
>> Please note also the use of the "&" in this example. Other reserved
>> characters (#) may also have some impact. Hence the proposal at the end
>> of this message.
>>
>> 2) URI escaping
>>
>> Characters from @address, @whttp:location or from parameter values may
>> need to be escaped before being put in the HTTP request.
>> Characters from @address and @whttp:location are escaped as there is a
>> mapping defined by their type xs:anyURI.
>> What should be done with characters from parameter values is not
>> specified IIRC.
>> We might want to clarify whether the escaping happens before or after
>> the replacement of the parameter name by its value.
>> If we have @whttp:location="Send%{int}" and int is "20", what we get
>> is either Send%20 or Send%2520.
>> Am I missing something?
>>
>> 3) Reversibility
>>
>> In some cases, the templating mechanism may be ambiguous.
>> This may be due to the templates: whttp:location="{country}{zipcode}"
>> may be ambiguous or not depending on the types of country and zipcode.
>> This may also be due to the use of special characters within parameter
>> values: whttp:location="" may be ambiguous if some parameter values use
>> '&' for instance.
>>
>> It makes perfect sense to allow the description of such non-reversible
>> URI construction.
>> It also makes sense IMHO to ensure the reversibility of the URI
>> construction, especially for SOAP.
>> While it is feasible to do this by constraining the type of the
>> parameters as Arthur suggested, I think it would be better to have a
>> more lightweight and general solution for WSDL users: binding simple
>> IRI-style-compliant structures to either SOAP-response or SOAP
>> request-response would be quite useful.
>>
>> One potential solution is to have two parameter value serialization modes:
>> - a straightforward one that simply copies the parameter values
>> - another one that URL-encodes all URL reserved/special characters
>> (/,?,$,&,=,.)
>> In the SOAP case, the second serialization mode might be the preferred
>> one.
>> We could then add within the WSDL component model a property that tells
>> the WSDL processor how parameter values are handled for a particular
>> binding component.
>> The reversibility would then be ensured by the use of both the second
>> serialization mode and simple templating rules like:
>> - have an empty location value: all parameter values are encoded as
>> query parameters
>> - always put a '/' between parameter values
>>
>> What do you think?
>> Youenn
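
For illustration only, here is a minimal Python sketch of the two ideas discussed in this thread: the "encoded by default" expansion and the separator check between adjacent templates. It is not taken from the WSDL specification. The function names (expand, ambiguous_separators) are hypothetical, and the {#name} raw-mode marker is just the syntax floated above ("or something like that"), not an agreed design. The check returns the offending separators so that a processor could issue a warning rather than reject the template.

```python
import re
from urllib.parse import quote

# RFC 3986 unreserved characters: the set left unescaped in the proposal.
UNRESERVED = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~"
)
# Hypothetical template syntax: {name} is %-encoded, {#name} is inserted raw.
TEMPLATE = re.compile(r"\{(#?)([^{}]+)\}")
PCT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def expand(location, params):
    """Expand a location template, %-encoding values unless marked raw."""
    def substitute(match):
        is_raw = match.group(1) == "#"
        value = str(params[match.group(2)])
        # safe="" escapes everything except the unreserved characters.
        return value if is_raw else quote(value, safe="")
    return TEMPLATE.sub(substitute, location)


def ambiguous_separators(location):
    """Return the separators between consecutive templates that contain no
    unescaped character outside the unreserved set (candidates for a warning)."""
    matches = list(TEMPLATE.finditer(location))
    flagged = []
    for left, right in zip(matches, matches[1:]):
        separator = location[left.end():right.start()]
        # Characters written as %XX escapes do not count as delimiters.
        literal = PCT_ESCAPE.sub("", separator)
        if all(ch in UNRESERVED for ch in literal):
            flagged.append(separator)
    return flagged


if __name__ == "__main__":
    print(expand("/Send/{title}", {"title": "What?ok"}))
    print(ambiguous_separators("{first}-{last}"))
```

With encoding applied, a title of "What?ok" can no longer start the query string early, and a template such as {foo}/{bar} stays reversible because "/" cannot appear unescaped in an encoded value.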
Received on Monday, 22 January 2007 15:11:10 UTC