Re: [Uri-review] New posting of the "jms" URI scheme

On Tue, 21 Sep 2010 20:06:29 +0200, Bjoern Hoehrmann wrote:
> * Eric Johnson wrote:
>> We discussed this in the SOAP-JMS working group, and don't quite
>> understand your concern.
> 
> Say you have an SVG vector graphic that embeds a PNG bitmap graphic. That
> would look more or less like this:
> 
>   <svg ...>
>     <image xlink:href='
>                        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>                        ...' .../>
> 
> So you have a lot of spaces and line breaks. The XML specification
> defines that line breaks are turned into spaces and the XLink specification
> defines that spaces, being disallowed in proper resource identifiers,
> are percent-encoded, so you get

I wasn't aware that XLink mandated the transformation of spaces into 
www-url-encoded strings.

This particular sample works because the set of applications is known, 
and because the specified forms of encoding happen not to collide.  % 
is not in the base64 "alphabet".  If, instead of %20, the 
www-url-encoding used the (valid) alternative "+", the base64 encoded 
data would be corrupted (because the reverse transformation, '+' -> ' ', 
would remove information).
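
A minimal sketch of that collision, in Python rather than anything XLink 
itself specifies.  The three-byte payload is chosen only because its 
base64 form is "++++"; the variable names and the helper 
strip_whitespace are illustrative assumptions.

```python
import base64
from urllib.parse import quote, unquote, unquote_plus


def strip_whitespace(text: str) -> str:
    """Drop all whitespace, as a base64 consumer typically may."""
    return "".join(text.split())


# Hypothetical payload whose base64 form consists only of '+' characters.
payload = bytes([0xFB, 0xEF, 0xBE])
b64 = base64.b64encode(payload).decode("ascii")

# Simulate whitespace introduced by wrapping the attribute value.
wrapped = b64[:2] + " " + b64[2:]

# Percent-encode the space but leave the base64 alphabet alone.
escaped = quote(wrapped, safe="+/=")

# Reversing %20 -> ' ' loses nothing: '%' is not in the base64 alphabet.
good = base64.b64decode(strip_whitespace(unquote(escaped)))

# Reversing with the '+' -> ' ' convention eats the real '+' characters.
bad_text = strip_whitespace(unquote_plus(escaped))

print(escaped, good == payload, repr(bad_text))
```

With the %20 convention the payload survives; with the "+" convention 
nothing of the base64 text is left to decode.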

That is, at issue here is the encoding apparently mandated by XLink, 
not the data-scheme URL itself, which does not, I think, mandate how it 
is to be handled when encoded (the reference is to RFC 2045, part one 
of the MIME specification).  For that matter, if XLink had instead 
mandated the use of base64 or quoted-printable, the underlying base64 
representation would be corrupted (potentially ... at least by removing 
a required padding character that happens to precede a linefeed, at the 
end).  Unless there is a record of the transformations performed, in 
order, or an understanding of what they are, multiple transformations 
very well may fail to be reversed.  Only the application at the top 
level is likely to be able to disentangle this appropriately.
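
To illustrate why the record of transformations matters, here is a small 
Python sketch.  It assumes the only transformation is percent-encoding, 
applied some number of times; encode_layers and decode_layers are 
hypothetical helpers, not part of any specification.

```python
from urllib.parse import quote, unquote


def encode_layers(text: str, layers: int) -> str:
    """Percent-encode text `layers` times, escaping every reserved char."""
    for _ in range(layers):
        text = quote(text, safe="")
    return text


def decode_layers(text: str, layers: int) -> str:
    """Undo percent-encoding `layers` times."""
    for _ in range(layers):
        text = unquote(text)
    return text


# A value that legitimately contains the three characters '%', '4', '1'.
original = "a%41b"
wire = encode_layers(original, 2)

print(decode_layers(wire, 1))  # one layer too few: still escaped
print(decode_layers(wire, 2))  # correct count: the original comes back
print(decode_layers(wire, 3))  # one layer too many: '%41' collapses to 'A'
```

Nothing in the wire form says how many layers were applied, so only a 
party that knows the count can stop at the right point.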

Depending upon application, the JMS URI may be encoded in a number of 
ways, and potentially multiply encoded.  It seems to me that it is out 
of scope for us to mandate how other specifications use our URI.  No?  
It should be handled "as a URI".  In XML, that means the attribute type 
CDATA when a DTD defines it (in which case normalization will happen, 
and embedded \n will become spaces, and multiple spaces will become 
single spaces), and anyURI when W3C XML Schema datatypes are used 
(probably; there may also be whitespace facets applied).  If using XLink, one 
presumably is aware that www-url-encoding is going to happen or is 
required to be applied, and so can perform the reverse operation.
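
A rough Python sketch of that round trip, under simplifying assumptions: 
the attribute is a plain "href" with no namespace or DTD, the value is a 
made-up string rather than a real URI, and the percent-encoding step 
stands in for whatever the consuming specification actually requires.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote

# Attribute value wrapped across two lines in the source document.
doc = '<image href="abcd\nefgh"/>'

# The XML parser normalizes the embedded line break to a space.
value = ET.fromstring(doc).get("href")

# A consumer that requires escaped identifiers then percent-encodes it.
escaped = quote(value, safe="")

# An application aware of both steps can reverse them: unescape, then
# drop the whitespace that was never part of the data.
recovered = "".join(unquote(escaped).split())

print(repr(value), escaped, recovered)
```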

So ... isn't this a concern at a higher layer than the URI 
specification layer?  The SOAP/JMS specification, for instance, might 
have to deal with encoding issues, but it seems as though the URI spec 
itself cannot.

Amy!
-- 
Amelia A. Lewis
Senior Architect
TIBCO/Extensibility, Inc.
alewis@tibco.com

Received on Tuesday, 21 September 2010 18:50:08 UTC