[Bug 2249] R-257: Health warning needed about percent-escaping URIs

From: <bugzilla@wiggum.w3.org>
Date: Wed, 19 Sep 2007 23:50:11 +0000
To: www-xml-schema-comments@w3.org
Message-Id: <E1IY9J5-0001oU-Ed@wiggum.w3.org>


------- Comment #2 from cmsmcq@w3.org  2007-09-19 23:50 -------
It's not clear that XSDL 1.1 actually needs a health warning about
escaping IRIs / anyURI values; it prescribes the escaping algorithm
found in section 3.1 of RFC 3987 (http://www.ietf.org/rfc/rfc3987.txt)
and that algorithm is said by that spec to be idempotent:

   The above mapping from IRIs to URIs produces URIs fully conforming to
   [RFC3986].  The mapping is also an identity transformation for URIs
   and is idempotent;  applying the mapping a second time will not
   change anything. 

Unless I'm missing something, therefore, the premise of this issue
(which I understand to be that we should warn users not to escape
their anyURI values more than once) is false for 1.1, and no change
is needed in 1.1 on this account.
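
The idempotence claim can be checked with a small sketch. It is a simplification, not the full RFC 3987 section 3.1 procedure: it assumes the input is already a sequence of Unicode characters, percent-encodes the UTF-8 bytes of every non-ASCII character, and leaves all ASCII characters (including % and #) untouched. It ignores any special handling of the host component. The function name iri_to_uri and the example IRI are illustrative only.

```python
def iri_to_uri(iri: str) -> str:
    """Simplified IRI-to-URI mapping: percent-encode non-ASCII characters only.

    Assumption: every character above U+007F is treated as needing escaping;
    ASCII characters, including '%' and '#', pass through unchanged.
    """
    out = []
    for ch in iri:
        if ord(ch) > 0x7F:
            # Encode the character as UTF-8 and escape each byte as %HH.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical input that already contains an escaped percent sign and a fragment.
once = iri_to_uri("http://example.org/r\u00e9sum\u00e9?q=100%25#top")
twice = iri_to_uri(once)

print(once)
print(twice == once)
```

Because the output of the first pass is pure ASCII and the mapping never touches ASCII, the second pass has nothing left to change.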

On the other hand, as far as I can tell, the escaping algorithm 
specified by XSDL 1.0, borrowed from section 2.4 of XLink
is also idempotent.  So I don't see what the health warning is
supposed to be warning the users about, unless it's about the 
possible danger of confusion if they try to use anyURI values
which already contain percent signs or hashes which have not 
already been escaped.  (And if it is, then the chance of alleviating
the confusion in a note may be somewhat smaller than the chance
of baffling the reader.)  Note that neither # nor % may appear
in an RFC 2396 URI unescaped: only 'unreserved' characters may
appear unescaped according to RFC 2396.  RFC 2732 does not
change this rule, as far as I can tell.
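
For comparison, here is a rough sketch of the 1.0-style escaping next to a naive escaper that does escape the percent sign. The set of escaped ASCII characters reflects one reading of the XLink rule (control characters, space, and the RFC 2396 'delims' and 'unwise' characters, minus #, % and the square brackets); treat it as an assumption rather than a normative list. The function name xlink_escape and the sample value are made up for illustration.

```python
from urllib.parse import quote

# Assumed set of ASCII characters that the XLink-style rule escapes.
# '#', '%', '[' and ']' are deliberately absent.
_ESCAPED_ASCII = set('<>"{}|\\^` ')


def xlink_escape(value: str) -> str:
    """Escape disallowed characters, leaving '%' and '#' alone (assumed rule)."""
    out = []
    for ch in value:
        cp = ord(ch)
        if cp > 0x7F or cp < 0x20 or cp == 0x7F or ch in _ESCAPED_ASCII:
            # Escape each UTF-8 byte of the character as %HH.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


value = "my doc.xml#sec 1"  # hypothetical anyURI value

# Escaping that skips '%' and '#': a second pass changes nothing.
once = xlink_escape(value)
twice = xlink_escape(once)

# Naive escaping that also escapes '%' and '#': every pass changes the string.
naive_once = quote(value, safe="")
naive_twice = quote(naive_once, safe="")

print(once, twice)
print(naive_once, naive_twice)
```

The first escaper is stable under repetition precisely because the percent signs it introduces are not themselves candidates for escaping; the naive one re-escapes them on each pass, which is the hazard a warning about repeated escaping would normally address.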

If the WG really wants a health warning for the 1.0 spec, perhaps
something like this will do the trick:

  Note:  the escaping mechanism prescribed here does not
  escape the percent sign (%) or the number sign (#); as a
  result, escaping a string a second time will not change
  the string.  Nevertheless, the observations in RFC 2396 
  about safety and risks of escaping mechanisms remain true,
  and implementors should bear them in mind.

Speaking for myself, however, I incline to think that we should
close this bug both for 1.0 and for 1.1 with a resolution of
WORKSFORME, signaling that now (as opposed to two years ago)
we do not see any need for change, health warning, or 
clarification.  (Alternatively, someone could explain to me, again,
why there is a problem after all, all appearances to the
contrary notwithstanding.)  
Received on Wednesday, 19 September 2007 23:50:16 UTC
