RE: Comments on draft-fielding-uri-rfc2396bis-07

On Fri November 5 2004 13:00, Graham Klyne wrote:
> 
> I checked my code and test cases, and I clearly decided when reading the 
> spec that '@' did not need to be escaped by a general URI handler.  I take 
> my lead here from section 2.4, "When to Encode or Decode".
> 
> I think that is correct behaviour here:  in this respect, I think the 
> w3.org implementation you mention is arguably incorrect, but this is 
> probably a pretty harmless error, as all %-encoding would normally be 
> reversed after the URI path component has been extracted.

Yes; my code takes the conservative approach: it encodes
on generation and decodes on parsing.  Encoding seems to
be safe in general, since parsers are supposed to decode, and
I can't rule out some pathological URI parser that
might choke on an unencoded '@' (interpreting it as part of
an "authority" component).
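
A minimal sketch of that encode-on-generation, decode-on-parsing
round trip, in Python rather than the code discussed here.  The
function names are hypothetical, and using the standard library's
quote() with an empty safe set is an assumption made to illustrate
the conservative policy, not a description of either implementation:

```python
from urllib.parse import quote, unquote

def encode_component(text: str) -> str:
    # Conservative policy: escape everything except letters, digits
    # and "_.-~", so '@' is always written as %40.
    return quote(text, safe="")

def decode_component(text: str) -> str:
    # Reverse all %-escapes once the component has been extracted.
    return unquote(text)

encoded = encode_component("user@host")
print(encoded)
print(decode_component(encoded))
```

A parser that decodes after splitting into components gets the
original text back whether or not the generator escaped the '@'.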

> For generic URI handling, I have taken the approach of encoding only those 
> characters that are clearly required by the URI spec to be encoded.

I'm primarily concerned with mailto URI generation and parsing.
Parsing splits the URI into components using the regular
expression from RFC 2396 Appendix B verbatim (which happens to
be independent of '@' encoding).  Encoding takes into account
URI excluded characters, RFC 2396 component-specific
reserved characters, and mailto-specific reserved characters.
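
For illustration, the Appendix B expression can be used as follows.
The pattern is the one given in RFC 2396; the Python wrapper and the
names URI_RE, UriParts and split_uri are hypothetical, a sketch and
not the code described above:

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 2396 Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

class UriParts(NamedTuple):
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]

def split_uri(uri: str) -> UriParts:
    m = URI_RE.match(uri)
    # Every group is optional, so the pattern matches any string.
    assert m is not None
    # Group numbers follow Appendix B: 2, 4, 5, 7 and 9.
    return UriParts(m.group(2), m.group(4), m.group(5),
                    m.group(7), m.group(9))

print(split_uri("mailto:joe@example.com"))
print(split_uri("mailto:joe%40example.com"))
```

Since a mailto URI has no "//", the authority group is never
entered, and the address lands in the path whether the '@' is
written literally or as %40.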

> The software interfaces accordingly separate the character escaping logic 
> from the test used to decide which characters need to be escaped, and then 
> provides common test functions for generic URI components.

Mine uses several ctype-like functions that identify URI
"excluded" characters (no longer identified in the draft
under discussion), and characters which are reserved in the
various URI components (ditto).
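
A rough sketch of what such ctype-like predicates might look like,
with the escaping logic kept separate from the test as Graham
describes.  All names are hypothetical.  The character sets are my
reading of RFC 2396 section 2.4.3 (excluded) and section 3.3
(reserved within a path segment); treating non-ASCII characters as
excluded is an added assumption:

```python
from typing import Callable

def is_uri_excluded(ch: str) -> bool:
    # Controls, space and (by assumption) anything outside ASCII.
    code = ord(ch)
    if code <= 0x20 or code >= 0x7F:
        return True
    # RFC 2396 "delims" followed by "unwise".
    return ch in '<>#%"' or ch in "{}|\\^[]`"

def is_path_reserved(ch: str) -> bool:
    # Reserved within a path segment per RFC 2396 section 3.3.
    return ch in "/;=?"

def escape(text: str, needs_escape: Callable[[str], bool]) -> str:
    # Percent-encode the UTF-8 bytes of each flagged character.
    out = []
    for ch in text:
        if needs_escape(ch):
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)

def path_test(ch: str) -> bool:
    return is_uri_excluded(ch) or is_path_reserved(ch)

print(escape("a b@c", path_test))
```

Neither predicate flags '@', so a handler built this way leaves it
unescaped in a path; a mailto-specific test would be one more
predicate passed to the same escape function.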

Received on Friday, 5 November 2004 19:55:00 UTC