Re: "#" in IRI references

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 22 Oct 2002 07:23:18 +0200
To: John Cowan <jcowan@reutershealth.com>
Cc: www-international@w3.org
Message-ID: <3dc1d0a2.18643237@smtp.bjoern.hoehrmann.de>

* John Cowan wrote:
>>   ichar          = << allowed character of the UCS [ISO10646] >> |
>>                    space | idelims | unwise
>>   idelims        = "<" | ">" | "#" | "%" | <">

>It looks to me like "idelims" are things that should *not* appear in
>IRIs.  They have to be delimitable by something, and <> brackets and
>double quotes are appropriate.

Yes. However, the most popular way to delimit URI references in an
ambiguous context is white space, and I do not see any good reason to
allow unescaped spaces in IRI References. I really wonder how an
application should handle unescaped spaces in IRI References when it
has to process a white-space-separated list such as:

  xsi:schemaLocation = 'http://www.example.org/Report
                        http://www.example.org/Report Schema.xsd'

Sure, you can escape the space here to remove the ambiguity, but if you
have to escape the space anyway, you could just disallow unescaped
spaces and avoid such problems. The only reason given in the draft that
I can follow at least a little is 5.1.b.3, convenience, but convenience
could be achieved less harmfully through error recovery constraints:
disallow them, but require applications to accept and replace them.
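
To make the ambiguity concrete, here is a minimal Python sketch of what
a consumer of the schemaLocation value above runs into, and of what
"accept and replace" could look like. The function names
split_schema_location and recover_spaces are hypothetical, and replacing
each literal space with %20 is only an assumed recovery rule, not
something taken from the draft.

```python
def split_schema_location(value):
    """Split a white-space-separated list into (namespace, location) pairs."""
    tokens = value.split()  # splits on any run of white space
    if len(tokens) % 2 != 0:
        # An unescaped space inside a reference shows up as an extra token.
        raise ValueError("cannot pair up tokens: %r" % (tokens,))
    return list(zip(tokens[0::2], tokens[1::2]))


def recover_spaces(iri_reference):
    """Assumed recovery rule for ONE already-delimited reference:
    replace each literal space with its escaped form."""
    return iri_reference.replace(" ", "%20")


# The value from the example: the space in "Report Schema.xsd" is
# indistinguishable from the list separator.
ambiguous = ("http://www.example.org/Report\n"
             "    http://www.example.org/Report Schema.xsd")

# The same list with the space escaped before it is placed in the list.
escaped = ("http://www.example.org/Report\n"
           "    " + recover_spaces("http://www.example.org/Report Schema.xsd"))

if __name__ == "__main__":
    print(ambiguous.split())
    print(split_schema_location(escaped))
```

Note that recover_spaces only helps where the reference is delimited by
something other than white space, for instance an attribute that holds a
single reference. Inside a white-space-separated list the information is
already lost by the time the application sees the value.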
