Non-ASCII characters in namespace URIs

Hi,

   XML 1.0, HTML 4.01, XPointer, the Character Model for the World Wide
Web and many other documents require (or, in the case of HTML 4.01,
recommend) that processors apply a special encoding algorithm to URIs
containing non-ASCII characters. To summarize, a stable definition of
this algorithm could be

  => encode as UTF-8
  => NFC normalization
  => apply URI (%xx) encoding to each byte
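
A minimal sketch of these steps in Python, for illustration only. The
function name escape_non_ascii is hypothetical, and two choices are
assumptions rather than anything the cited documents spell out in this
form: NFC is applied to the character string before UTF-8 encoding
(normalization is defined on characters, not bytes), and only bytes of
non-ASCII characters are escaped, with uppercase hex digits.

```python
import unicodedata


def escape_non_ascii(uri: str) -> str:
    """Escape non-ASCII characters in a URI string (illustrative sketch)."""
    # Assumption: normalize to NFC first, while we still have characters.
    normalized = unicodedata.normalize("NFC", uri)
    parts = []
    for ch in normalized:
        if ord(ch) < 128:
            # ASCII characters are passed through untouched.
            parts.append(ch)
        else:
            # Encode the character as UTF-8 and %xx-escape each byte.
            parts.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(parts)


escaped = escape_non_ascii("http://björn.höhrmann.de/")
```

Under these assumptions, escaped is the same string as the second URI
in the question below, and a decomposed input (o followed by U+0308)
yields the same result because of the NFC step.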

Does this also apply to namespace URIs? For example, are

  http://björn.höhrmann.de/ and
  http://bj%C3%B6rn.h%C3%B6hrmann.de/

equal? What about the case of those hex digits? Or does the comparison
have to take place after UTF-8 encoding, and then character by character
or perhaps byte by byte? What about normalization? The recommendation
keeps mum. Maybe the errata should add some clarification here.
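
To make the question concrete, here is a small Python sketch of how
the answer changes with the comparison rule. The helpers escape and
upper_hex are hypothetical, and none of these rules is claimed to be
what the recommendation intends; they are just the candidates
mentioned above.

```python
import re
import unicodedata


def escape(uri: str) -> str:
    # Assumed algorithm: NFC, then UTF-8, then %XX for non-ASCII bytes.
    nfc = unicodedata.normalize("NFC", uri)
    return "".join(
        ch if ord(ch) < 128
        else "".join(f"%{b:02X}" for b in ch.encode("utf-8"))
        for ch in nfc
    )


def upper_hex(uri: str) -> str:
    # Fold the hex digits of every %xx escape to uppercase.
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), uri)


a = "http://björn.höhrmann.de/"
b = "http://bj%C3%B6rn.h%C3%B6hrmann.de/"
c = "http://bj%c3%b6rn.h%c3%b6hrmann.de/"  # same as b, lowercase hex

literal_equal = a == b                      # plain string comparison
escaped_equal = escape(a) == b              # compare after escaping
hex_case_equal = escape(a) == c             # escaping, hex case significant
hex_folded_equal = upper_hex(escape(a)) == upper_hex(c)  # hex case folded
```

A plain character-by-character comparison treats a and b as different
names, comparing after escaping treats them as the same, and the
lowercase variant c matches only if the hex digits are case-folded.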

(bcc to: xml-dev and www-international)

regards,
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/

Received on Sunday, 21 October 2001 16:30:36 UTC