- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Wed, 25 Jun 2008 01:30:52 +0200
- To: uri@w3.org
Anne van Kesteren wrote:

> It's also transmitted as another encoding than UTF-8
> (while the path component _is_ transmitted as UTF-8).

One of the best things about IRIs is that they are KISS: they use one and only one charset, the document charset, wherever they contain non-ASCII characters.

For document types permitting NCRs or similar entities, a reference means whatever it means in that document type, i.e. typically a Unicode code point or an *error* (e.g., using &uuml; in XML without a definition).

What %hh means depends on the server: it might be just percent-encoded UTF-8 as specified in RFC 3987, or any binary gibberish (e.g., in data: URIs), or legacy stuff for FTP servers on top of a legacy file system.

But an iso-8859-1 "ü" in an iso-8859-1 document is a "ü", also in *all* parts of an IRI, not only the path.

Frank
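To illustrate the point about %hh: the same "ü" yields different percent-escapes depending on which charset is used to turn it into bytes, and a receiver cannot tell from the escape alone which one was meant. The following is only a rough sketch using Python's standard urllib.parse; the helper names percent_encode and decodes_as_utf8 are made up for this example and do not come from any specification.

```python
from urllib.parse import quote, unquote, unquote_to_bytes


def percent_encode(text: str, charset: str) -> str:
    """Convert text to bytes in the given charset, then percent-encode them.

    Hypothetical helper for illustration only.
    """
    return quote(text, safe="", encoding=charset)


def decodes_as_utf8(escaped: str) -> bool:
    """Report whether the percent-escapes form a valid UTF-8 byte sequence."""
    try:
        unquote(escaped, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The same character, converted to bytes under two different charsets.
utf8_form = percent_encode("ü", "utf-8")         # the RFC 3987 mapping
latin1_form = percent_encode("ü", "iso-8859-1")  # a legacy mapping

# At the byte level, %hh is just an octet; its meaning is up to the server.
raw = unquote_to_bytes(latin1_form)

print(utf8_form, decodes_as_utf8(utf8_form))
print(latin1_form, decodes_as_utf8(latin1_form))
```

The iso-8859-1 escape is not valid UTF-8, so a server that assumes UTF-8 would have to reject it or guess, while a server that assumes iso-8859-1 would read it back as "ü".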
Received on Tuesday, 24 June 2008 23:29:52 UTC