- From: L. David Baron <dbaron@dbaron.org>
- Date: Tue, 13 Mar 2007 16:29:52 -0700
http://www.whatwg.org/specs/web-apps/current-work/#terminology says:

# For readability, the term URI is used to refer to both ASCII
# URIs and Unicode IRIs, as those terms are defined by [RFC3986]
# and [RFC3987] respectively. On the rare occasions where IRIs
# are not allowed but ASCII URIs are, this is called out
# explicitly.

This is rather misleading, since backwards-compatible use of URIs is not ASCII-only. While IRIs are a superset of conformant URIs, IRIs are a subset of real-world URIs, since they have the encoding fixed to UTF-8. Backwards-compatible URI handling tries to send the same sequence of bytes that was in the document back to the server, percent-encoded byte-by-byte, by encoding the URI based on the encoding of the document.

I tend to think that new uses of URIs/IRIs in the spec should state that they are really IRIs, and that this reverse-encoding behavior should therefore not be used; encoding should instead be done as UTF-8. The repeated language in the spec that something is a URI or IRI doesn't make sense -- it really does need to be one or the other.

(In Mozilla's codebase such distinctions are easy to implement, since we have to pass along the encoding of the document every time we create a URI in order to get this backwards-compatible behavior. Failing to do so makes the code use UTF-8, which means, I think, that it's an IRI. At least, it's easy to implement if the things that are URIs and the things that are IRIs go through the same codepath.)

It would probably be good if the spec documented how the encoding issues in URIs are actually handled.

(My understanding of this stuff may be a bit off, although this also isn't the clearest explanation I could make of what I do know about it...)

-David

-- 
L. David Baron <URL: http://dbaron.org/ >
Technical Lead, Layout & CSS, Mozilla Corporation
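To make the difference between the two behaviors described above concrete, here is a minimal sketch in Python. It is only an illustration, not Mozilla's implementation: the helper name percent_encode, the sample path, and the choice of ISO-8859-1 as the document encoding are all assumptions made for the example.

```python
from urllib.parse import quote


def percent_encode(text: str, document_encoding: str) -> str:
    """Hypothetical helper: turn text into bytes using the given
    encoding, then percent-encode every byte that is not URL-safe."""
    raw = text.encode(document_encoding)
    # Keep the URL delimiters used in the sample readable.
    return quote(raw, safe="/?=&:")


# Assumed sample: a link containing U+00E9 (e with acute accent).
link = "/search?q=caf\u00e9"

# Backwards-compatible URI handling: use the document's encoding,
# assumed here to be ISO-8859-1, so the server gets the original byte.
legacy = percent_encode(link, "iso-8859-1")

# IRI handling: the encoding is fixed to UTF-8.
iri = percent_encode(link, "utf-8")

print(legacy)  # /search?q=caf%E9
print(iri)     # /search?q=caf%C3%A9
```

The same characters in the document produce different bytes on the wire depending on which rule applies, which is why each use in the spec needs to be one or the other.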
Received on Tuesday, 13 March 2007 16:29:52 UTC