- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 11 Sep 2011 02:50:19 +0200
* Daniel Holbert wrote:
>In particular: when a "#" character is followed by ">" or "<" in a data
>URI, I propose that we *don't* treat the "#" as a delimiter, and instead
>just treat it as part of the encoded document.

Your proposal does not explain whether this applies to base64-encoded ones, whether the angle brackets have to occur literally or can also occur in their percent-encoded form, or how you handle multiple '#' characters as in data:...,...#...<...#example. You also don't say on which layer this would happen. Obviously, having this in the URI syntax specification, with an expectation that all parsing libraries would be updated to treat 'data' as a special case, is unlikely to go down well (the problems start with angle brackets being disallowed entirely).

If treating the part after the first "#" as the fragment identifier doesn't cause compatibility problems, as you seem to be suggesting, then that's great: explaining URI processing would be much simpler. We also do not have special rules for <http://example.com/search?q=#hashtag>, even though someone crafting such an address most likely means the "#" to be data. There are a number of implementations where the "#" is treated as data ('javascript' and 'mailto' come to mind), but there it's unreliable and not widely used, and, more importantly, it's all or nothing, not guesswork.

You have to escape all sorts of characters in 'data' URLs to make them work reliably. You have to escape spaces, for instance, in order to use them as part of a white-space separated list of URLs or other syntax that relies on URLs containing no spaces, and you have to escape '#'s so they work reliably right now and for however long the current pack of browsers will be around, even if you don't care about all the non-browser implementations that are unlikely to support this.
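
To illustrate the "first '#' wins" behaviour and the escaping it calls for, here is a minimal sketch using Python's urllib.parse as a stand-in for a generic URI parsing library. The HTML snippet and the variable names are made up for the illustration and are not taken from the proposal; other libraries may differ in details.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical document to embed; note the literal "#", "<", ">" and space.
html = '<a href="#top">x y</a>'

# Unescaped: a generic parser splits at the first "#", so the embedded
# document is cut in two and the tail ends up as the fragment.
raw = urlsplit("data:text/html," + html)
print(raw.path)       # text/html,<a href="
print(raw.fragment)   # top">x y</a>

# Several "#" characters: everything after the first one is the fragment.
multi = urlsplit("data:,a#b<c#example")
print(multi.fragment)  # b<c#example

# Escaped: quote(..., safe="") percent-encodes "#", "<", ">", quotes and
# spaces, so nothing in the payload is mistaken for a delimiter.
escaped = urlsplit("data:text/html," + quote(html, safe=""))
payload = unquote(escaped.path.split(",", 1)[1])
print(escaped.fragment == "", payload == html)  # True True
```

With the payload fully percent-encoded, the same URL also survives being placed in a white-space separated list, since it no longer contains a literal space.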

If there isn't very clear evidence that this is needed for reasons of compatibility, it seems preferable by far to have simpler rules that actually reflect how this stuff works everywhere than to have some magic rules that apply some of the time and that robust code cannot rely upon. I wouldn't mind such fixups in the address bar, as that is a user input "do what I mean" interface, but beyond that it just adds complexity for very little convenience in edge cases.
-- 
Björn Höhrmann · mailto:bjoern at hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Saturday, 10 September 2011 17:50:19 UTC