- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Mon, 30 Mar 2009 12:05:07 +0200
- To: Anne van Kesteren <annevk@opera.com>
- CC: "public-html@w3.org" <public-html@w3.org>
Anne van Kesteren wrote:
>> It sounds like this is an edge case, in that that encoding could
>> potentially contain decomposed characters, which would be mapped to a
>> sequence of decomposed Unicode characters when the mapping is done in
>> the most simple way.
>
> Yes, but because of that edge case the specification has this silly
> requirement which affects all non-Unicode encodings. Decomposed
> characters are easy to get using character escapes.

Indeed, I had forgotten about that. In that case, I think it would be best
for iri-bis to lift this requirement. It doesn't make sense for the same
character sequence to be handled differently depending on where it came
from.

>>> IRIs work for that encoding was important but making them work for
>>> HTML, CSS, etc. was not.)
>>
>> I'd say they work just fine; you just need to preprocess them.
>
> The preprocessing you need to do involves converting the input to a URI
> which seems highly suboptimal.

You could also preprocess to an IRI. Anyway, the actual processing is the
same, so what we're really discussing is simply how and where it's
defined.

>> And also, the work-in-progress revision of RFC 3987 already addresses
>> this (at least partly), by introducing LEIRIs
>> (<http://tools.ietf.org/html/draft-duerst-iri-bis-05#section-7>).
>
> LEIRIs are not a solution.

Please elaborate.

BR, Julian
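A minimal Python sketch of the decomposed-characters edge case discussed above (the example strings and the NFC normalization step are illustrative assumptions, not something either draft mandates):

```python
# Illustrative only: "é" can appear precomposed (U+00E9) or decomposed
# (U+0065 followed by combining U+0301). Percent-encoding the two forms
# as UTF-8 produces different URIs unless the input is normalized first.
import unicodedata
from urllib.parse import quote

precomposed = "caf\u00e9"    # "café" with precomposed U+00E9
decomposed = "cafe\u0301"    # "cafe" plus combining acute accent U+0301

# Without normalization, the percent-encoded forms differ:
print(quote(precomposed))    # caf%C3%A9
print(quote(decomposed))     # cafe%CC%81

# Normalizing to NFC first (an assumed preprocessing step, not a spec
# requirement) makes the two inputs encode identically:
normalized = unicodedata.normalize("NFC", decomposed)
print(quote(normalized))     # caf%C3%A9
```

This is why a character escape like `e&#x301;` can smuggle a decomposed sequence into input regardless of the document's own encoding: the normalization question arises at conversion time, not at decoding time.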
Received on Monday, 30 March 2009 10:05:49 UTC