- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 28 Jun 2008 12:16:42 +0200
- To: Ian Hickson <ian@hixie.ch>
- CC: Alexey Proskuryakov <ap@webkit.org>, Henri Sivonen <hsivonen@iki.fi>, HTML WG <public-html@w3.org>
Ian Hickson wrote:
> Actually, while this applies to forms (and WF2 mentions it), it doesn't
> seem to apply to regular links, where unencodable characters just get
> turned into question marks by IE and Opera. Safari and Mozilla each do
> their own thing (&-escape and use UTF-8 respectively) so I've gone with
> the more interoperable IE/Opera behaviour in the spec.

According to
<http://lists.w3.org/Archives/Public/public-html/2008Jun/0358.html>,
Safari 3 uses question marks.

> This causes minor dataloss (the author has to go out of his way to include
> these characters in the first place, and it's obvious in testing), but
> it's not as bad as data corruption (there's no way for the server to know
> on a byte-by-byte basis what encoding Mozilla's using) or data ambiguation
> (there's no way to know if the original in "?%26%239786%3B" was a smiley
> or the string "&#9786;", something which has affected me as a real user
> before when I've been typing in comments and searches for strings of that
> form, and had the server turn them into non-ASCII Unicode characters).

I would think that both data loss (IE/Safari/Opera) and what you call
"data corruption" (FF) are bad. As a matter of fact, the latter may be
less harmful, as servers can first try UTF-8, then the document encoding
(and I know some servers already do that).

On the other hand, documenting something that is clearly broken seems to
be the wrong approach to me, in particular as we have proof that there
currently isn't any reliable interoperability for this edge case.

It would be interesting to know how many pages out there contain
characters in the query parts of links that aren't part of the document
encoding. Only these would be broken if the saner FF approach were used
(and these pages may *already* be broken in FF as of today).

BR, Julian
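
For illustration only, here is a rough Python sketch of the three link behaviours discussed above and of the server-side fallback (try UTF-8 first, then the document encoding). The function names and strategy labels are hypothetical, and encoding the whole string as UTF-8 in the third case is a simplifying assumption; the thread does not pin down exactly how Mozilla mixes encodings.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_query(text, doc_encoding, strategy):
    """Percent-encode `text` for a link's query part.

    `strategy` is a hypothetical label for the behaviours in the thread:
      "question-mark" - unencodable characters become "?"
      "ncr"           - unencodable characters become "&#NNNN;" (&-escape)
      "utf-8"         - ignore the document encoding and use UTF-8
                        (assumption: applied to the whole string)
    """
    if strategy == "question-mark":
        raw = text.encode(doc_encoding, errors="replace")
    elif strategy == "ncr":
        raw = text.encode(doc_encoding, errors="xmlcharrefreplace")
    elif strategy == "utf-8":
        raw = text.encode("utf-8")
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return quote(raw, safe="")


def decode_query(encoded, doc_encoding):
    """Server-side guess: try UTF-8 first, then fall back to the document encoding."""
    raw = unquote_to_bytes(encoded)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(doc_encoding, errors="replace")


if __name__ == "__main__":
    smiley = "\u263a"  # WHITE SMILING FACE, not representable in ISO-8859-1
    for strategy in ("question-mark", "ncr", "utf-8"):
        print(strategy, encode_query(smiley, "iso-8859-1", strategy))
    # The "ncr" form cannot be told apart from the literal text "&#9786;":
    print(encode_query("&#9786;", "iso-8859-1", "ncr"))
    print(decode_query("%E2%98%BA", "iso-8859-1"))
```

With the "ncr" strategy, a smiley and the literal string "&#9786;" produce the same query bytes, which is the ambiguity described in the quoted text; with "question-mark" the character is simply lost.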
Received on Saturday, 28 June 2008 10:17:25 UTC