- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Fri, 15 Feb 2008 17:32:08 +0100
- To: ietf-http-wg@w3.org
Roy T. Fielding wrote:

> It looks like 1/2 of your response is about small changes to the
> text that is being deleted, and another 1/4 about the bits left
> after the last change, and only the last 1/4 about my proposed
> rewrite. That is really confusing.

Well, I was confused. I found your long "p3" text and, seeing that it went on and on, thought that was your proposal, so I commented inline as well as I could. About an hour later I arrived at the cut where you started your real/new proposal. After that I marked the first part as "p3", keeping ">|" as the quote indicator, and used ">:" for your real/new proposal.

Whatever "p3" actually is, it is what you wrote, and it took about 3/4 of my reply, yielding a complete proposal based on "p3". The last 1/4 was what you said above.

>> For 2616bis that should be no valid option (MAY), it should be
>> a *violation* of a new SHOULD for the stated historical reason.
>> Going from MAY to SHOULD NOT is possible, nothing breaks.

> That would change the protocol such that all currently compliant
> HTTP senders that transmit text messages in "iso-8859-1" without
> a charset parameter would be violating a SHOULD requirement.

Yes, that is the point of getting rid of "default Latin-1", which was quite popular in this long "unknown text/* subtypes" thread, and before. IIRC Martin proposed to get rid of it back in 2006; admittedly that was about 2617bis, and at that time I screamed. The HTML5 WG apparently wants to replace Latin-1 by windows-1252 to some degree. I didn't check the details, only what I saw when trying to figure out interesting points in the "HTML5 diff" draft.

> My proposal states the fact that such messages do occur in
> practice

Sure, but that does not make it any more desirable. As you explained earlier, it used to be a kind of hack 15 years ago, and one of the early RFCs already said "we'll do Unicode a.s.a.p.".

>>>: if the encoding can be determined within the first 16 octets
>>>: of data and interpreted consistently thereafter.
>> Please no arbitrary magic numbers like "16" in a standard, let
>> alone in a standard where the complete "sniffing" business is
>> off topic.

> It is more important that it works (or that we find out it
> doesn't).

IMO it is a design principle: stay away from magic numbers unless you really must have them, and then folks will ask "why 16 if 512 is a much better buffer size?" That could be ignored as a stupid question, unless it turns out that 512 really is better. HTTP in its role as "TP" doesn't need to sniff, so why specify it at all here? Whatever browsers do, it also has to work with other protocols (or other URI schemes, to cover the file: case).

>> (as you noted in another article) servers have no time for any
>> sniffing on their side for dynamical content. But that does
>> not justify a "variance" going as far as an option (MAY),
>> violating a SHOULD NOT is good enough for this historical case.

> Sorry, that decision was made in 1994 and is now way out of scope.

It's 2008 now. The old browsers choking on an explicit charset did HTTP/1.0 without a Host: header field; they are gone. I tried to use "IBM Webexplorer" a few times in this millennium, and they really are hopeless. A decision in 1994, before UTF-8 existed, is not necessarily good enough today. UTF-7 was published in 1994; it was cute, but we all agree to deprecate it somehow for today.

>> I don't see why 2616bis should try to overrule text/xml defaults
>> with a MAY, as HTTP certainly does not try to tell clients what
>> a say image/x-icon might be, and how to display it.

> Then you don't know (or don't care) what the MIME specs say.

I know that image/vnd.microsoft.icon is a registered MIME type, and I think that 2616bis doesn't need to talk about it.

>> Plausible reasons why servers might intentionally lie with
>> "iso-8859-1" do not belong in an Internet standard. If an UA is
>> broken it needs to be fixed.
>> Servers could also try their luck
>> with the registered "unknown-8bit" instead of lying, this is out
>> of scope for HTTP.

> Then get back to us when you have fixed that user agent.

If we are talking about IE6 among others, I get monthly fixes for security issues in this beast. If something in IE6 is so horribly wrong that it affects what 2616bis will say in 2009, I'd like to know precisely what it is.

What you propose boils down to this: "iso-8859-1" means "dunno", and that is required to disable some bug in IE versions (which bug is unclear to me). Likewise, no charset means "iso-8859-1", and that is required for old Mosaic derivatives back in 1994. And servers wishing to announce a real Latin-1 text can pick whichever they like better, but no client (apart from IE and Mosaic with their own problems) can actually believe them.

Do you see how *odd* that sounds when I extract it from your 2119 prose? Is "iso-8859-1" = "dunno" really what you want in 2616bis? If HTTP servers wish to use this hack, why can't they limit this effort depending on the User-Agent?

Frank
Received on Friday, 15 February 2008 16:30:52 UTC