- From: Robert Sayre <rsayre@mozilla.com>
- Date: Wed, 13 Feb 2008 17:01:04 -0500
- To: Roy T. Fielding <fielding@gbiv.com>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>, Julian Reschke <julian.reschke@gmx.de>, Geoffrey Sneddon <foolistbar@googlemail.com>, Mark Nottingham <mnot@mnot.net>
On Feb 12, 2008, at 4:12 PM, Roy T. Fielding wrote:
>
> the Web consists of dozens of different charsets,
> most of which are left unlabeled because there is no commonly accepted
> way of indicating charsets in filename metadata (and no real need to
> anyway, since user agents will either sniff the content anyway or just
> assume everything is in the fixed local charset known by the tool).

Fully agree.

> Servers, OTOH, send text/* content with the assumption that it will be
> treated as iso-8859-1 (or at least some safe superset of US-ASCII).

Somewhat disagree. I think many servers assume that UAs will sniff, and
deal with the issue for them.

> Servers don't sniff content because they can't -- it is impossible to
> look at every byte of a page while handling 7,000 reqs/sec, let alone
> the 20,000 reqs/sec that a decently tuned server can handle. In addition,
> some servers (particularly when serving dynamic content) will add a
> charset parameter to unlabeled text/html content based upon how they have
> been configured to scan for cross-site scripting. They do so specifically
> because of known bugs in browsers that sniff the content for bizarre
> charsets that bypass the resource's security assumptions and
> cause the browser's user to fall victim to stupid XSS attacks.

I know some cases of this attack, but I would appreciate more detailed
references on these if you have them.

> That allows HTTP/1.1 compliant serving today to remain compliant
> after the change, and addresses all of the interoperability issues
> in regard to mislabeled content without ignoring the fact that the
> main reason they are mislabeled today is to work around existing
> bugs. For all other cases, the charset can and should be labeled
> correctly.

I agree with your conclusion, but I'm fuzzy on the spec text it would
lead to. Have specific wording in mind?

- Rob
Received on Wednesday, 13 February 2008 22:01:25 UTC