- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 28 Feb 2014 16:27:06 +0000
- To: Henri Sivonen <hsivonen@hsivonen.fi>, "www-international@w3.org" <www-international@w3.org>
On 28/02/2014 15:03, Henri Sivonen wrote:
> As written, the Quick Answer is misleading if you only read that part
> and skip the Details. The Quick Answer says "If you have access to the
> server settings, you should also consider whether it makes sense to
> use the HTTP header." Instead, it should emphasize that HTTP overrides
> <meta>, so if you don't have access to the server settings and the
> server is sending a charset parameter in the Content-Type header, the
> Quick Answer won't work for you.

Good point. I added something to that effect.

> The document links to http://www.w3.org/International/O-HTTP-charset
> which doesn't cover nginx configuration. nginx behavior is worth
> mentioning, since nginx configuration is a bit surprising: You have to
> use the charset directive and can't use add_header, because the latter
> appends *another* Content-Type header and, therefore, must not be used
> to attempt to refine headers that nginx already adds by other means.
>
> Back to qa-html-encoding-declarations-new:
> The document says: "Intermediate servers that transcode the data (ie.
> convert to a different encoding) sometimes take advantage of this to
> change the encoding of a document before sending it on to small
> devices that only recognize a few encodings. Because the HTTP header
> information has precedence over any in-document declaration,
> transcoders typically do not change the internal encoding
> declarations, just the document encoding and the declaration in the
> HTTP headers."
>
> Is there documented proof that that's actually true?

I will look into this further. Certainly that text is very old.

> "User agents can easily find the character encoding information when
> it is sent in the HTTP header."
>
> I suggest saying that they find it sooner. Any non-bogus user agent
> has to be able to handle the level of difficulty of finding it in
> <meta>.

Done.

> I think the section "Working with polyglot and XML formats", if
> retained at all, should go under "Obscure details you should not need
> to know".

Noted.

> Please delete "It is possible to invent your own encoding names
> preceded by x-, but this is not usually a good idea since it limits
> interoperability." It has no relevance to authoring documents that
> will be viewed in Web browsers.

I put this in there because I have several times come across people who wanted to do this, and I want to tell them not to. I've reworded it as a strong prohibition.

> The section "The charset attribute on a link" fails to mention that if
> browsers supported the attribute (without special additional rules),
> it would be an XSS attack vector, which is a good reason not to
> support it.

Added.

> The document also links to
> http://www.w3.org/International/questions/qa-choosing-encodings .
> While that document correctly advises against the use of ISO-2022-*,
> HZ, etc., it fails to warn about interoperability problems between
> EUC-JP implementations on one hand and Big5 implementations on the
> other. I.e. authors are safer also avoiding EUC-JP and Big5 (including
> and especially Big5-HKSCS).

Yes, that's another article you'll see an update for soon. I plan to try and incorporate the information you put in another email recently.

Thanks for the comments.

RI
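
The precedence rule discussed in this message (a charset parameter in the HTTP Content-Type header overrides an in-document <meta> declaration) can be illustrated with a minimal Python sketch. It is an illustration only, not the browser algorithm: the function names `charset_from_content_type`, `charset_from_meta`, and `effective_encoding` are hypothetical, the <meta> scan is a simplified regular expression over the first 1024 bytes, and byte order marks and other sniffing steps are ignored.

```python
import re
from email.message import Message

# Simplified pattern: matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not header_value:
        return None
    msg = Message()
    msg["Content-Type"] = header_value
    charset = msg.get_param("charset")
    # get_param may return a tuple for RFC 2231 encoded values; ignore those here.
    return charset.lower() if isinstance(charset, str) else None


def charset_from_meta(html_bytes):
    """Return the lowercased charset declared in a <meta> element, or None."""
    match = _META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_encoding(content_type_header, html_bytes):
    """HTTP header wins; the <meta> declaration is consulted only as a fallback."""
    return (
        charset_from_content_type(content_type_header)
        or charset_from_meta(html_bytes)
    )


if __name__ == "__main__":
    page = b'<!DOCTYPE html><html><head><meta charset="utf-8"></head></html>'
    # Server sends a charset: the <meta> declaration in the page is ignored.
    print(effective_encoding("text/html; charset=ISO-8859-1", page))
    # Server sends no charset: the <meta> declaration applies.
    print(effective_encoding("text/html", page))
```

This is why an author without access to the server settings cannot rely on <meta> alone when the server already sends a charset parameter: in the first call the in-document declaration is never consulted.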
Received on Friday, 28 February 2014 16:27:35 UTC