- From: Koen Holtman <koen@win.tue.nl>
- Date: Sat, 2 Mar 1996 12:50:43 +0100 (MET)
- To: Nickolay Saukh <nms@nns.ru>
- Cc: koen@win.tue.nl, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Nickolay Saukh:
>
>4.2 Accept-Charset
>
>I think last sentence of first paragraph should be written as
>"The ISO-8859-1 character set can be assumed to be acceptable
>to all user agents.".

There was a long discussion about ISO-8859-1 versus US-ASCII recently, and I must admit that I did not read all messages in that discussion. My impression at the end was that most people wanted US-ASCII to stay as the character set which can be assumed to be acceptable to all user agents.

> Rationale: per HTTP/1.1 draft
>(section 3.7.1) entity body without explicit charset can be
>US-ASCII only or ISO-8859-1.

Yes.

>Thus any conforming user agent must
>be able to handle ISO-8859-1.

No, that is not a correct inference. It would make sense for every user agent to be able to handle all entity bodies without explicit charset, but Section 3.7.1 does not require it.

>4.6 Alternates
>
>Can media-type contain charset? Is this a valid example?
>
>Alternates: {"TheProject.fr.html" 1.0
>    {type "text/html"} {language "fr"}},
>    {"TheProject.en.html" 1.0
>    {type "text/html"} {language "en"}},
>    {"TheProject.ru.html" 1.0
>    {type "text/html;charset=iso-8859-5"} {language "ru"}}
>    ("/cgi-bin/xlate?koi8-r+TheProject.ru.html" 1.0
>    {type "text/html;charset=koi8-r"} language "ru"}}

Yes. Contrary to what Daniel DuBois said in this thread,

  {type "text/html;charset=iso-8859-5"}

is indeed the way to denote the charset. This mirrors the use of the Content-Type header, which specifies the MIME type and optionally the charset.

Note that we do not have a Content-Charset header, but that we _do_ have an Accept-Charset header. I believe that this asymmetry was caused by early versions of HTTP trying to inherit as much semantics as possible from the MIME specifications. As far as I know, it is too late to fix it now.

Also, contrary to what Daniel DuBois said,

>    ("/cgi-bin/xlate?koi8-r+TheProject.ru.html" 1.0
>    {type "text/html;charset=koi8-r"} language "ru"}}

is a legal alternate description.

What the anti-spoofing clause (the origin server restriction) in Section 5.2 of draft-holtman says is that origin servers may not return this alternate in a preemptive negotiation response. This means that, if this alternate is the best one, the origin server should send a reactive negotiation response, which causes the client to retrieve the best alternate with a direct request on /cgi-bin/xlate?koi8-r+TheProject.ru.html.

>5.1 Reactive negotiation
>
>If two alternates are differ by charset only, how
>specify preferred one?

The service author can specify the preferred one using the source quality factors in the Alternates header:

  {"notpreferred.html" 0.9 {type "text/html;charset=iso-8859-5"}}
  {"preferred.html" 1.0 {type "text/html;charset=koi8-r"}}

or by the order in which the alternates are listed:

  {"preferred.html" 1.0 {type "text/html;charset=koi8-r"}}
  {"notpreferred.html" 1.0 {type "text/html;charset=iso-8859-5"}}

So it is up to the service author to decide for you which of the charsets you accept would give you the best results. The decision made is reflected in the Alternates header.

You, as a user agent user, cannot express a preference for one charset over another; you can only say which ones you can handle. There are no quality factors in the Accept-Charset header. This means that the HTTP/1.1 draft spec assumes that if a user agent puts a charset in its Accept-Charset header, it can handle this charset perfectly, not just through some lossy on-the-fly filter. If anything lossy happens, it must be done at the server side, and be reflected in the Alternates header.

I don't know if this assumption of being able to handle perfectly all charsets included in the Accept-Charset header is correct for all current browsers. If it is not, we would have to decide if a) the current browsers need to be improved, or b) the draft spec needs to be extended.
I would go for a), though I realize that this puts browsers that don't use a bitmapped screen, like Lynx, in a difficult position.

Koen.
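
For illustration only, here is a minimal Python sketch of the charset-based choice described in the message above. It is not taken from draft-holtman: the names Alternate, charset_of and pick_alternate are hypothetical, only the charset and the source quality factor are considered (language and other attributes are ignored), and alternates without an explicit charset are simply skipped as a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Alternate:
    """One entry of an Alternates header (hypothetical, simplified model)."""
    uri: str
    source_quality: float
    media_type: str  # value of the {type ...} attribute, may carry ";charset="


def charset_of(media_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a media type, or None."""
    for param in media_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return None


def pick_alternate(alternates: List[Alternate],
                   accept_charset: Iterable[str]) -> Optional[Alternate]:
    """Pick the alternate the service author prefers among acceptable charsets.

    Accept-Charset is treated as a plain set, since it carries no quality
    factors. Assumption: alternates with no explicit charset are skipped.
    """
    accepted = {c.lower() for c in accept_charset}
    usable = [a for a in alternates if charset_of(a.media_type) in accepted]
    if not usable:
        return None
    # max() returns the first maximal element, so on equal source quality
    # the alternate listed earlier wins.
    return max(usable, key=lambda a: a.source_quality)
```

With the two listings from the message, a user agent accepting both iso-8859-5 and koi8-r would be handed "preferred.html" in either case: once because of the higher source quality factor, once because of list order. A user agent accepting only iso-8859-5 would get "notpreferred.html".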
Received on Saturday, 2 March 1996 03:54:21 UTC