- From: Chris Croome <chris@webarchitects.co.uk>
- Date: Tue, 9 Jul 2002 14:25:52 +0100
- To: www-international@w3.org
Hi

On Tue 09-Jul-2002 at 02:05:10 +0200, Chris Lilley wrote:
>
> For the Gujarati page
> http://www.laptopchallenge.org.uk/gu/
>
> some of the problems might be that the XHTML page is not well formed.

Oops, thanks for spotting that, it should be fixed now.

> I hypothesised that this would trigger IE to send it to the
> traditional HTML parser, which fails to realise that it is UTF-8 and
> displays it as Latin-1.

Really? I didn't realise that IE ignored the charset HTTP headers for
invalid XHTML documents :-(

> On the other hand the Punjabi page is well formed and valid
> http://www.laptopchallenge.org.uk/pa/
>
> and IE6 on WinXP still thinks it is Latin-1. Probably a meta element
> with a charset would tell IE what to use.
>
> Aha! Not serving the pages as latin-1 would also help:
>
> [clilley@tux]$ telnet www.laptopchallenge.org.uk 80
> Trying 195.10.230.121...
> Connected to www.laptopchallenge.org.uk.
> Escape character is '^]'.
> HEAD /pa/ HTTP/1.0
>
> HTTP/1.1 302 Found
> Date: Tue, 09 Jul 2002 12:02:58 GMT
> Server: Apache/1.3.26 (Unix) mod_perl/1.27 mod_gzip/1.3.19.1a
> Location: http://webarch.net/pa/
> Connection: close
> Content-Type: text/html; charset=iso-8859-1
>
> Connection closed by foreign host.
>
> [clilley@tux clilley]$ telnet webarch.net 80
> Trying 195.10.230.121...
> Connected to webarch.net.
> Escape character is '^]'.
> HEAD /pa/ HTTP/1.0
>
> HTTP/1.1 302 Found
> Date: Tue, 09 Jul 2002 12:04:59 GMT
> Server: Apache/1.3.26 (Unix) mod_perl/1.27 mod_gzip/1.3.19.1a
> Location: http://webarch.net/pa/
> Connection: close
> Content-Type: text/html; charset=iso-8859-1
>
> Connection closed by foreign host.

No, the page _is_ served as UTF-8. The problem you had above (I
_think_) is that you tried with HTTP 1.0 and no Host header, not
HTTP 1.1. There is not 1 IP address per domain name on that web
server, so without a Host header it cannot tell which site you want.
The iso-8859-1 page is an Apache-generated 302 document.
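The difference between the two kinds of request can be sketched in Python. This is only an illustration under stated assumptions: the function names (build_head_request, charset_of, fetch_head) are hypothetical, and it has not been run against this server. It builds a HEAD request with or without a Host header and pulls the charset out of whatever Content-Type comes back.

```python
import socket


def build_head_request(path, host=None, version="1.1"):
    """Return the bytes of a HEAD request; Host is sent only if given."""
    lines = [f"HEAD {path} HTTP/{version}"]
    if host is not None:
        # On a name-based virtual host this header selects the site.
        lines.append(f"Host: {host}")
    lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def charset_of(raw_response):
    """Extract the charset parameter of Content-Type, or None if absent."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            for param in value.split(";")[1:]:
                key, _, val = param.partition("=")
                if key.strip().lower() == "charset":
                    return val.strip().strip('"')
    return None


def fetch_head(server, request, port=80, timeout=10):
    """Send a prepared request and return the raw response bytes."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling charset_of(fetch_head(server, build_head_request("/pa/"))) mirrors the telnet session quoted above, and passing host="www.laptopchallenge.org.uk" mirrors a request that names the site. The request with no Host header should be the one that gets the server's default 302 page.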

Try with lynx:

[chris@snowball chris]$ lynx -head -dump http://www.laptopchallenge.org.uk/pa/
HTTP/1.1 200 OK
Date: Tue, 09 Jul 2002 13:19:04 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.27 mod_gzip/1.3.19.1a
Content-Language: pa
Last-Modified: Tue, 09 Jul 2002 13:19:04 GMT
Content-Length: 16188
Connection: close
Content-Type: text/html; charset=UTF-8

Actually you get the _whole_ document rather than just the HEAD, but
this is due to a problem with mod_perl's Apache::Registry handler.

Or try with Apache benchmark:

[chris@snowball chris]$ /usr/local/apache/bin/ab -v 4 http://www.laptopchallenge.org.uk/pa/
This is ApacheBench, Version 1.3d <$Revision: 1.59 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.laptopchallenge.org.uk (be patient)...INFO: POST header ==
---
GET /pa/ HTTP/1.0
User-Agent: ApacheBench/1.3d
Host: www.laptopchallenge.org.uk
Accept: */*

---
LOG: header received:
HTTP/1.1 200 OK
Date: Tue, 09 Jul 2002 13:21:44 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.27 mod_gzip/1.3.19.1a
Content-Language: pa
Last-Modified: Tue, 09 Jul 2002 13:19:04 GMT
Content-Length: 16188
Connection: close
Content-Type: text/html; charset=UTF-8

Or telnet with HTTP 1.1:

[chris@snowball chris]$ telnet www.laptopchallenge.org.uk 80
Trying 195.10.230.121...
Connected to www.laptopchallenge.org.uk.
Escape character is '^]'.
GET /pa/ HTTP/1.1
Host: www.laptopchallenge.org.uk

HTTP/1.1 200 OK
Date: Tue, 09 Jul 2002 13:24:03 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.27 mod_gzip/1.3.19.1a
Content-Language: pa
Last-Modified: Tue, 09 Jul 2002 13:19:04 GMT
Content-Length: 16188
Content-Type: text/html; charset=UTF-8

Chris

-- 
Chris Croome <chris@webarchitects.co.uk>
web design http://www.webarchitects.co.uk/
web content management http://mkdoc.com/
everything else http://chris.croome.net/
Received on Tuesday, 9 July 2002 09:25:54 UTC