- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sat, 24 Apr 2004 19:16:35 +0200
- To: olivier Thereaux <ot@w3.org>
- Cc: validators community <www-validator@w3.org>, Martin Duerst <duerst@w3.org>
* olivier Thereaux wrote:
>Typical test case: validating the validation output for a shift_jis
>encoded page (in my case, the google.co.jp homepage)
>
>Symptom: in its error output, the validator quotes part of the source
>for the validated page.
>I am far from being an expert on that part of the code, but it seems
>like a typical i18n problem.

Yes, this is documented in the source (for truncate_line):

  [...]
  # This *really* wants Perl 5.8.0 and it's improved UNICODE support.
  # Byte semantics are in effect on all length(), substr(), etc. calls,
  # so offsets will be wrong if there are multi-byte sequences prior to
  # the column where the error is detected.
  [...]

There are various means to tell Perl the result of iconv is UTF-8, see
`perldoc perlunicode`/"Porting code from perl-5.6.X".
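To illustrate the byte-versus-character offset problem described in that comment, here is a minimal sketch, assuming Perl 5.8 or later with the core Encode module. The sample string and variable names are made up for illustration and are not taken from the validator source; decode() is only one of the possible ways to mark the data as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the UTF-8 bytes of three Japanese characters
# followed by "<p>", held as a plain byte string (as a converter such
# as iconv might return them).
my $bytes = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E<p>";

# Byte semantics: every three-byte character counts as three units,
# so the offset of "<" is inflated.
my $byte_col = index($bytes, "<");

# After decoding, Perl treats the string as characters, and index(),
# length(), substr() etc. count characters instead of bytes.
my $chars    = decode("UTF-8", $bytes);
my $char_col = index($chars, "<");

printf "byte offset: %d, character offset: %d\n", $byte_col, $char_col;
```

With this input the byte offset of "<" is 9 while the character offset is 3, which is the kind of mismatch that would push a column marker to the wrong place.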
Received on Saturday, 24 April 2004 13:17:00 UTC