Re: [markup validator] source quoting i18n bug?

* olivier Thereaux wrote:
>Typical test case: validating the validation output for a shift_jis  
>encoded page (in my case, the google.co.jp homepage)
>
>Symptom: in its error output, the validator quotes part of the source  
>for the validated page.

>I am far from being an expert on that part of the code, but it seems  
>like a typical i18n problem.

Yes, this is documented in the source (for truncate_line):

[...]
  # This *really* wants Perl 5.8.0 and it's improved UNICODE support.
  # Byte semantics are in effect on all length(), substr(), etc. calls,
  # so offsets will be wrong if there are multi-byte sequences prior to
  # the column where the error is detected.
[...]

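There are several ways to tell Perl that the result of iconv is UTF-8; see
the "Porting code from perl-5.6.X" section of `perldoc perlunicode`. Once
the string is flagged as UTF-8, length() and substr() count characters
rather than bytes, so the column offsets line up again.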
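
To make the offset problem concrete, here is a minimal sketch. It is in
Python rather than the validator's Perl, and it is not the truncate_line
code: the sample line and the helper names quote_bytes and quote_chars are
made up for illustration. It only shows why a character column applied as a
byte offset lands in the wrong place, and why decoding first fixes it.

```python
# Hypothetical source line: multi-byte characters precede the error column.
line = "<p>日本語</p><b>oops"

# Error column as a parser would report it, counted in characters.
column = line.index("<b>")

# The same line as raw UTF-8 bytes, i.e. what byte-oriented code sees.
raw = line.encode("utf-8")


def quote_bytes(data: bytes, col: int, width: int = 3) -> bytes:
    """Byte semantics: the character column is (wrongly) used as a byte offset."""
    return data[col:col + width]


def quote_chars(data: bytes, col: int, width: int = 3) -> str:
    """Character semantics: decode to characters first, then slice."""
    return data.decode("utf-8")[col:col + width]


if __name__ == "__main__":
    # The byte slice starts in the middle of a multi-byte sequence.
    print(repr(quote_bytes(raw, column)))
    # The character slice starts exactly at the reported column.
    print(repr(quote_chars(raw, column)))
```

The three Japanese characters take nine bytes in UTF-8, so the byte-based
quote starts six bytes too early and cuts a character in half.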

Received on Saturday, 24 April 2004 13:17:00 UTC