
Re: [markup validator] source quoting i18n bug?

From: olivier Thereaux <ot@w3.org>
Date: Mon, 26 Apr 2004 17:40:19 +0900
Message-Id: <5998D50C-975D-11D8-8A09-000A95E54002@w3.org>
Cc: QA Dev <public-qa-dev@w3.org>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
[moving to qa-dev for semi gory details]

On Apr 25, 2004, at 2:16, Bjoern Hoehrmann wrote:
>> I am far from being an expert on that part of the code, but it seems
>> like a typical i18n problem.
>
> Yes, this is documented in the source (for truncate_line):
>
> [...]
>   # This *really* wants Perl 5.8.0 and it's improved UNICODE support.
>   # Byte semantics are in effect on all length(), substr(), etc. calls,
>   # so offsets will be wrong if there are multi-byte sequences prior to
>   # the column where the error is detected.
> [...]

So my guess was right; I really should have looked a bit further... 
Thanks for the confirmation, Bjoern.
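
To make the problem in that comment concrete, here is a minimal sketch. It is written in Python rather than Perl and is not the validator's code; the sample line and all variable names are made up for illustration. It assumes the reported column is counted in characters while the source line is held as raw UTF-8 bytes, which is the situation the comment describes.

```python
# Illustration only: a character column applied to raw UTF-8 bytes
# drifts as soon as a multi-byte sequence precedes the column.

line = "caf\u00e9 <b>oops"    # hypothetical source line; U+00E9 is 2 bytes in UTF-8
raw = line.encode("utf-8")    # what byte semantics effectively operate on

char_col = line.index("<")    # column counted in characters
byte_col = raw.index(b"<")    # column counted in bytes

# Slicing the bytes with the character column starts one byte too early,
# because the two-byte sequence before the column shifted every byte offset.
wrong = raw[char_col:]

# Decoding first and slicing the character string gives the intended position.
right = raw.decode("utf-8")[char_col:]

print(char_col, byte_col)
print(wrong)
print(right)
```

With pure ASCII input the two columns agree, which is presumably why the bug only shows up on pages with non-ASCII text before the error column.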

The good news is that the new machine that will soon be serving 
validator.w3.org has Perl 5.8, and so does our test server, so this 
should be a breeze to fix.

One thing I wonder, though...

check currently requires Perl 5.6 or later, and Perl 5.6 already has 
"use utf8" (even though it is superseded by 5.8's better support (TM) 
of UTF-8). Is there any reason we are not using it, and any reason we 
should not use it for users of Perl < 5.8?

-- 
olivier

Received on Monday, 26 April 2004 04:43:23 GMT
