Re: Text::Iconv 1.4, new Validator bundle?

* Martin Duerst wrote:
>>I was under the impression that we agreed that using Encode and
>>proper Perl Unicode features were not planned for 0.7.0 which will
>>be the next version of the Markup Validator.
>
>Who agreed? You suggested to use proper Perl Unicode, didn't you?

I also suggested that we release often. The 0.6.7 release is now three
months old, and it does not seem to me that the next version will be
released in October. Our main focus currently is stabilizing the code
in HEAD, which is the result of merging the improvements in the former
HEAD and 0.6.7, and fixing all the bugs so that it has at least the
level of quality that 0.6.7 had. After that we can see what comes next;
I would expect a Beta release to get broader review. I see switching to
Unicode internals now as making that more difficult.

>A lot of things would be better with a test suite. But I'm
>not ready to wait for one.

You don't have to wait! You can contribute to it and make it happen
sooner! Valuable contributions would be ideas, test documents,
documentation, source code for a test module and/or script, reports
on bugs in the current code, etc.

>>   % perl -MEncode -e "print decode 'utf-16be', qq(\x00\xf6)"
>>   Unknown encoding 'utf-16be' at -e line 1
>>
>>using the Encode.pm that ships with Perl 5.8.2 even though the
>>encoding would be supported if written as "UTF-16BE".
>
>Good to know. Does this apply to all encodings, or only to
>a few?

Only to a few as far as I can tell. A list of the encoding names
(including different spellings) that we currently support, and of those
we would support just by using Encode, Encode::Alias, and/or
I18N::Charset, would be very useful. Maybe that's something you can look
into?
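
A minimal sketch of how the Encode half of such a list could be
generated, assuming Encode::find_encoding is an acceptable test for
"supported". The candidate names and the resolve_names helper are
hypothetical and only for illustration; the real list would have to come
from the charset table the validator uses today, and the results will
differ between Encode versions and installed extension modules.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical candidates; replace with the names the validator accepts.
my @candidates = qw(
    utf-8 UTF-8 utf-16be UTF-16BE iso-8859-1 latin1 GB18030
    x-no-such-charset
);

# Map each name to the canonical name Encode resolves it to,
# or undef if this Encode installation does not know the name.
sub resolve_names {
    my @names = @_;
    my %result;
    for my $name (@names) {
        # find_encoding returns an encoding object, or undef if unknown.
        my $enc = Encode::find_encoding($name);
        $result{$name} = defined $enc ? $enc->name : undef;
    }
    return \%result;
}

my $resolved = resolve_names(@candidates);
for my $name (@candidates) {
    my $canonical = $resolved->{$name};
    printf "%-20s %s\n", $name,
        defined $canonical ? $canonical : '(not supported)';
}
```

Running this against both the Perl 5.8.2 Encode and a newer one would
show which spellings need an alias and which encodings are missing
altogether.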

>>and check which behavior we desire, and have tests so
>>that later changes do not introduce bugs. Iconv and Encode also do
>>not support the same set of character encodings, GB18030 for example
>>is supported by the current Markup Validator but not by the Encode
>>version that ships with Perl 5.8.2, we would first need to figure
>>out for which encodings we would need to drop support or find other
>>replacements.
>
>Or we would just (temporarily) drop those that are not supported.

That's an option, too. Maybe we should discuss this in one of our
upcoming meetings?

Received on Friday, 24 September 2004 18:11:00 UTC