
Re: patch for HTML::Encoding performance

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 09 Apr 2008 04:10:16 +0200
To: olivier Thereaux <ot@w3.org>
Message-Id: <4c9ov3lampiec5dbuel1b2nsehsse8dmti@hive.bjoern.hoehrmann.de>

* olivier Thereaux wrote:
> In a qa-dev thread about profiling, Ville found a few tricks to make
> HTML::Encoding much faster on some pathological cases. The thread
> includes patches. Would you look into them?
> http://lists.w3.org/Archives/Public/public-qa-dev/2008Apr/0007.html

Please feel free to forward this response to the list or archive it on
www-archive. I had a brief look at the patches and I am afraid I can't
use either. HTML::HeadParser seems to fail for HTML4 documents with an
object element in <head>, and it might change the behavior of the module
when multiple Content-Type meta elements are specified (though I didn't
check carefully). Looking just for the body start tag is likely the best
solution here, but the performance impact is rather low, so I have not
done this yet. The problem has always been noted in the source, though.
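
To illustrate the "look just for the body start tag" idea, here is a
minimal sketch in Python (not the module's Perl code). The names
declared_charsets, BODY_START and META_CHARSET are hypothetical, and the
regular expressions are a deliberate simplification of real HTML parsing.
It collects charset declarations from meta elements, but only from the
part of the document that precedes the first body start tag.

```python
import re

# Hypothetical, simplified patterns; a real HTML scanner must handle
# comments, scripts, and attribute quoting far more carefully.
BODY_START = re.compile(rb"<body[\s>/]", re.IGNORECASE)
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)

def declared_charsets(document):
    """Return charset names declared in meta elements before <body>."""
    match = BODY_START.search(document)
    # Stop scanning at the body start tag; if there is none, scan it all.
    head = document[:match.start()] if match else document
    return [m.group(1).decode("ascii") for m in META_CHARSET.finditer(head)]

sample = (
    b'<html><head>'
    b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    b'<object></object>'
    b'</head><body><meta charset=utf-8></body></html>'
)
charsets = declared_charsets(sample)
```

An object element in <head> does not end the scan here, and every
declaration before the body start tag is reported in document order, so
the caller decides how to treat multiple Content-Type meta elements.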

The decode-last-chunk-only.patch is incorrect: the purpose of decoding
this way is precisely to handle sequences that span two chunks
correctly, and forgetting the previous chunk would break that. The patch
should also have no effect, since $data is modified by Encode::decode,
but it seems the wrapper broke that. I've fixed that bug instead. A new
version of the module is on its way to CPAN.
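
The chunk-boundary problem can be sketched as follows. This is an
illustration in Python using the standard codecs module, not the
module's Perl code; the function names and the sample data are
assumptions made for the example. It compares decoding each chunk on its
own with decoding that carries an incomplete trailing sequence over into
the next chunk.

```python
import codecs

def decode_chunks_independently(chunks, encoding="utf-8"):
    """Decode every chunk alone, forgetting bytes from earlier chunks."""
    return "".join(c.decode(encoding, errors="replace") for c in chunks)

def decode_chunks_incrementally(chunks, encoding="utf-8"):
    """Decode chunks while keeping incomplete sequences between calls."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    parts = [decoder.decode(c) for c in chunks]
    # Flush the decoder; leftover bytes at this point are a real error.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

data = "Björn".encode("utf-8")      # the "ö" takes two bytes in UTF-8
chunks = [data[:3], data[3:]]       # split in the middle of the "ö"

broken = decode_chunks_independently(chunks)
intact = decode_chunks_incrementally(chunks)
```

With this split, the independent variant yields replacement characters
where the "ö" should be, while the incremental variant recovers the
original text.
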
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Wednesday, 9 April 2008 18:33:19 UTC
