Re: [bug?] fake dtd inside html comment being used from Terje Bless on 1999-11-07 (www-validator@w3.org from November 1999)

From: Terje Bless <link@tss.no>
Date: Sun, 7 Nov 1999 16:28:29 +0100
To: W3C Validator <www-validator@w3.org>
Message-Id: <199911080326.EAA17953@vals.intramed.rito.no>

On 06.11.99 at 21:22, Terje Bless <link@tss.no> wrote:

>while looking over this code I found some significant (read:
>*significant*) potential for performance improvement

OK, scratch that. After actually implementing this, performance testing
reveals that my "new and improved" code is actually dog-slow compared to
Gerald's original. I have some minor improvements that might be worthwhile
(mainly in terms of cleaner code and roughly 50% lower memory consumption), but
nothing that'll make any significant difference.

That's what I get for second-guessing Gerald. :-(

In case anyone is curious, I tested a "worst case" file (8K lines with no
way to guess the DOCTYPE) with 100 iterations of just the code in question.
I came up with these times:

Timing 100 iterations:
    Original:  28 wallclock secs (27.72 usr +  0.00 sys = 27.72 CPU)
    Alternate: 41 wallclock secs (40.97 usr +  0.00 sys = 40.97 CPU)
    Final:     24 wallclock secs (24.44 usr +  0.01 sys = 24.45 CPU)
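
For anyone wanting to run a similar comparison, here is a minimal sketch of
such a harness in Python. It is not the validator's code: the two scan
functions, the TAG pattern and the WORST_CASE input are made-up stand-ins,
chosen only to show the shape of "N iterations over a worst-case input".
Note that timeit reports elapsed wall-clock time rather than the usr/sys CPU
split shown above.

```python
import re
import timeit

# Hypothetical worst case: 8000 lines with nothing a DOCTYPE guesser could key on.
WORST_CASE = ["just some plain text with no markup"] * 8000

# Hypothetical pattern; the real guessing rules are not given in this thread.
TAG = re.compile(r"<(frameset|table|font)\b", re.IGNORECASE)


def scan_per_line(lines):
    """Stand-in candidate A: test each line separately."""
    return any(TAG.search(line) for line in lines)


def scan_joined(lines):
    """Stand-in candidate B: join the lines and search once."""
    return TAG.search("\n".join(lines)) is not None


def time_candidates(candidates, lines, iterations=100):
    """Return {name: total seconds} for `iterations` calls of each candidate."""
    return {
        name: timeit.timeit(lambda f=func: f(lines), number=iterations)
        for name, func in candidates.items()
    }


results = time_candidates(
    {"per_line": scan_per_line, "joined": scan_joined}, WORST_CASE
)
for name, secs in results.items():
    print(f"{name:>10}: {secs:.2f} s for 100 iterations")
```

Timing only the code in question, on an input that forces it to read every
line, keeps network and parser overhead out of the numbers.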

The "Original" is Gerald's original code, "Alternate" is my oh-so-clever
idea, and "Final" is my more conservative changes to Gerald's code. As you
can see, my first idea was nearly 50% slower and my final code is only
marginally faster (about 12%). I haven't profiled it for memory consumption yet,
but I should have reduced it by at least 50% so at least a little good came
of this exercise.
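
The relative differences can be checked from the CPU times reported above.
A short sketch of the arithmetic in Python (the variable and function names
are mine; only the three CPU figures come from the timings):

```python
# CPU seconds for 100 iterations, copied from the timing output above.
cpu_secs = {"Original": 27.72, "Alternate": 40.97, "Final": 24.45}

baseline = cpu_secs["Original"]


def relative_change(name):
    """Percent change in CPU time versus the original code (positive = slower)."""
    return (cpu_secs[name] - baseline) / baseline * 100


for name in ("Alternate", "Final"):
    print(f"{name}: {relative_change(name):+.1f}%")
# Alternate: +47.8%
# Final: -11.8%
```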

I'm still looking for opinions on whether the DOCTYPE guessing code can be
scrapped in favour of a DOCTYPE-override on the FORM interface tho'!

Received on Sunday, 7 November 1999 22:26:49 UTC