[Bug 14284] Need HTML parser algorithm options from bugzilla@jessica.w3.org on 2011-10-04 (public-html-bugzilla@w3.org from October 2011)

From: <bugzilla@jessica.w3.org>
Date: Tue, 04 Oct 2011 00:02:21 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1RAsSj-0007uc-6P@jessica.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=14284

--- Comment #4 from Ian 'Hixie' Hickson <ian@hixie.ch> 2011-10-04 00:02:19 UTC ---
http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/0023.html

As far as I can tell, what would be needed is to say that the "encoding
sniffing algorithm" should use exactly 1024 bytes (or all of them, if that's
less), stalling without exception until either an encoding is found or 1024
bytes are processed, bypassing the optional heuristics and always defaulting to
UTF-8.
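
For illustration, here is a rough Python sketch of the behaviour described above. It is not the spec's algorithm. The function name, the BOM table and the regular expression are hypothetical, and the regular expression is only a simplified stand-in for the real <meta> prescan. The sketch also always waits for the full 1024-byte window (or end of stream) rather than returning as soon as a declaration is seen.

```python
import re

PRESCAN_LIMIT = 1024  # byte budget named in the comment above

# Assumption: a deliberately simplified stand-in for the <meta> prescan.
# It does not handle comments, attribute edge cases, or label aliases.
_META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)

# Assumption: only these three byte order marks are recognised.
_BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16be"),
    (b"\xff\xfe", "utf-16le"),
)


def sniff_encoding_strict(chunks):
    """Pick an encoding from at most the first 1024 bytes of a byte stream.

    `chunks` is any iterable of bytes objects, e.g. network reads.
    """
    buf = b""
    for chunk in chunks:
        # "Stall" here: keep consuming input until the window is full
        # or the stream ends, whichever comes first.
        buf += chunk
        if len(buf) >= PRESCAN_LIMIT:
            break
    head = buf[:PRESCAN_LIMIT]

    for bom, label in _BOMS:
        if head.startswith(bom):
            return label

    match = _META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii").lower()

    # No optional heuristics, no locale-dependent fallback.
    return "utf-8"
```

Under these assumptions, a declaration that starts after byte 1024 is never seen, so `sniff_encoding_strict([b"x" * 1024 + b'<meta charset="koi8-r">'])` falls through to the UTF-8 default.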

I suppose I could add a flag to that algorithm to enable that, though it would
make it even more complicated, which isn't great.

I'm still not sure relying on this algorithm is a good idea at all, though.


Received on Tuesday, 4 October 2011 00:02:30 UTC