Editorial: "algorithm for extracting an encoding from a Content-Type" from Philip Taylor on 2008-03-05 (public-html@w3.org from March 2008)

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Wed, 05 Mar 2008 16:33:03 +0000
To: HTML WG <public-html@w3.org>
Message-ID: <47CECB3F.6@cam.ac.uk>

In the "algorithm for extracting an encoding from a Content-Type":

"Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020 characters 
that immediately follow the word equals sign (there might not be any)." 
- s/word// (i.e. it should just say "the equals sign")

"If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 
QUOTATION MARK ('"') in s: Return string between the two quotation 
marks." - This is unclear, since there can be several later quotation
marks (e.g. <meta content=';charset="utf-8"oops"'>); it should say
explicitly that the earliest one is meant. The same applies to the
apostrophe case.
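
To illustrate the reading I have in mind, here is a rough Python sketch
in which the closing quote is the earliest later one. The function name
extract_charset is hypothetical, and the handling of a missing "charset",
a missing equals sign, an unmatched quote, and an unquoted value are my
assumptions about the surrounding steps, not quotations from the spec.

```python
import re

# The characters the quoted step skips: U+0009..U+000D and U+0020.
WHITESPACE = "\t\n\x0b\x0c\r "


def extract_charset(s):
    """Hypothetical sketch: return the encoding name found in s, or None."""
    # Assumption: look for an ASCII case-insensitive "charset".
    match = re.search("charset", s, re.IGNORECASE | re.ASCII)
    if match is None:
        return None
    pos = match.end()

    # Skip whitespace before the equals sign (assumed step).
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Skip whitespace that immediately follows the equals sign.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1
    if pos >= len(s):
        return None

    quote = s[pos]
    if quote in "\"'":
        # str.find returns the EARLIEST later occurrence of the same
        # quote character, which is the point of the comment above.
        end = s.find(quote, pos + 1)
        if end == -1:
            return None  # assumption: an unmatched quote yields nothing
        return s[pos + 1:end]

    # Assumption: an unquoted value runs to whitespace, ';', or the end.
    end = pos
    while end < len(s) and s[end] not in WHITESPACE + ";":
        end += 1
    return s[pos:end]
```

With this reading, the content value from the example above,
;charset="utf-8"oops", gives utf-8 rather than utf-8"oops.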

-- 
Philip Taylor
pjt47@cam.ac.uk

Received on Wednesday, 5 March 2008 16:33:30 UTC