- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Thu, 22 Nov 2007 14:07:00 +0200
- To: "public-html@w3.org Tracking WG" <public-html@w3.org>
Currently, the accept-charset attribute on <form> is defined by reference to HTML4. What should conformance checkers do?

Is leading and trailing whitespace allowed? Leading or trailing commas hopefully aren't.

The separators for the charset names must be at least one character long and contain zero or more space characters and at most one comma, right?

Charset names should presumably match the mime-charset production from RFC 2978. Should the names also be checked against the IANA list of encodings? Or a shorter list of encodings that actually work? Are non-preferred IANA names errors?

Opinion comment: Allowing a comma in the separator is a design bug, IMO, considering that the general design pattern for separating spaceless tokens is to separate with whitespace. It is probably not worthwhile to make the commas non-conforming, though.

Also, considering that UTF-8 ends the need to keep the list of character encodings extensible, I think it would make sense to define a closed list of known-to-work legacy encodings (plus UTF-8) to check against.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
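
For illustration only, the syntax guessed at above could be sketched as a small checker. This assumes the strictest reading of the open questions: no leading or trailing whitespace or commas, separators of one or more characters made of space characters with at most one comma, and names matching mime-charset from RFC 2978. The function name and the choice of space characters (space, tab, LF, FF, CR) are assumptions, not anything a spec defines.

```python
import re

# mime-charset-chars from RFC 2978, section 2.3.
NAME = r"[A-Za-z0-9!#$%&'+\-^_`{}~]+"

# Assumed set of space characters: space, tab, LF, FF, CR.
WS = r"[ \t\n\f\r]"

# A separator is at least one character long and contains
# zero or more space characters and at most one comma.
SEP = rf"(?:{WS}+(?:,{WS}*)?|,{WS}*)"

# One name, then any number of (separator, name) pairs.
# Leading/trailing whitespace and commas are rejected (assumption).
ACCEPT_CHARSET = re.compile(rf"{NAME}(?:{SEP}{NAME})*")


def is_valid_accept_charset(value: str) -> bool:
    """Hypothetical syntax-only check; does not consult the IANA registry."""
    return ACCEPT_CHARSET.fullmatch(value) is not None
```

Under these assumptions, "utf-8, iso-8859-1" and "utf-8 iso-8859-1" pass, while "utf-8,,iso-8859-1", " utf-8" and "utf-8," do not. Checking names against the IANA registry or a closed list would be a separate step on top of this.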
Received on Thursday, 22 November 2007 12:07:41 UTC