- From: Nick Kew <nick@webthing.com>
- Date: Thu, 26 Jul 2001 09:08:23 +0100 (BST)
- To: Terje Bless <link@pobox.com>
- cc: W3C Validator <www-validator@w3.org>
On Thu, 26 Jul 2001, Terje Bless wrote:

> On 25.07.01 at 14:03, Lloyd Wood <l.wood@eim.surrey.ac.uk> wrote:
>
> >I've always wondered how you define the charset for the line that defines
> >the charset so that you can interpret it.
>
> For the HTTP header fields it's fairly simple; they're US-ASCII period. For
> that bogosity called "META" the waters are substantially more muddy.
> Especially since there aren't any clear rules for whether the charset in
> the META element overrides the one in the HTTP header... Or vice versa...

Surely that at least is clear: HTTP headers take precedence over
<META bogus="hot-air">? Can't cite references OTTOMH (and no time to go
looking just now), but ...

> Or what this means for the case when the charset in the HTTP header is
> there by inference (as a default, not explicitly)...

But *ML rules don't apply to HTTP, so whence the conclusion that *anything*
is implicit (as opposed to absent) in the headers? Sure, if we take the
whole thing (*TP transmission + *ML document), then we can start to talk
about undeclared charsets being implicit.

> "I'm sorry, but that Document Type is not in my Catalog. I cannot Validate
> this document"

We are happy with SYSTEM FPIs. It's the No FPI case (or FPIs which are not
accessible to the validator) you need to complain about.

> and "I'm sorry, but that Character Encoding is not in my
> database. I cannot Validate this document."

Hmmm ... Would it not be fair to say US-ASCII is a subset of every other
encoding that might be considered as a default (certainly iso-8859-1 and
utf-8), so that a document that validates to it should always be fine?

> or "I'm sorry, but I was unable
> to determine the Character Encoding based on available information. Please
> make your Character Encoding explicit in the HTTP headers".

Except if HTTP happens to be FTP or file upload, and there is no header...
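
As a rough sketch of the precedence argued for above (an explicit charset in the HTTP header wins, META is consulted only when the header gives none, and with neither there is nothing to go on), something like the following Python would do. The function names, the regular expressions and the 1024-byte scan window are all assumptions made for illustration; none of this is taken from the validator's code. The last helper checks the US-ASCII point: a body made only of 7-bit bytes decodes to the same text as US-ASCII, iso-8859-1 or utf-8.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration only; not the validator's actual code.
_CHARSET_PARAM = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_TAG = re.compile(rb'<meta\b[^>]*>', re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the explicit charset parameter of a Content-Type value, if any."""
    if not content_type:
        return None  # FTP or file upload: there is no header at all
    match = _CHARSET_PARAM.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(body: bytes) -> Optional[str]:
    """Scan the start of the document for a charset inside a META tag."""
    # Assumption: the META tag itself is written in ASCII-compatible bytes
    # and appears within the first 1024 bytes.
    for tag in _META_TAG.findall(body[:1024]):
        match = _CHARSET_PARAM.search(tag.decode("ascii", errors="replace"))
        if match:
            return match.group(1).lower()
    return None


def resolve_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """HTTP header first, META second, otherwise None (nothing declared)."""
    return charset_from_header(content_type) or charset_from_meta(body)


def decodes_same_everywhere(body: bytes) -> bool:
    """True if the body is pure 7-bit, so the choice of default cannot matter."""
    try:
        text = body.decode("ascii")
    except UnicodeDecodeError:
        return False
    return text == body.decode("iso-8859-1") == body.decode("utf-8")
```

When `resolve_charset` returns `None`, a caller could either abort with the "unable to determine the Character Encoding" message, or carry on anyway if `decodes_same_everywhere` holds for the document.
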

> To "assume nothing" in this context means that if we cannot get a clear,
> unambiguous, indication, we abort instead of guessing or, in this case,
> instead of interpreting the internally inconsistent specifications (that's
> the HTML-WG's job ;D).

Have you never had to do someone else's job because they made too much of a
hash of it? Not that I'm saying that's relevant here, but in general.

--
Nick Kew
Received on Thursday, 26 July 2001 04:08:37 UTC