- From: Terje Bless <link@pobox.com>
- Date: Thu, 26 Jul 2001 11:36:12 +0200
- To: Nick Kew <nick@webthing.com>
- cc: W3C Validator <www-validator@w3.org>
On 26.07.01 at 09:08, Nick Kew <nick@webthing.com> wrote:

>Surely that at least is clear: [HTTP] takes precedence over [META]?

Nope. HTTP 1.1 doesn't mention META, and HTML just sez it's supposed to be
read by _servers_ to initialize the HTTP header... :-(

>>Or what this means for the case when the charset in the HTTP header is
>>there by inference (as a default, not explicitly)...
>
>But *ML rules don't apply to HTTP, so whence the conclusion that
>*anything* is implicit (as opposed to absent) in the headers?

The lack of a "charset" parameter on the HTTP 1.1 "Content-Type" header
field means that you should assume it is there with a value of "ISO-8859-1"
according to the HTTP 1.1 RFC. HTML doesn't specify a default (it actually
discourages assuming one).

But if HTTP overrides META, and the HTTP charset is only there by default,
does HTTP's default still override an explicitly inserted META? That is, if
the META sez EUC-JP and HTTP implicitly defines ISO-8859-1 (by being
absent), does that really mean that we should use ISO-8859-1 (which the
user obviously does _not_ want) over EUC-JP (which s/he _does_ want)?

>>"I'm sorry, but that Document Type is not in my Catalog. I cannot
>>Validate this document"
>
>We are happy with SYSTEM FPIs. It's the No FPI case (or FPIs which
>are not accessible to the validator) you need to complain about.

A DOCTYPE Declaration referencing an External Subset by a Formal Public
Identifier that is not in our Catalog, and without a System Identifier,
should generate the error message above.

An Internal Subset, an External Subset referenced by an FPI that is in our
Catalog, or an External Subset giving a resolvable System Identifier, should
all generate normal validation results.

>>and "I'm sorry, but that Character Encoding is not in my
>>database. I cannot Validate this document."
>
>Would it not be fair to say US-ASCII is a subset of every other encoding
>that might be considered as a default (certainly iso-8859-1 and utf-8)?
>so that a document that validates to it should always be fine?

This is again very much Western thinking. US-ASCII is a subset only of
common Western encodings. This means the answer to your question depends on
whether you accept the validity of these "defaulted" charset parameters.

I must admit to being both uncertain and ambivalent on this issue.

>>or "I'm sorry, but I was unable to determine the Character Encoding based
>>on available information. Please make your Character Encoding explicit in
>>the HTTP headers".
>
>Except if HTTP happens to be FTP or file upload, and there is no header...

Or a fragment pasted into the form (not finished yet)... Or...

It must be dealt with, but these are sufficiently fringe cases that we can
add exceptions for those. I think... :-)

What does Site Valet do?
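
To make the open question concrete, here is a rough sketch of the decision in Python. It is not the validator's code. The function names, the `http_default_wins` switch and the returned source labels are all made up for illustration, and it simply assumes that an explicit HTTP charset beats META, which is the very point still under discussion above.

```python
from typing import Optional, Tuple

# Default that applies when "charset" is absent, per the HTTP 1.1 RFC
# as described in the message above.
HTTP_DEFAULT_CHARSET = "ISO-8859-1"


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Return the explicit charset parameter of a Content-Type value, or None."""
    if not header:
        return None
    # Everything after the first ";" is a parameter such as "charset=UTF-8".
    for param in header.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"')
    return None


def resolve_charset(
    content_type: Optional[str],
    meta_charset: Optional[str],
    http_default_wins: bool = False,  # hypothetical policy switch
) -> Tuple[Optional[str], str]:
    """Pick a charset and say where it came from.

    content_type is None when there is no HTTP header at all
    (FTP, file upload, a fragment pasted into the form).
    """
    explicit = charset_from_content_type(content_type)
    if explicit:
        # Assumption: an explicit HTTP charset takes precedence over META.
        return explicit, "http-explicit"
    have_http = content_type is not None
    if have_http and http_default_wins:
        # Strict reading: the implied HTTP default overrides an explicit META.
        return HTTP_DEFAULT_CHARSET, "http-default"
    if meta_charset:
        return meta_charset, "meta"
    if have_http:
        return HTTP_DEFAULT_CHARSET, "http-default"
    # Nothing to go on: this is the "unable to determine" error case.
    return None, "undetermined"
```

With `http_default_wins` left off, the EUC-JP case comes out the way the user presumably wants; switching it on gives the strict reading in which the implied ISO-8859-1 wins. A result of `(None, "undetermined")` corresponds to the third error message quoted above.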
Received on Thursday, 26 July 2001 05:48:10 UTC