- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Tue, 8 May 2007 22:17:25 +0300 (EEST)
- To: www-validator@w3.org
On Tue, 8 May 2007, Andreas Prilop wrote:

> I specifically mean (HTML 4) documents with only US-ASCII characters.

They, too, should have their encoding declared. I can't really say "must" instead of "should", since the specifications are vague, but this also means that the meaning of a sequence of octets is formally left unspecified if it is purported to be HTML 4 but does not have its encoding declared (in an HTTP header or in a meta tag or, nominally, in a charset parameter of a referring link).

In principle, its validity is undecidable since we don't even know how to interpret the octets. In practice, of course, browsers will do what you want and infer US-ASCII or some 8-bit encoding that contains US-ASCII as its subset. In principle, they could do otherwise; maybe even some browser running in an EBCDIC environment does that - for local documents.

> If I'm not mistaken, it is still correct to send e-mail in US-ASCII
> without any MIME header and charset declaration.

Yes, because that's specified in the e-mail protocol.

> How is that with HTML 4?

Not specified. If you are thinking about a non-MIME e-mail message containing an HTML document, then I'm afraid we must formally treat the content as plain text, since that's what e-mail messages are by default.

Is there some practical problem behind your question?

--
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
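
Illustrative note (not part of the original message): the point about undeclared encodings can be sketched in a few lines of Python. The fragment below is only a rough demonstration under stated assumptions: the sample markup, the selection of encodings, and the helper name decode_with are all hypothetical, and cp037 is just one EBCDIC code page among several. It decodes the same octets under encodings that contain US-ASCII as a subset and under an EBCDIC code page.

```python
# Octets of a tiny HTML fragment, written as US-ASCII (sample data, assumed).
octets = "<p>Hi</p>".encode("ascii")

# A few encodings that contain US-ASCII as a subset (illustrative selection).
ASCII_SUPERSETS = ["ascii", "latin-1", "cp1252", "utf-8"]


def decode_with(data, encodings):
    """Return a dict mapping each encoding name to the decoded text."""
    return {name: data.decode(name) for name in encodings}


# Every ASCII-compatible encoding yields the same characters.
compatible = decode_with(octets, ASCII_SUPERSETS)

# cp037 is one EBCDIC code page available in Python's standard codecs;
# the same octets decode to entirely different characters there.
ebcdic_text = octets.decode("cp037")

for name, text in compatible.items():
    print(f"{name:8} -> {text!r}")
print(f"{'cp037':8} -> {ebcdic_text!r}")
```

Under these assumptions, the ASCII-compatible decodings all agree, which is why guessing works in practice, while the EBCDIC decoding shows that the octets alone do not determine the characters.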
Received on Tuesday, 8 May 2007 19:17:35 UTC