Re: Should text/html be parsed as SGML or XML? from Masayasu Ishikawa on 2001-10-08 (www-validator@w3.org from October 2001)

From: Masayasu Ishikawa <mimasa@w3.org>
Date: Mon, 08 Oct 2001 16:06:33 +0900 (JST)
To: www-validator@w3.org
Message-Id: <20011008.160633.41638029.mimasa@w3.org>

Nick Kew <nick@webthing.com> wrote:

> I recollect reading some years ago in what I think was an official
> W3C spec (probably for HTML 3.2 or 4.0) that for back-compatibility,
> legacy documents should be parsed as HTML 2.0 in the absence of an
> FPI.  Am I going senile, or has this been completely abandoned?

Abandoned.  "B.1 Notes on invalid documents" of HTML 4 [1] says:

    The HTML 2.0 specification ([RFC1866]) observes that many HTML 2.0
    user agents assume that a document that does not begin with a document
    type declaration refers to the HTML 2.0 specification.  As experience
    shows that this is a poor assumption, the current specification does
    not recommend this behavior.

Meanwhile, IMHO, validators should parse a text/html document that
lacks a document type declaration as an SGML document rather than an
XML document.
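
The fallback suggested above can be sketched roughly as follows. This is
only an illustration, not code from any actual validator: the names
`has_doctype` and `parse_mode`, the regular expression, and the handling
of every case other than "text/html without a document type declaration"
are assumptions made for the example.

```python
import re

# Hypothetical helpers for illustration; not taken from any real validator.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s", re.IGNORECASE)


def has_doctype(text: str) -> bool:
    """Return True if a document type declaration appears anywhere in the text."""
    return DOCTYPE_RE.search(text) is not None


def parse_mode(content_type: str, text: str) -> str:
    """Pick 'sgml' or 'xml' for a document (illustrative policy only)."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()

    if media_type == "text/html" and not has_doctype(text):
        # The case discussed in this message: no DOCTYPE, so parse as SGML.
        return "sgml"

    if media_type in {"text/xml", "application/xml", "application/xhtml+xml"}:
        # Assumption: XML media types are always handed to the XML parser.
        return "xml"

    # Assumption: anything else (including text/html with a DOCTYPE)
    # defaults to SGML in this sketch; a real validator would inspect the FPI.
    return "sgml"
```

For example, `parse_mode("text/html; charset=utf-8", "<html><title>x</title></html>")`
selects the SGML parser because no document type declaration is present.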

[1] http://www.w3.org/TR/html4/appendix/notes.html#h-B.1

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Monday, 8 October 2001 03:06:44 UTC