Re: Validator errors from Dan Connolly on 2000-01-31 (www-validator@w3.org from January 2000)

From: Dan Connolly <connolly@w3.org>
Date: Mon, 31 Jan 2000 00:09:08 -0500 (EST)
To: www-validator@w3.org
Message-ID: <3895188F.4DE96C71@w3.org>
> Message-Id: <200001302255.XAA20439@vals.intramed.rito.no>
> Date: Sun, 30 Jan 2000 23:58:03 +0100
> From: Terje Bless <link@tss.no>

> On 30.01.00 at 13:22, Kynn Bartlett <kynn@idyllmtn.com> wrote:
> 
> >2.  Gerald needs to rethink the utility of having the default be
> >     XHTML 1.0.  While I can see -why- he'd choose this -- the W3C
> 
> Oh Crap!
> 
> This explains the rush of weird error reports the last couple of days. :-(
> 
> You can't make XHTML the default for documents without a DOCTYPE; it'll
> break just about anything out there.

Er... you mean it'll start complaining that everything out there
is broken, no?

> I thought the idea of serving XHTML as
> text/html was pure idiocy to start with, but if you start assuming it's
> XHTML in the validator you've thoroughly broken backwards compatibility.

Was doctype-sniffing a documented feature of the validator? If so,
I think Gerald's idea makes sense:

	"I'm assuming XHTML; if you don't want that, here's info on adding
	an HTML doctype..."

If you're talking about backwards compatibility with HTML specs, none
was promised for documents with no <!DOCTYPE...>:

"authors must include one of the following document type
declarations in their documents"
	-- http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.2

"The HTML 2.0 specification ([RFC1866]) observes that many HTML 2.0
user agents assume that a document that does not begin with a document
type declaration refers to the HTML 2.0 specification. As experience
shows that this is a poor assumption, the current specification does
not recommend this behavior."
	-- http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.1

(those words are unchanged since the Dec '97 version of the spec
http://www.w3.org/TR/REC-html40-971218/ )

"3.3. HTML Public Text Identifiers

   To identify information as an HTML document conforming to this
   specification, each document must start with one of the following
   document type declarations. ..."
	-- http://www.ietf.org/rfc/rfc1866.txt


Earlier in this thread, Kynn Bartlett wrote:

>>1.  Your page does not specify a DTD with a doctype statement.
>>     This means that your level/flavor of HTML is undefined.

No, that means the document doesn't conform to any of the HTML 2.0,
HTML 3.2, HTML 4.0, or HTML 4.01 specs, and it's not a strictly
conforming XHTML document. The validator is testing to see
if it's an XML document with the XHTML namespace.
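
For illustration, a minimal sketch of that kind of test, assuming it
amounts to "well-formed XML whose root element is html in the XHTML
namespace". The function name is hypothetical and this is not the
validator's actual code; it checks well-formedness only, not validity
against a DTD.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the XHTML 1.0 Recommendation.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_xhtml_namespace_document(text: str) -> bool:
    """Return True if text parses as XML and its root is html in XHTML_NS.

    Hypothetical helper: a document using undeclared entities such as
    &nbsp; fails here because no DTD is loaded.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be XHTML.
        return False
    # ElementTree spells namespaced tags as "{uri}localname".
    return root.tag == "{%s}html" % XHTML_NS
```

A typical tag-soup HTML page with unclosed elements fails the parse
step, which is why such pages get reported as broken under this check.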

>>  It's
>>     clear from your comments that you want HTML 4.01, but you don't
>>     say that, so the Validator (and any other user agent) is free
>>     to simply "guess."

User agents are free to do anything with documents that don't conform
to the spec. The validator was likewise free, but I sure hope
it didn't award any prizes for documents with no <!DOCTYPE...>.


> The only way to handle this that won't break badly is to assume that
> text/xml is XML, text/xhtml is XHTML, text/html is HTML 4.01[0], unless a
> DOCTYPE is given in which case the DOCTYPE is used.

Er... would you please support that claim with some evidence or
an argument? I find that XHTML served up as text/html works quite
nicely; e.g.

	http://www.w3.org/Protocols/rfc2616/rfc2616.html

XHTML is the only HTML dialect where a <!DOCTYPE...> isn't required,
so it makes perfect sense to check for XHTML when you don't see one.
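
The two fallback policies under discussion can be put side by side in
a rough sketch. All names here are hypothetical, the DOCTYPE regex is a
simplification (it ignores internal subsets), and neither function is
the validator's real logic; the media-type table is taken straight from
the quoted proposal.

```python
import re

# Simplified DOCTYPE matcher; assumes no internal subset containing ">".
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def find_doctype(text):
    """Return the DOCTYPE declaration found in text, or None."""
    match = DOCTYPE_RE.search(text)
    return match.group(0) if match else None


def by_media_type(content_type, text):
    """Quoted proposal: a DOCTYPE wins, otherwise the media type decides."""
    doctype = find_doctype(text)
    if doctype:
        return "doctype: " + doctype
    defaults = {
        "text/xml": "XML",
        "text/xhtml": "XHTML",
        "text/html": "HTML 4.01",
    }
    return defaults.get(content_type, "unknown")


def xhtml_when_missing(text):
    """Alternative: a DOCTYPE wins, otherwise check the document as XHTML."""
    doctype = find_doctype(text)
    return "doctype: " + doctype if doctype else "XHTML"
```

Both policies agree whenever a DOCTYPE is present; they differ only for
documents without one, which is the whole disagreement in this thread.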

> I was afraid this was due to bugs in my DOCTYPE guessing code,

The whole idea of DOCTYPE guessing was pretty goofy, if you ask
me. It just seems to encourage folks to put documents on
the web that don't match the specs, and there are plenty of tools to
help you do that without adding the validator to the list ;-)

-- 
Dan Connolly
http://www.w3.org/People/Connolly/
Received on Monday, 31 January 2000 00:10:49 UTC