Re: Beta: Fatal Error: No DOCTYPE specified! from Terje Bless on 2002-10-25 (www-validator@w3.org from October 2002)

From: Terje Bless <link@pobox.com>
Date: Fri, 25 Oct 2002 18:04:24 +0200
To: W3C Validator <www-validator@w3.org>
cc: Bjoern Hoehrmann <derhoermi@gmx.net>
Message-ID: <a01060005-1021-6E8443D4E83311D6AC5400039300CF5C@[193.157.66.10]>

Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

>* Terje Bless wrote:
>>These issues as a whole have been assigned bug number #7.
>
>Without the public being able to keep track of these bugs, it's quite
>useless to inform it on bug numbers...

The bug numbers will be referenced in release notes for future versions so
you'll be able to check whether your bug was addressed. The reason the bug
database (whose age is still counted in hours and not days, BTW!) is not
public right now is that it's something I set up with my left hand and my
eyes closed. IOW, I don't trust the box to be stable, much less hardened
enough to expose it to the general public.

There is an action going to set up something more permanent, but there
needs to be a discussion about whether it should be public or not that
hasn't happened yet (in this context, "public" means "anyone can create and
change bugs" not "anyone can look at and query bugs").


>>>1) I don't see any good reason to refuse validation completely.
>>>   It's very simple to choose document types to default to
>>
>>This is a conscious choice;
>
>Conscious or not, it's a bad choice.

Ok, let me rephrase. "It's not a bug, we intended it to work that way."


>The validator should especially help those people, who are not
>standard-experts;

Agreed.


>those will get very frustrated using the validator if it always refuses
>to check their documents for no good reasons.

Ah, but the thing is, this happens to be, IMO, a _very_ good reason.


>Even if they take the advice, download the document from their server,
>modify the document type declaration, upload it again, and then validate
>it again, they will be confronted with the validator to refuse
>validation because of a missing encoding declaration. They may then
>again try to fix this problem. If they'll then be able to validate, they
>probably get dozens of error messages they don't understand. That's far
>away from what I would call usable.

Ok. But then let's try to restate the problem. The problem isn't that we
refuse to validate in the absence of a DOCTYPE, the problem is that the
current user interface and error messages aren't informative enough, or
they are confusing and/or obscure. Now how can we improve on that?

Perhaps we could improve the wording of the message? Point to a longer and
more explanatory text about the issue? Maybe we should try to detect all
these "fatal" errors in one go so you get told about both the DOCTYPE issue
and the character encoding issue the first time?
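To make the "one go" idea concrete, here is a minimal sketch in Python. It is not the validator's actual code: the function name preflight, the http_charset parameter, the regular expressions and the message strings are all assumptions made up for illustration. It runs every check and collects the results instead of stopping at the first failure. The DOCTYPE test is deliberately naive (a DOCTYPE inside a comment would fool it).

```python
import re

# Hypothetical patterns; illustrative only, not the validator's own checks.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s", re.IGNORECASE)
XML_ENCODING_RE = re.compile(
    r"""<\?xml[^>]*\bencoding\s*=\s*["'][A-Za-z0-9._-]+["']"""
)
META_CHARSET_RE = re.compile(
    r"""<meta[^>]+charset\s*=\s*["']?[A-Za-z0-9._-]+""", re.IGNORECASE
)


def preflight(markup, http_charset=None):
    """Return every fatal problem found, rather than stopping at the first."""
    problems = []

    # Check 1: is there a document type declaration anywhere in the markup?
    if not DOCTYPE_RE.search(markup):
        problems.append("No DOCTYPE specified")

    # Check 2: is a character encoding declared in at least one place
    # (HTTP header, XML declaration, or a meta element)?
    has_encoding = (
        http_charset is not None
        or XML_ENCODING_RE.search(markup) is not None
        or META_CHARSET_RE.search(markup) is not None
    )
    if not has_encoding:
        problems.append("No character encoding specified")

    return problems
```

With something like this the user would see both complaints on the first attempt, and the error page could list them together.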

The obscure error messages are a very real problem, yes, and one that we'd
dearly like to improve on. One way you can help with that is by looking at
/docs/errors.html and writing new text for error messages we haven't
covered yet, and improving on the existing text (which in some cases is
rather out of date).


We spent a considerable amount of effort on dispensing with the need to
"guess" DOCTYPEs for a reason. For a lint it would be a very good thing,
but for something that aspires to "formal validator" status, it just
doesn't wash. Perhaps if it hadn't been tried and found to be an utter
unmaintainable mess you would have stood a chance at convincing me, but
right now I think it highly unlikely that this will change.

It certainly won't change for /this/ version (too big a change to do it
between a beta and release), but we'll be branching the next version as
soon as this one is out the door.


>>>2) the page should display the revalidate form
>>Agreed, but it may be tricky to implement.
>
>Not my fault :-)

Are you sure? I /need/ someone to blame it on. Maybe the dog did it? Yeah,
that's it. The dog did it! The mangy mongrel! :-)

I'll try to put this together, but it may have to wait for the next version
(I hope not, but...).


>>>3) It's "document type declaration", not "DOCTYPE declaration", please
>>>keep the terminology straight
>>
>>Actually, AFAICT, it's "Document Type Definition" and "DOCTYPE
>>Declaration".
>
>http://www.w3.org/TR/REC-xml#dt-doctype
>
>You won't find the term "DOCTYPE declaration" in that document.

Neither will you find a normative reference to SGML in that document...

But I have to admit I'm not certain about this (where is Arjun Ray when you
need him? *sigh*). I'll try to dig up a few SGML gurus and put it to
them. But for now <URL:http://www.w3.org/TR/REC-xml#dt-doctype>:

  [Definition: The XML document type declaration contains or points to
  markup declarations that provide a grammar for a class of documents.
  This grammar is known as a document type definition, or DTD. The
  document type declaration can point to an external subset (a special
  kind of external entity) containing markup declarations, or can contain
  the markup declarations directly in an internal subset, or can do both.
  The DTD for a document consists of both subsets taken together.]

I suspect the term "document type declaration" as used here is a
bowdlerization of the original SGML terminology.

But again, I'm no expert on this so I could well be wrong.



>>>4) The phrase talks about a "first line", while the document type
>>>declaration in the example takes two lines
>>
>>Yeah, thanks, good catch! I'll fix it ASAP. How does "At the very
>>beginning of your document" sound to you?
>
>Wrong. Depending on the point of view you may have a BOM, an XML
>declaration, processing instructions, comments, etc. prior to the
>document type declaration. "Prior to the root element".

Didn't you just take me to task for the "unfriendliness" of that message?
The phrase "prior to the root element" is friendly?!?!? :-)
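For what it's worth, the rule being described (a byte order mark, an XML declaration, processing instructions and comments may all come first, but the declaration must appear before the root element) can be sketched as below. This is an illustration in Python only, not how the validator does it; the function name and the regular expressions are assumptions, and the XML declaration is simply treated as one more processing instruction.

```python
import re

# Things that may legitimately precede the document type declaration:
# whitespace, comments, and processing instructions (including, for this
# sketch, the XML declaration).
PROLOG_MISC_RE = re.compile(r"\s+|<!--.*?-->|<\?.*?\?>", re.DOTALL)
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s", re.IGNORECASE)


def doctype_precedes_root(markup):
    """True if a document type declaration occurs before the root element."""
    # Skip a byte order mark if the decoded text still carries one.
    pos = 1 if markup.startswith("\ufeff") else 0

    # Skip any run of whitespace, comments and processing instructions.
    while True:
        m = PROLOG_MISC_RE.match(markup, pos)
        if m is None:
            break
        pos = m.end()

    # Whatever comes next must be the document type declaration.
    return DOCTYPE_RE.match(markup, pos) is not None
```

Which is precise, but it also shows why "prior to the root element" is hard to turn into a one-line hint for someone who has never heard of a prolog.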



>The only noteworthy undesirable thing connected to the XML declaration
>is that Microsoft Internet Explorer for Windows 6.0 switches back to
>bugwards-compatible processing of XHTML documents if they contain an XML
>declaration.

This is actually rather serious. We try to encourage tag-soup authors to
switch to standards compliant markup. A great deal of that battle is
convincing them that they can achieve predictable behaviour from browsers
with standards compliant markup. Explaining "standards mode" is hard
enough; if putting in even more "standards compliant markup" (i.e. the XML
Decl) makes browsers _less_ standards compliant... That is very, very bad!



>[...] I don't see a good reason to ignore XHTML 1.0's "Use both the
>lang and xml:lang attributes when specifying the language of an element"
>and thereby advertising "incompatible" markup.

Hmmm. Do me a favour and play devil's advocate for a moment. What are the
_downsides_ to including both?

-- 
> ...publicity rights, moral rights, and rights against unfair competition...
Well, you've got me there.   I have no idea what any of those have to do with
SGML. Next you'll be claiming that running NSGMLS constitutes an unauthorized
public performance of SGML.                                  -- Richard Tobin
Received on Friday, 25 October 2002 12:04:29 UTC