Validator bugs/Improvements

From: Martin J. Duerst
Date: Wed, Sep 22 1999

Date: Wed, 22 Sep 1999 17:12:02 +0900
From: "Martin J. Duerst"
Cc: Mark Davis
Subject: Validator bugs/Improvements

Hello Gerald,

Here are some proposals for improving our validator.

The UTF-8 problem is noted on
and has been reported earlier. It is rather urgent, as we plan
to publish a WD in UTF-8 soon, and also because accepting XHTML
(as XML) requires that UTF-8 and UTF-16 are accepted.

I think the easiest way to fix this is to upgrade to SP 1.3,
which does most of this stuff internally; Takuya's code would
then just be used to set the right parameter.
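
As a rough illustration of the check involved (a sketch only, in Python,
and not SP's or the validator's actual code): once the charset parameter
is known, the validator has to confirm that the bytes really decode under
it. The function name `check_encoding` and the message wording are
assumptions made for this example.

```python
import codecs


def check_encoding(data: bytes, charset: str = "utf-8"):
    """Return (ok, message) saying whether `data` decodes under `charset`.

    Hypothetical helper for illustration; the charset name would come from
    whatever parameter the charset-detection code hands over.
    """
    try:
        codecs.lookup(charset)  # reject charset names Python does not know
    except LookupError:
        return False, f"unknown charset: {charset}"
    try:
        data.decode(charset)  # strict decoding raises on malformed input
    except UnicodeDecodeError as err:
        return False, f"invalid {charset} byte sequence at offset {err.start}"
    return True, "ok"
```

For example, `check_encoding("Dürst".encode("utf-8"))` succeeds, while the
same name encoded as Latin-1 is rejected when checked as UTF-8.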

> Date: Mon, 20 Sep 1999 06:54:53 -0700
> From: Mark Davis
> To: Martin Duerst, Mark Davis
> CC: Misha Wolf
> Subject: Validator bug
>
> Martin,
> I was using validator recently, and again found that it did not accept
> UTF-8 characters. Can you ask the person responsible to fix that?
> Also, there are two other items it would be good to add:
> 1. As a practical matter, the validator should put out a warning if the
> page does not have a charset tag. While the explicit charset is not
> required, essentially almost every browser is set to a default charset
> which is not Latin-1. While for Western Europe that is 1252 - which is
> pretty harmless - other settings will often misinterpret the contents of
> the page.

This is indeed very true. In the warning, you could for example point to,
which I just updated a bit.
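
A minimal sketch of such a warning, in Python, assuming the validator has
the HTTP Content-Type header value and the raw page bytes at hand. The
function names, the 4096-byte scan window, and the warning text are all
assumptions for illustration, not the validator's real interface.

```python
import re
from typing import Optional

# charset=... inside a <meta> tag, e.g.
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE
)
# charset=... parameter of an HTTP Content-Type header value
_HEADER_CHARSET = re.compile(r"charset\s*=\s*\"?([\w.:-]+)", re.IGNORECASE)


def declared_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Return the declared charset in lowercase, or None if none is declared."""
    if content_type:
        match = _HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    # Only scan the start of the page; the declaration belongs in <head>.
    match = _META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode("ascii").lower()
    return None


def charset_warning(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Return a warning string when no charset is declared, else None."""
    if declared_charset(content_type, body) is None:
        return ("Warning: no charset declared; browsers will fall back to "
                "their own default, which may not match this page.")
    return None
```

The HTTP header is consulted first and the meta tag second; a page with
neither gets the warning but is still validated.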

> 2. The HTML that is actually checked is different than the source;
> various stuff is inserted or changed (like the DOCTYPE). While that is
> ok, it does cause the line numbering to be wrong, and can cause
> confusion when the line it says needs to be fixed does not resemble the
> source. It would be good to:
> a. Not include inserted lines in the line count.
> b. Color the insertions so people know a change is made.
> Mark
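
One way to approach item 2a, sketched in Python under stated assumptions:
while building the document that is actually checked, record for every
checked line which source line it came from, so that error messages can
report source line numbers and inserted lines can be flagged (and, for 2b,
colored). The names `build_checked_document` and `source_line`, and the
shape of the `insertions` argument, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def build_checked_document(
    source_lines: List[str],
    insertions: Dict[int, List[str]],
) -> Tuple[List[str], List[Optional[int]]]:
    """Merge inserted lines into the source and remember where each came from.

    `insertions` maps a 0-based source index to the lines inserted before
    that source line (e.g. a DOCTYPE before line 0). The returned line_map
    holds, for each checked line, its 1-based source line number, or None
    if the line was inserted.
    """
    checked: List[str] = []
    line_map: List[Optional[int]] = []
    for index, line in enumerate(source_lines):
        for extra in insertions.get(index, []):
            checked.append(extra)
            line_map.append(None)  # inserted line: not in the source
        checked.append(line)
        line_map.append(index + 1)
    return checked, line_map


def source_line(line_map: List[Optional[int]], checked_lineno: int) -> Optional[int]:
    """Translate a 1-based line number in the checked text to the source."""
    return line_map[checked_lineno - 1]
```

With a DOCTYPE inserted before the first source line, an error reported
on checked line 2 maps back to source line 1, and checked line 1 maps to
None, which marks it as an insertion.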

Regards,   Martin.

#-#-#  Martin J. Du"rst, World Wide Web Consortium