
Re: Export Options

From: Michael[tm] Smith <mike@w3.org>
Date: Thu, 7 Jun 2012 17:40:10 +0900
To: John Albaugh <JohnA@vdsys.com>
Cc: www-validator@w3.org
Message-ID: <20120607084009.GY349@sideshowbarker>

John Albaugh <JohnA@vdsys.com>, 2012-06-06 09:57 -0400:

> Thank you for that link Michael.  
> Unfortunately the error(s) and count(s) are different from the
> webservice and the validator.  Using www.vdsys.com as an example, I only
> get 4 errors from the webservice
> http://validator.w3.org/nu/?doc=http://www.vdsys.com&out=text and it
> looks like it had to stop whereas the validator found 16 errors and 8
> warnings.

http://validator.w3.org/check?uri=http://www.vdsys.com is what shows 16
errors and 8 warnings.

http://validator.w3.org/nu/?doc=http://www.vdsys.com shows only 4 errors.

Those are two different services, with different backends and parsers.

The behavior of the parser used by the http://validator.w3.org/nu service
closely matches actual browser behavior: It expects a character encoding to
be declared either in the Content-Type header or in a meta element in the
first 1024 bytes of the document. If it doesn't find one, it assumes
windows-1252 as the default encoding (which matches what most browsers do
in most cases, except that browsers also try to determine a default
encoding based on the user's locale). But then when it finally gets to the
meta element at byte 1800 or whatever of that document, it gives up and
quits parsing any further, because your document is basically asking it
to switch to a different encoding.
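
To make the lookup order above concrete, here is a rough Python sketch of
it. This is not the validator's actual code: the function names, the
constants, and the simplified regular expressions are all assumptions made
for illustration, and a real HTML prescan handles many more cases.

```python
import re
from typing import Optional, Tuple

# Assumed values, taken from the description above rather than from any real source code.
PRESCAN_LIMIT = 1024                  # bytes inspected for a meta charset declaration
FALLBACK_ENCODING = "windows-1252"    # default when nothing is declared early enough

# Simplified patterns (illustrative only); they do not cover every legal syntax.
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)
_HEADER_CHARSET = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def sniff_encoding(body: bytes, content_type: str = "") -> Tuple[str, str]:
    """Return (encoding, source); source is 'header', 'meta', or 'default'."""
    # 1. A charset in the Content-Type header wins.
    header_match = _HEADER_CHARSET.search(content_type)
    if header_match:
        return header_match.group(1).lower(), "header"
    # 2. Otherwise look for a meta declaration, but only in the first 1024 bytes.
    meta_match = _META_CHARSET.search(body[:PRESCAN_LIMIT])
    if meta_match:
        return meta_match.group(1).decode("ascii").lower(), "meta"
    # 3. Nothing found early enough: fall back to the default.
    return FALLBACK_ENCODING, "default"


def meta_charset_offset(body: bytes) -> Optional[int]:
    """Byte offset of the first meta charset declaration, or None if there is none."""
    match = _META_CHARSET.search(body)
    return match.start() if match else None
```

Running meta_charset_offset() on the raw bytes of a page is a quick way to
check whether its declaration sits past the 1024-byte mark; if it does,
sniff_encoding() falls back to windows-1252 for that page unless the
Content-Type header names a charset.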

You can fix the document by putting the meta element with the charset
declaration closer to the top of the document. Or you can explicitly tell
the validator to use a particular encoding and not look for a
character-encoding declaration at all. If you do that for this document
you'll see it parses the whole document and reports a lot more errors:


Michael[tm] Smith http://people.w3.org/mike
Received on Thursday, 7 June 2012 08:40:15 UTC
