
Re: Error Message Feedback

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Sun, 23 Jul 2006 23:55:29 +0300 (EEST)
To: Alex Ivanov <alex@3dplumbing.net>
cc: www-validator@w3.org
Message-ID: <Pine.GSO.4.64.0607232348500.26737@korppi.cs.tut.fi>

On Sat, 22 Jul 2006, Alex Ivanov wrote:

> Quite often Markup Validation Service says something like:
> "Sorry, I am unable to validate this document because on line 479, 524 it 
> contained one or more bytes that I cannot interpret as us-ascii"

The error message is rather self-explanatory. The reason for the error can 
be more obscure, but you would need to specify the URL of the page as well 
as how you used the validator (did you use "encoding override"?) to get 
help with that.

> But my pages don't contain any visible mistakes

Then they are apparently invisible errors.

> and that Linux Tidy says that 
> everything is OK.

Tidy is not a validator.

> But when I apply CSE HTML Validator Pro v7.0 tidy option, 
> your service validates my HTML code immediately.

"CSE HTML Validator" is a product that is sold as a validator, but it's 
not a validator. It may have an option that causes data to be changed in 
the sense that some characters are represented as entity references or as 
character references, so this might explain the problem.
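
To illustrate that conjecture (this is not a description of what CSE HTML 
Validator actually does), here is a minimal Python sketch of replacing 
non-ASCII characters with numeric character references. The function name 
and the sample string are made up for the example.

```python
def to_ascii_with_char_refs(text: str) -> bytes:
    """Encode text as US-ASCII, writing every non-ASCII character
    as a numeric character reference such as &#233;."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode("ascii", errors="xmlcharrefreplace")


# Hypothetical sample: "caf" followed by U+00E9 (e with acute accent).
print(to_ascii_with_char_refs("caf\u00e9"))  # b'caf&#233;'
```

After such a conversion every octet is below 128, so the data can be 
interpreted as US-ASCII regardless of the declared encoding.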

> I'd like to know which bytes "can't be interpreted as us-ascii" 
> and how to fix this problem.

Well, you know the URL and we don't, so you are in a much better position 
to find that out. The general answer is that any octet with the most 
significant bit set, i.e. any octet with a value from 128 to 255, has no 
interpretation as US-ASCII data. But did you actually mean to use 
US-ASCII? The URL might reveal that... Meanwhile, my conjecture is that 
your documents are actually meant to be encoded using some 8-bit encoding 
but the encoding has not been properly declared.
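
If you want to locate the offending octets yourself, something like the 
following Python sketch might help. It assumes you have the raw bytes of 
the page at hand; the function name and the sample data are hypothetical, 
and lines are split on LF only.

```python
def find_non_ascii(data: bytes):
    """Return (line, column, octet value) for every octet >= 128."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, octet in enumerate(line, start=1):
            if octet >= 0x80:  # most significant bit set: not US-ASCII
                hits.append((line_no, col, octet))
    return hits


# Hypothetical sample: octet 0xE9 (233) sits on line 2, column 7.
sample = b"<p>plain</p>\n<p>caf\xe9</p>\n"
print(find_non_ascii(sample))  # [(2, 7, 233)]
```

For a saved copy of the page, read it in binary mode, e.g. 
find_non_ascii(open("page.html", "rb").read()), where "page.html" is a 
placeholder file name. Which character each reported octet stands for 
depends on the encoding the document was meant to use.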

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Sunday, 23 July 2006 20:55:40 UTC
