[Bug 1833] Wrong ISO-8859-1 encoding behaviour on "Direct input"

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1833





------- Additional Comments From ot@w3.org  2005-10-24 06:19 -------
(In reply to comment #13)
> I don't understand how they are equivalent for the end user.  If I upload a file without a meta tag and
> try to validate, I will see a warning about Character Encoding.
> 
> if I paste in the *exact same file* and submit using the form, then I don't see this warning.

This comment certainly points to a problem in the user interface. However, it is not really relevant to 
this bug. This bug was about how the data sent to the direct input is handled internally by the validator, 
not about the (inevitable, see below) slight differences in messages between the different input 
methods.

When you validate by URL, the validator has a number of sources from which to draw the character 
encoding:
- the HTTP Content-Type header (its charset parameter)
- (if XHTML) the encoding given in the XML declaration
- the meta http-equiv element

If none of these sources gives a satisfactory result, the validator issues a warning and tries to validate 
with a default character encoding.
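
The lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the validator's actual code: the function names, the regular expressions, the UTF-8 fallback and the order in which the sources are consulted (taken here from the order of the list above) are all assumptions.

```python
import re

DEFAULT_ENCODING = "utf-8"  # assumed fallback; the validator's real default may differ

def charset_from_content_type(header):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not header:
        return None
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)"?', header, re.IGNORECASE)
    return match.group(1).lower() if match else None

def charset_from_xml_declaration(body):
    """Return the encoding named in a leading <?xml ... ?> declaration, or None."""
    match = re.match(rb'\s*<\?xml[^>]*?encoding\s*=\s*["\']([\w.:-]+)["\']', body)
    return match.group(1).decode("ascii").lower() if match else None

def charset_from_meta(body):
    """Return the charset from a <meta http-equiv="Content-Type"> element, or None."""
    for tag in re.findall(rb'<meta\b[^>]*>', body, re.IGNORECASE):
        if re.search(rb'http-equiv\s*=\s*["\']?content-type', tag, re.IGNORECASE):
            match = re.search(rb'charset\s*=\s*([\w.:-]+)', tag, re.IGNORECASE)
            if match:
                return match.group(1).decode("ascii").lower()
    return None

def detect_encoding(content_type_header, body):
    """Return (encoding, warning); warning is None when a declaration was found."""
    sources = (
        lambda: charset_from_content_type(content_type_header),
        lambda: charset_from_xml_declaration(body),
        lambda: charset_from_meta(body),
    )
    for source in sources:
        encoding = source()
        if encoding:
            return encoding, None
    # Nothing declared anywhere: fall back to the default and warn the user.
    return DEFAULT_ENCODING, "No character encoding information found; using default."
```

For file upload, the same function would be called with no Content-Type value (None), so only the last two sources can answer.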

Same goes for the file upload interface, except that there is no "server" serving the document.

For the direct input interface, there is no "document" being served, just a string of text pasted into 
a form on a page served as utf-8. The text is therefore submitted as utf-8 and parsed as utf-8. Since 
there is no need for character encoding detection, there will be no warning about the lack of character 
encoding information.
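
To illustrate the difference with a small Python sketch (the sample text and the assumption that the uploaded file was saved as ISO-8859-1 are mine, chosen only for the example): the same text reaches the validator as different bytes depending on the input method.

```python
text = "caf\u00e9"  # "café", what the user sees and pastes

# File upload: the bytes as stored on disk, assumed here to be ISO-8859-1.
uploaded = text.encode("iso-8859-1")

# Direct input: the form lives on a utf-8 page, so the text is submitted as utf-8.
pasted = text.encode("utf-8")

print(uploaded)  # b'caf\xe9'
print(pasted)    # b'caf\xc3\xa9'

# The pasted bytes decode as utf-8 by construction, so nothing has to be detected.
assert pasted.decode("utf-8") == text

# The uploaded bytes are not valid utf-8; without a declaration the encoding
# has to be guessed, which is where the warning comes from.
try:
    uploaded.decode("utf-8")
    upload_is_utf8 = True
except UnicodeDecodeError:
    upload_is_utf8 = False
print(upload_is_utf8)  # False
```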

Perhaps there should be, with all validation results for direct input and file upload, a note explaining: 
"Your document has been validated using the validator's direct input / file upload interface. When 
online, this document will need to declare its character encoding. You can set up the declaration of 
character encoding by either setting up the Web server properly, or by declaring the character encoding 
within each document. (read more). We encourage you to check the document again when it is online."

Received on Monday, 24 October 2005 06:19:36 UTC