- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Thu, 20 Oct 2005 11:04:03 +0300 (EEST)
- To: Meg Crockett <meg_crockett@yahoo.com>
- Cc: www-validator@w3.org
On Wed, 19 Oct 2005, Meg Crockett wrote:

> I can only get a tentative because it cannot
> find character encoding.

The message about "tentatively valid" is very confusing, and also incorrect. A more appropriate wording would be that the document is valid when interpreted in the encoding such-and-such. The fact that the encoding should be specified for a document on the Web is external to the question of validation. When the encoding has not been specified, the validator cannot know whether it has correctly interpreted the task given to it.

I have no idea why the validator thinks it does not know the encoding of an _XHTML_ document, or more exactly an XML document, since for XML, the encoding is defined by the specifications anyway. (It might not be the encoding that the author really meant, but that's a different story.) For good old HTML, or SGML, the situation is different.

> But I simply cannot
> understand your documentation. I don't want long
> descriptions, or slide shows. I want about 3 examples
> which are exactly how they should appear in the code
> including brackets.

Then you have really misunderstood the documentation. The FAQ entry at
http://validator.w3.org/docs/help.html#faq-charset
tries to tell you that this is _not_ a matter of adding some tags into your document. The correct way to handle the issue is to make the server (or, in the case of validation by file submit, the browser) specify the encoding, and this is inevitably a server-specific issue. The answer might be very simple (it usually is), or somewhat complicated, or even in the negative (you cannot do it because the server admin prevents it).

I'm pretty sure that the documentation intentionally avoids saying the following. I can understand the reasons behind this, and I mostly agree. But people will find this information anyway, so maybe it would be better to include it, since then you could accompany it with warnings and caveats.

Add the following to your <head> element:

  <meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1">

Replace iso-8859-1 by whichever encoding you are using. Replace ">" by " />" if (and only if) you declare an XHTML document type; in that case, also start your document with

  <?xml version="1.0" encoding="iso-8859-1" ?>

> Then I'd like about 2 other
> examples of how you might slightly modify these if
> your situation were different than the first three.
> Then a table to use to find all the likely differences
> so you could tell what to put in if you live in outer
> Mongolia and have some obsure system, or whatever
> other considerations one needs to include.

Well, you need to know the encoding you are using. The validator cannot really tell you such things; _you_ need to tell _it_ the encoding. If you need help with knowing what encoding your authoring tool produces, then you could look at its documentation.

> I can't help but think that more people would validate
> their pages if this character encoding lack of
> documentation were not such a huge hurtle.

I doubt that. Most of the commonly used authoring tools actually spit out a <meta> tag that makes the validator happy, though the information in it could actually be wrong sometimes.

> I cannot
> believe character encoding is really that complicated.

It's actually much more complicated. However, once you've found out which encoding you are using and how to check that your server sends the right information about it, it's very simple. The problem is that the simple answers to these questions vary by the author's situation.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
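
As a rough illustration of checking what a server says about the encoding, the following Python sketch extracts the charset parameter from a Content-Type header value. The function name declared_charset is made up for this example, and the sketch only reads the header; it cannot tell whether the declared encoding matches the bytes actually sent.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the charset named in a Content-Type header value, or None.

    `declared_charset` is a hypothetical helper for illustration only.
    """
    if not content_type:
        return None
    # Reuse the standard library's header parameter parsing.
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns None
    # when no charset parameter is present.
    return msg.get_content_charset()


if __name__ == "__main__":
    for value in ("text/html;charset=iso-8859-1", "text/html"):
        print(repr(value), "->", declared_charset(value))
```

A result of None corresponds to the situation discussed above: the server has not specified the encoding, so the validator has to guess. To try this against a live server, one could pass in the Content-Type header from an HTTP response, for example as obtained with urllib.request.urlopen.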
Received on Thursday, 20 October 2005 08:04:09 UTC