- From: Frank Tiggelaar <frankti@xs4all.nl>
- Date: Mon, 10 Dec 2001 23:23:44 +0100
- To: "Peter K. Sheerin" <pete@petesguide.com>
- CC: webmaster@domovina.net, www-validator@w3.org, Martin Duerst <duerst@w3.org>
Peter,

Thank you very much for your reply. Please allow me a couple of lines to describe the position we're in:

Domovina Net has ex-Yugoslavia and the UN war crimes tribunal in The Hague as its main subjects; it is run entirely by volunteers (ranging from a painter to a radio engineer), on donated disk space on 12 servers located in four countries. We have no access to the webserver configs on nine out of twelve.

We started out in 1994; the ever-growing number of pages, the messy way they were laid out and the (poor) quality of our html made us decide in early 2000 to mould all pages into a single layout and to validate new and modified pages on the W3C website - which is what we did with roughly 7,000 pages.

Until a month or so ago I was under the impression that a standard set by the W3C would 'always' remain a standard, i.e. that new magic in html/xml/css would bring about a new standard (HTML 4.2 or 5.0?) and not a change in an existing one.

After seven years on line there are hundreds of links on the web to [specific pages] on Domovina. Making changes in extensions would create hundreds of broken links. Moreover, putting your suggested lines into .htaccess files will only work if we create extra directories in cases where English and EastEu pages now co-reside, and it will only work with Apaches. However, a copy of all files resides on my small ADSL-connected home network on which an IIS/IS offers basic search functionality on all 66K files.

Other than that, parts of our website are integrated with other websites or have other websites reading parts of our pages. For example: our Tribunal Live/Uzivo service (offering the audio from the UN tribunal's Hague courtrooms) is integrated in www.un.org/icty to also offer their visitors 'our' live audio; we are presently setting up a RealServer for the court-audio in Belgrade; all this was designed so the whole service could be shifted easily to RealNetwork servers when stream requests exceed those available to us.
In short: if a page doesn't validate 'on its own' it will never validate [on the servers it may reside on].

I've looked into the option of writing (VB) scripts which would add the missing charset directive to the 'previously-valid' pages, but the problem there is that we still have 1000s of archive pages with dirty html. Writing reliable scripts to create 'valid html' is therefore well nigh impossible: 'good' and 'bad' are mixed in many directories, and tags and directives in the old pages are highly unpredictable.

Please accept that I am grateful for your suggestions and hints, but we see no option to solve this situation the way it should be, without going over all the pages we already validated (or thought to have validated, given the 'congratulations' messages from the W3C Validator earlier on). I can motivate my team members to do that after all the time they already devoted to the first pass.

Remains my criticism of W3C: their software validated all these pages and then, in the next version, had second thoughts. You are perhaps right about us having it wrong all the time - in light of the small print in the definitions - but were we wrong to assume that the standards organisation's webpages would offer us a reliable check of their own standards? After all, overlooking a missing directive is not a bug in the validator software, but a rather serious mis-implementation of one's own standards - if I take your remark "but the recent changes to the validator did not make your documents invalid--they already were" at face value.

Because of that (what I consider to be a) glitch we're in trouble, or have spent weeks of our free time to no avail. That is frustrating, and I hope it explains the admittedly rather bitchy tone of my posting. My apologies to anyone in this ng whom I offended in any way.

Yours faithfully

Frank Tiggelaar
Domovina Net

Peter K. Sheerin wrote:
>
> Frank,
>
> I too, hate moving goalposts, but the recent changes to the validator did
> not make your documents invalid--they already were.
>
> (Below you'll find a single line you can add to your server config that
> should fix all your documents, saving you the trouble of changing all the
> pages that you've got the validator logo on.)
>
> That said, I'm not sure the responses you've received have been quite
> useful, and your initial message was also not helpful. Instead of asking why
> this change was made, and asking for help understanding or fixing the
> problem, you simply came to a conclusion that you should stop validating
> your documents and remove the link to the validator.
>
> What someone on this list should have done first is to tell you how to
> easily fix the charset problem--as Martin alludes to below. In fact, I
> should have done this, because I had the same problem when I first started
> validating my pages, and thus knew the answer.
>
> What you should do (whether or not you decide to continue to use the validator),
> is to add this line to your Apache config:
>
>     AddDefaultCharset utf-8
>
> Note that this can usually be done in an .htaccess file, if you don't have
> permission to change the main server configuration file. If you need a
> little finer control, namely the ability to specify different character sets
> for different pages, then you can use something like this:
>
>     AddType text/html;charset=utf-8 .html
>     AddType text/html;charset=iso-8859-1 .htm
>     AddType text/html;charset=ISO646-DE .de
>
> This would allow you to have your ".html" pages sent as UTF-8, your ".htm"
> pages sent as ISO Latin-1, and your ".de" pages sent with the ISO German
> character set, for instance.
>
> > At 05:49 01/12/03 -0500, Frank Tiggelaar wrote:
> > >Over the past year we have taken great care to validate all new pages
> > >and pages on our site which were changed in any way. We added the small
> > >W3C logo to all of the pages we validated. Recently we found out that
> > >none of the pages which validated some time ago are validated today -
> > >suddenly 'character encoding' has become required.
> > >
> > >Therefore we stopped validating our pages; we shall remove all 7,000
> > >little W3c-validated logos from our websites.
> >
> > It would have been much easier to add a line or so of directives
> > to your Apache server setup. And that would also have improved
> > worldwide access to and readability of your site.
> >
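
The script idea raised in this message (adding the missing charset directive to previously valid pages) might look roughly like the sketch below. It is only a minimal illustration, written in Python rather than VB; the function name add_charset, the two regular expressions and the status strings are assumptions made for this example and are not taken from anything Domovina Net actually runs. It inserts a meta declaration right after the opening head tag and deliberately leaves any page without a recognisable head tag untouched, since the older archive pages are described as too unpredictable to rewrite reliably.

```python
import re

# Hypothetical helper patterns (assumptions for this sketch).
# Matches a charset already declared inside a <meta ...> tag, e.g.
# <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">
META_CHARSET = re.compile(r'<meta\b[^>]*\bcharset\s*=', re.IGNORECASE)
# Matches the opening <head> tag, with or without attributes.
HEAD_OPEN = re.compile(r'<head\b[^>]*>', re.IGNORECASE)


def add_charset(html, charset):
    """Return (html, status); status is 'present', 'added' or 'skipped'."""
    if META_CHARSET.search(html):
        # The page already declares a charset: leave it alone.
        return html, "present"
    head = HEAD_OPEN.search(html)
    if head is None:
        # No recognisable <head>: do not guess, leave for manual review.
        return html, "skipped"
    meta = ('\n<meta http-equiv="Content-Type" '
            'content="text/html; charset=%s">' % charset)
    # Insert the declaration immediately after the opening <head> tag.
    return html[:head.end()] + meta + html[head.end():], "added"


if __name__ == "__main__":
    page = "<html><head><title>Example</title></head><body></body></html>"
    new_page, status = add_charset(page, "iso-8859-2")
    print(status)
    print(new_page)
```

Pages reported as "skipped" would still need to be looked at by hand, and a meta declaration only helps where the server is not already sending a conflicting charset in the HTTP Content-Type header, because the header takes precedence.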
Received on Monday, 10 December 2001 17:23:44 UTC