
Re: Thanks a lot

From: Frank Tiggelaar <frankti@xs4all.nl>
Date: Mon, 10 Dec 2001 23:23:44 +0100
Message-ID: <3C1535F0.7CA65E21@xs4all.nl>
To: "Peter K. Sheerin" <pete@petesguide.com>
CC: webmaster@domovina.net, www-validator@w3.org, Martin Duerst <duerst@w3.org>
Peter,

Thank you very much for your reply. Please allow me a couple of lines to
describe the position we're in: Domovina Net has ex-Yugoslavia and the
UN war crimes tribunal in The Hague as its main subjects; it is run
entirely by volunteers (ranging from a painter to a radio engineer), on
donated disk space on 12 servers located in four countries. We have no
access to the webserver configs on nine out of twelve. 

We started out in 1994; the ever growing number of pages, the messy way
they were laid out and the (poor) quality of our html made us decide in
early 2000 to mould all pages into a single layout and to validate new
and modified pages on the W3C website - which is what we did with
roughly 7,000 pages.

Until a month or so ago I was under the impression that a standard set
by the W3C would 'always' remain a standard, i.e. that new magic in
html/xml/css would bring about a new standard (HTML 4.2 or 5.0?) and not
a change in an existing one.  

After seven years on line there are hundreds of links on the web to
[specific pages] on Domovina. Making changes in extensions would create
hundreds of broken links. Moreover, putting your suggested lines into
.htaccess files will only work if we create extra directories in cases
where English and EastEu pages now co-reside, and it will only work on
Apache servers. A copy of all files also resides on my small
ADSL-connected home network, where an IIS/IS offers basic search
functionality on all 66K files.

Other than that, parts of our website are integrated with other websites
or have other websites reading parts of our pages. For example: our
Tribunal Live/Uzivo service (offering the audio from the UN tribunal's
Hague courtrooms) is integrated in www.un.org/icty to also offer their
visitors 'our' live audio; we are presently setting up a RealServer for
the court-audio in Belgrade; all this was designed so the whole service
could be shifted easily to RealNetwork servers when stream requests
exceed those available to us. 
In short: if a page doesn't validate 'on its own' it will never validate
[on the servers it may reside on]. 

I've looked into the option of writing (VB) scripts which would add the
missing charset directive to the 'previously-valid' pages, but the
problem there is that we still have 1000's of archive pages with dirty
html. Writing reliable scripts to create 'valid html' is therefore well
nigh impossible: 'good' and 'bad' are mixed in many directories and tags
and directives in the old pages are highly unpredictable.
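
A conservative version of such a script might look like the sketch
below. It is only an illustration, written in Python rather than the VB
mentioned above. The function name add_charset and the default charset
iso-8859-1 are assumptions, not anything we actually run. It works on
raw bytes so it never has to guess a file's encoding, it leaves pages
alone if they already declare a charset, and it refuses to touch pages
without a recognisable <head> tag, which is where most of the dirty
archive pages would end up.

```python
import re

# Matches an existing charset declaration inside a <meta> tag.
HAS_CHARSET = re.compile(rb"<meta[^>]+charset\s*=", re.IGNORECASE)
# Matches the opening <head> tag, with or without attributes
# (but not <header> or similar).
HEAD_OPEN = re.compile(rb"<head(\s[^>]*)?>", re.IGNORECASE)


def add_charset(html: bytes, charset: str = "iso-8859-1"):
    """Return (html, status); status is 'added', 'present' or 'skipped'.

    Hypothetical helper: the charset default is an assumption and would
    have to be chosen per page or per directory.
    """
    if HAS_CHARSET.search(html):
        # The page already declares a charset: leave it untouched.
        return html, "present"
    match = HEAD_OPEN.search(html)
    if match is None:
        # No recognisable <head>: too risky to edit automatically.
        return html, "skipped"
    meta = (
        '\n<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s">' % charset
    ).encode("ascii")
    # Insert the declaration directly after the opening <head> tag.
    return html[:match.end()] + meta + html[match.end():], "added"


if __name__ == "__main__":
    sample = b"<html><head><title>Test</title></head><body></body></html>"
    fixed, status = add_charset(sample)
    print(status)
    print(fixed.decode("ascii"))
```

Pages reported as 'skipped' would still need checking by hand, and the
right charset differs between the English and EastEu pages, so this
narrows the manual work rather than removing it.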

Please accept that I am grateful for your suggestions and hints, but we
see no option to solve this situation the way it should be solved
without going over all the pages we already validated (or thought we
had validated, given the 'congratulations' messages from the W3C
Validator earlier on). I cannot motivate my team members to do that
after all the time they already devoted to the first pass.

That leaves my criticism of the W3C: their software validated all these pages
and then, in the next version, had second thoughts. You are perhaps
right about us having it wrong all the time - in light of the small
print in the definitions - but were we wrong to assume that the
standards organisation's webpages would offer us a reliable check of
their own standards?  After all, overlooking a missing directive is not
a bug in the validator software, but a rather serious mis-implementation
of one's own standards - if I take your remark "but the recent changes
to the validator did not make your documents invalid--they already were"
at face value. Because of that (what I consider to be a) glitch we're in
trouble, or have spent weeks of our free time to no avail. That is
frustrating, and I hope it explains the admittedly rather bitchy tone of
my posting. My apologies to anyone in this ng whom I offended in any
way.

 
Yours faithfully

Frank Tiggelaar
Domovina Net




Peter K. Sheerin wrote:
> 
> Frank,
> 
> I too, hate moving goalposts, but the recent changes to the validator did
> not make your documents invalid--they already were.
> 
> (Below you'll find a single line you can add to your server config that
> should fix all your documents, saving you the trouble of changing all the
> pages that you've got the validator logo on.)
> 
> That said, I'm not sure the responses you've received have been quite
> useful, and your initial message was also not helpful. Instead of asking why
> this change was made, and asking for help understanding or fixing the
> problem, you simply came to a conclusion that you should stop validating
> your documents and remove the link to the validator.
> 
> What someone on this list should have done first is to tell you how to
> easily fix the charset problem--as Martin alludes to below. In fact, I
> should have done this, because I had the same problem when I first started
> validating my pages, and thus knew the answer.
> 
> What you should do (whether or not you decide to continue to use the validator),
> is to add this line to your Apache config:
> 
> AddDefaultCharset utf-8
> 
> Note that this can usually be done in an .htaccess file, if you don't have
> permission to change the main server configuration file. If you need a
> little finer control, namely the ability to specify different character sets
> for different pages, then you can use something like this:
> 
> AddType text/html;charset=utf-8 .html
> AddType text/html;charset=iso-8859-1 .htm
> AddType text/html;charset=ISO646-DE .de
> 
> This would allow you to have your ".html" pages sent as UTF-8, your ".htm"
> pages sent as ISO Latin-1, and your ".de" pages sent with the ISO German
> character set, for instance.
> 
> > At 05:49 01/12/03 -0500, Frank Tiggelaar wrote:
> > >Over the past year we have taken great care to validate all new pages
> > >and pages on our site which were changed in any way. We added the small
> > >W3C logo to all of the pages we validated. Recently we found out that
> > >none of the pages which validated some time ago are validated today -
> > >suddenly 'character encoding' has become required.
> >
> > >Therefore we stopped validating our pages; we shall remove all 7,000
> > >little W3c-validated logos from our websites.
> >
> > It would have been much easier to add a line or so of directives
> > to your Apache server setup. And that would also have improved
> > worldwide access to and readability of your site.
> >
Received on Monday, 10 December 2001 17:23:44 GMT
