Re: Date verification in HTML pages

> Is there a credible way of verifying this date or if not could it be
> enforced by the consortium in future HTML versions?

No and no.  

The modification date is actually part of the metadata.  For local HTML
resources it is obtained from the filesystem; for typical internet
fetches it comes from HTTP (and is therefore not a W3C issue).  However,
most pages fetched from commercial sites these days are created on the
fly and therefore don't have a meaningful modification date, as it would
be the same as the Date: header.
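
As a rough illustration of the two sources mentioned above, here is a
minimal Python sketch.  It formats a file's modification time as an HTTP
date, and it guesses whether a response's Last-Modified header says
anything beyond what Date already says.  The function names and the
one-second tolerance are my own assumptions for illustration, not part
of any specification.

```python
import os
from email.utils import formatdate, parsedate_to_datetime


def http_date_from_file(path):
    """Return the file's modification time formatted as an HTTP date (GMT)."""
    return formatdate(os.path.getmtime(path), usegmt=True)


def has_useful_last_modified(headers, tolerance_seconds=1):
    """Guess whether Last-Modified carries more information than Date.

    `headers` is a plain dict of response headers with exact-case keys.
    The tolerance is an arbitrary choice for this sketch.
    """
    last_modified = headers.get("Last-Modified")
    if last_modified is None:
        # Typical for pages generated on the fly.
        return False
    date = headers.get("Date")
    if date is None:
        return True
    delta = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    return abs(delta.total_seconds()) > tolerance_seconds
```

A page whose Last-Modified equals its Date was almost certainly stamped
at generation time, so the value tells the reader nothing about the age
of the content.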

The reasons for creating on the fly tend to be commercial rather than
related to the information payload that the user really wants, e.g.
defeating caches to get better access statistics (often self-delusion),
or changing and customising advertising on each access.  Another factor
is that a convention has developed of sending navigation and branding
information along with the actual resource required, rather than simply
linking to it.

Many of these problems could be addressed by more sophisticated use of
cache control parameters and by having server-side includes and more
general CGI processing synthesize a Last-Modified header based on the
real content, but there is very little commercial incentive for
webmasters to learn how to do this.  Any attempt by standards
organisations to make this mandatory will simply be ignored.
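
To make the idea of synthesizing the header concrete, here is a minimal
Python sketch, assuming the page generator knows the modification times
of the parts that carry real content (and deliberately leaves out
advertising and navigation).  The function names and the calling
convention are hypothetical; a real CGI script or server would need to
wire this into its own request handling.

```python
from email.utils import formatdate, parsedate_to_datetime


def synthesize_last_modified(content_mtimes):
    """Newest modification time (whole seconds since the epoch) among
    the parts of the page that carry real content."""
    return int(max(content_mtimes))


def respond(content_mtimes, if_modified_since=None):
    """Return (status, headers) for a dynamically assembled page.

    `if_modified_since` is the raw value of the request header, if any.
    """
    last_modified = synthesize_last_modified(content_mtimes)
    headers = {"Last-Modified": formatdate(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: ignore it and send the page
        if since is not None and last_modified <= since:
            # Real content unchanged: the cached copy is still good.
            return 304, headers
    return 200, headers
```

With this in place a cache can revalidate the page cheaply, which is
exactly what most commercial sites do not want.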

For most webmasters, the prime directive is to break most of HTTP 1.1
by frustrating any attempt to cache, so they really have no incentive
to provide correct modification date metadata.

Although this is really an IETF issue, not a W3C one, one could try
to remove the tight coupling with caching by introducing a primary
content modification date that is separate from the overall page
modification date.  However, for the supplier the primary content is
often the advertising, so this is unlikely to be used except by people
who are already providing useful modification date information.

One could also define a metadata profile for including this information
in meta elements, but with the same social engineering problems.
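
For illustration only, here is a minimal Python sketch of how a client
might read such a date from a meta element.  The name "DCTERMS.modified"
follows the Dublin Core convention for meta elements and is used here
purely as an assumption; no HTML specification requires it, and the
class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ModifiedDateFinder(HTMLParser):
    """Collect the content of the first matching meta element."""

    def __init__(self, meta_name="DCTERMS.modified"):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.modified = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.modified is not None:
            return
        attributes = dict(attrs)
        # Attribute values may be None when the attribute has no value.
        if (attributes.get("name") or "").lower() == self.meta_name:
            self.modified = attributes.get("content")


def find_modified_date(html_text, meta_name="DCTERMS.modified"):
    """Return the declared modification date string, or None if absent."""
    finder = ModifiedDateFinder(meta_name)
    finder.feed(html_text)
    return finder.modified
```

Nothing here verifies the value; the date is whatever the author chose
to write, which is the social engineering problem again.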

Modification dates are also lost when pages are reloaded onto the
server during a site rebuild.  In at least one case, where there was no
reason to defeat caching, they were lost because the content provider
maintained the site offline and re-FTPed it to the server every week to
make updates.

Received on Wednesday, 12 October 2005 06:54:52 UTC