Re: Question about HTTP redirects

From: olivier Thereaux <ot@w3.org>
Date: Thu, 20 Oct 2005 08:44:44 +0900
Message-Id: <6453c4142542131da3282edcd925a4b9@w3.org>
Cc: "'www-qa@w3.org'" <www-qa@w3.org>
To: "Watson, Robert C" <Robert.C.Watson@scottforesman.com>

Hello Robert,

Lots of ideas in your mail, and a great basis for discussion, thank you 
for sending this in.

On Oct 20, 2005, at 6:05, Watson, Robert C wrote:
> My question is, how can sites with hundreds of changing URLs (read:
> "Marketing making them up") be manageable using HTTP redirects?

This is the first thing I'd like to react to, and I think it's really 
the crux of the issue.

I worked in a factory before, and I am sure that if marketing people 
had come to the factory saying "Guys guys, your product reference 
numbers are so un-cool. Didn't you hear? odd numbers are the new black. 
You really have to change all the reference codes for our products!", 
they would have been laughed at and kicked out. Why? Because product 
reference codes are in the domain of production and logistics, and that 
is off-limits for marketing. So if the marketing department decides to 
rename the "iNod pipo 1.8G yellow" into "iAmSoCool 2G gold", no 
problem, but as a reference code, that product is, and will always 
remain "MA099J/A".

Same goes for Web pages. The title, text, design can change, but there 
aren't a lot of good reasons to change a URI.

Of course the Web is a rather new discipline, tied to information and 
communication, and it's logically often attached to a marketing 
department. But some aspects of web management are more about 
information management and logistics than PR, and as such, they should 
be off-limits.

> The page http://www.w3.org/QA/Tips/reback recommends using an HTTP 
> redirect.
> However, we have found it necessary to provide "friendly" redirection.
> Here's what we do:
>
> User encounters missing page
> Server throws 404 status, which is set to go to a custom 404.cfm page
> 404.cfm gets referrer URL from CGI variables and forwards to friendly
> support page using HTTP meta refresh tag (can't use javascripts
> because of popup blockers and an annoying XP SP2 bug)
> Support page uses referrer URL as a key to look up the replacement
> URL for the missing content
> Support page forwards the user to the correct page

There are good ideas in here, and others that I do not understand.

* the "forwarding" via meta tag is really unnecessary here. Indeed, 
javascript isn't a good idea for a redirect, but every decent web 
library I know (e.g. header() in PHP) has a way to send a standard 
HTTP redirect, without any need for meta refresh.

* I think having the 404s handled by a script that tries some analysis 
and heuristics, and then suggests "maybe you are looking for this or 
that", is a great idea.

* ... however, using the "smart" 404 to take care of redirects from an 
old URI to a new one is overkill and unnecessary.

> It seems to me that the only sane solution is a relational database
> that a human (non-technical admin) can edit through a web interface.

It doesn't have to be relational (the mapping is really trivial and 
one-way, so even a plain text file would do), but a database mapping 
old URIs to new ones is a good idea! However, there's no need to go 
through a 404 and several forwards. Wouldn't it be simpler to make the 
mapping understandable by the Web server and let it handle the 
redirects?

See for instance the "rewritemaps" in Apache:
http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritemap
which can even be made very fast and efficient with dbm hash maps:
http://www.websiteoptimization.com/speed/tweak/abbreviate/
I assume it would be trivial to make a script as a front-end to edit 
the hash map.
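
As a rough sketch of what such a front-end script could do behind its 
web form, here is the plain text variant of the map: one "old-path 
new-uri" pair per line, with "#" starting a comment. That layout is 
assumed to match what a text-type rewrite map expects; the function 
names and the file layout are made up for illustration, and converting 
the file to a dbm hash is left out.

```python
from pathlib import Path


def load_map(path):
    """Read 'old-path new-uri' pairs, one per line; '#' lines are comments."""
    mapping = {}
    map_file = Path(path)
    if not map_file.exists():
        return mapping
    for line in map_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            mapping[parts[0]] = parts[1].strip()
    return mapping


def save_map(path, mapping):
    """Write the mapping back, sorted by old path, one pair per line."""
    lines = [f"{old} {new}" for old, new in sorted(mapping.items())]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def set_redirect(path, old, new):
    """Add or update one entry; whitespace would break the file format."""
    if not old or not new or any(ch.isspace() for ch in old + new):
        raise ValueError("old and new must be non-empty and contain no whitespace")
    mapping = load_map(path)
    mapping[old] = new
    save_map(path, mapping)
    return mapping
```

A non-technical admin would never see this file; the web form would 
just call something like set_redirect("redirects.txt", "/promo", 
"/products/") on submit.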

Don't want to feed your redirect map to Apache? No problem.
Assuming you have already set up the smart 404 handler script, make it 
act like this:
* check the requested URI
   * if it matches an entry in the redirection map, send a 301, 302 or 
307 HTTP code and redirect to the new URI
   * if not, send a 404 HTTP code, and try some analysis (spell 
checking, etc.) to suggest potential candidates in the error page body

That's it. No javascript or html meta refresh funkiness, just clean, 
straightforward HTTP.
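
The handler logic above could be sketched roughly as follows. This is 
only an illustration under assumptions: the redirect map and the list 
of known paths are hypothetical sample data, the "analysis" is reduced 
to fuzzy string matching with Python's difflib, and the function 
returns a (status, headers, body) tuple that the surrounding server 
code would still have to send.

```python
import difflib

# Hypothetical sample data; a real site would load these from its own map.
REDIRECTS = {"/old/catalog.html": "/products/"}
KNOWN_PATHS = ["/products/", "/support/", "/about/"]


def handle_missing(path, redirects, known_paths):
    """Return (status, headers, body) for a path the server could not find."""
    target = redirects.get(path)
    if target is not None:
        # Known old URI: answer with a permanent redirect and nothing else.
        return 301, {"Location": target}, ""

    # Unknown URI: a real 404, with suggestions in the error page body.
    suggestions = difflib.get_close_matches(path, known_paths, n=3, cutoff=0.6)
    body = "Not found: " + path
    if suggestions:
        body += "\nMaybe you are looking for:\n" + "\n".join(suggestions)
    return 404, {"Content-Type": "text/plain; charset=utf-8"}, body
```

With the sample data, "/old/catalog.html" gets a 301 to "/products/", 
while a mistyped "/suport/" gets a 404 whose body suggests "/support/". 
Use 302 or 307 instead of 301 when the move is only temporary.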

Thanks
-- 
olivier
Received on Wednesday, 19 October 2005 23:44:59 UTC