Re: [Q] Anyone using path info?

From: Terje Bless (link@tss.no)
Date: Tue, Sep 07 1999


Message-Id: <199909070613.IAA14569@vals.intramed.rito.no>
Date: Tue,  7 Sep 1999 07:33:04 +0200
From: Terje Bless <link@tss.no>
To: W3C Validator <www-validator@w3.org>
Subject: Re: [Q] Anyone using path info?

On 05.09.99 at 02:46, Robert Szarka <szarka@downcity.net> wrote:

>At 08:05 AM 9/4/99 , Terje Bless wrote:
>>Does anyone actually use [path info]?
>[...]
>
>By "additional cruft" do you mean something like the following?
>
>http://validator.w3.org/check?uri=http://www.szarka.org/test/xhtml.html
>
>or is there *real* cruft that can go in there I don't know about?
>
>The usage above could/should be replaced with
>
>http://validator.w3.org/check/referer

No. When you fetch the URI to check the referer, you are not giving the CGI
program any parameters. You are, however, giving it extra path info. The CGI
program is "check"; the "/referer" bit is interpreted as an extra part of
the path and is made available to the CGI program.
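
To make the distinction concrete, here is a rough sketch (in Python, not the
validator's Perl) of how a server splits a request URL into script name,
path info and query string. The function name and the "/check" mount point
are assumptions made for illustration only; a real server does this for you
and hands the result to the program as SCRIPT_NAME, PATH_INFO and
QUERY_STRING.

```python
from urllib.parse import urlsplit


def split_cgi_request(url, script_name="/check"):
    """Split a URL roughly the way a CGI server would.

    script_name is where the program is assumed to be mounted
    (hypothetical; chosen to match the URLs in this thread).
    """
    parts = urlsplit(url)
    rest = parts.path[len(script_name):]
    # The path must address the script itself, optionally followed by "/...".
    if not parts.path.startswith(script_name) or (rest and not rest.startswith("/")):
        raise ValueError("URL does not address the script")
    return {
        "SCRIPT_NAME": script_name,
        "PATH_INFO": rest,             # everything after the script name
        "QUERY_STRING": parts.query,   # everything after the "?"
    }


referer = split_cgi_request("http://validator.w3.org/check/referer")
with_query = split_cgi_request(
    "http://validator.w3.org/check?uri=http://www.szarka.org/test/xhtml.html"
)
```

With these inputs, `referer` carries "/referer" as path info and an empty
query string, while `with_query` carries the "uri=..." pair as query string
and no path info at all.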

However, "check", as currently written, tries to interpret the extra path
info as if it were the actual parameters (those given after the "?" in the
URI above). If path info is present, it will even be chosen over the CGI
parameters.

    <URL:http://validator.w3.org/check/uri=localhost> (note "/" vs. "?")

This works fine with the old code base, where all you really need to deal
with it is a simple conditional. In that code base, all parameter handling
(parsing, splitting, entity en/de-coding, etc.) is done internally.
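
The conditional in question might look roughly like the following. This is
a Python sketch of the behaviour described above, not the actual validator
code; the function name is made up, and the standard library's parse_qs
stands in for the hand-written parsing.

```python
from urllib.parse import parse_qs


def effective_params(path_info, query_string):
    """Pick the parameter source the way the old code is described as
    doing: path info wins over the query string whenever it is present."""
    if path_info:
        raw = path_info.lstrip("/")   # "/uri=localhost" -> "uri=localhost"
    else:
        raw = query_string
    return parse_qs(raw)


# Path info overrides the query string when both are given.
both = effective_params("/uri=localhost", "uri=http://example.org/")
# With no path info, the query string is used as usual.
query_only = effective_params("", "uri=localhost")
```

Note that path info without a "=" (such as "/referer") yields no parameters
in this sketch, so that case would still need its own handling.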

One of the great advantages of switching to CGI.pm is that it takes care of
all that for you. And since it's a well tested and widely used library,
most of the bugs have been shaken out of it (as opposed to the validator
code, which has had far less review). There are also other reasons for
using CGI.pm (such as abstraction, modularization, code re-use and
maintainability).

The entire problem then lies in the fact that CGI.pm, quite correctly BTW,
does not treat path info as equivalent to the actual parameters.

When I first switched to CGI.pm, I could delete the whole block of code
that dealt with the parameters and replace it with a single line saying "my
$query = new CGI;". Then I needed to add back support for giving parameters
in path info and had to bring back all the old code. In effect I'd _added_
complexity and not taken it away.
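
For comparison, one way to keep the library in charge of parsing while
still accepting the old URLs would be a small shim that folds
parameter-like path info into the query string before the library sees it.
The sketch below is in Python and is purely illustrative: fold_path_info is
a hypothetical helper, the "contains an equals sign" test is an assumption
about what counts as parameter-like, and nothing here is taken from the
validator source.

```python
from urllib.parse import parse_qs


def fold_path_info(environ):
    """Return a copy of a CGI-style environment in which path info that
    looks like parameters has been moved into QUERY_STRING.

    Path info without "=" (e.g. "/referer") is left untouched so it can
    be handled as a special case elsewhere.
    """
    env = dict(environ)
    extra = env.get("PATH_INFO", "").lstrip("/")
    if "=" in extra:
        env["QUERY_STRING"] = extra   # path info takes precedence, as before
        env["PATH_INFO"] = ""
    return env


legacy = fold_path_info({"PATH_INFO": "/uri=localhost", "QUERY_STRING": ""})
referer = fold_path_info({"PATH_INFO": "/referer", "QUERY_STRING": ""})
legacy_params = parse_qs(legacy["QUERY_STRING"])
```

Whether a shim like this is worth carrying depends on the same question as
above: whether anyone actually relies on parameters in path info.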


Considering the advantages of using CGI.pm and the fact that the path info
support is IMO a "mis-feature", I wondered if anyone was actually using
this or if this support could be dropped offhand. It probably can (it's
broken on w3.org anyway!), but that's the sort of change you need to be a
bit careful about.


>on that particular page, actually, but I could see how the other approach
>would be useful for automating validation or using a handy list of pages
>to validate...

I'm not sure I understand what you mean. Neither of the two examples you
gave would be affected by this change (referer handling is a special case
that will still be handled).

>Arguably, someone that wants to automate validation should
>probably just run the validator on their own site.

I agree. One of my goals is to make the validator a little easier to run
locally. Of course, another is to make it automatically check multiple URIs
periodically. :-)


>I keep meaning to get around to setting it up for myself and my customers,
>so I guess if you improved the code it might encourage me to do it.  :)

I hope so.

-- 
*** I just switched to a new email client.
*** If you see any format problems in this message, yell. Loudly! :-)

                                             -link