Re: pass URL instead of filename? from iain truskett on 1999-05-13 (html-tidy@w3.org from April to June 1999)

From: iain truskett <koschei@eh.org>
Date: Fri, 14 May 1999 05:45:21 +1000
To: html-tidy@w3.org
Message-ID: <b787b60149.ict@brucehall79.anu.edu.au>

Fred Leeflang <fredl@dutchie.org> wrote on 14 May:
[...]
> Why not have html-tidy access a URL instead of a file? This will make
> sure that the webserver evaluates all SSI stuff (and asp stuff, and
> other stuff) and checks the *result* that the webserver will give to
> the browser.

Oddly enough, the port of tidy to Acorn RISC OS does actually do this
(albeit in a highly platform-specific manner). What it basically does
is replace the various fopen, fgets, fclose etc. functions with its own
set, which check whether the filename given is of the form
"url:http://www.some.site/blah.blah" (i.e. a URL with "url:" at the
start). If it isn't, they call the standard C functions; otherwise they
call their own code for getting URL data in and out.
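
For anyone wondering what that dispatch looks like, here is a minimal
sketch in C. It is not the RISC OS port's actual code: the names
tidy_fopen, url_part and fetch_url are made up for illustration, and
fetch_url is only a stub standing in for whatever platform-specific
network code a port would supply.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if name starts with "url:", return a pointer to
   the URL that follows the prefix; otherwise return NULL. */
const char *url_part(const char *name)
{
    static const char prefix[] = "url:";
    size_t n = sizeof prefix - 1;   /* length without the trailing NUL */

    return strncmp(name, prefix, n) == 0 ? name + n : NULL;
}

/* Hypothetical stub: a real port would fetch the URL with its own
   network code and hand back a readable stream. This one always fails. */
FILE *fetch_url(const char *url)
{
    (void)url;
    return NULL;
}

/* Hypothetical replacement for fopen: plain filenames go straight to
   the standard C library, "url:" names go to the fetch routine. */
FILE *tidy_fopen(const char *name, const char *mode)
{
    const char *url = url_part(name);

    if (url == NULL)
        return fopen(name, mode);
    return fetch_url(url);
}
```

The same check would sit in front of the read and close wrappers too, so
the rest of tidy never needs to know where its input came from.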

The sources to that particular port, which may or may not be of use as
ideas for implementing it on other platforms, will be available from my
website around Sunday (after I've tidied bits of it up).


You could, of course, always do (if using unix/linux):

[ict@sable]:# lynx -source http://koschei.shada.com/ | tidy

which would mean that lynx worries about which URL method to use and
all that guff. It keeps the tidy binaries as simple as possible and
calls in the more specialised program to deal with what it's good at.
Rule #1 of programming: keep it simple.
Rule #2: if something else can do it, let it. =)


There's also a program known as curl which makes obtaining webpages
easy (and has ports to various unix/linux systems and also win32). It
should, in theory, be possible to patch it into tidy, if so desired.

It's available from <http://www.fts.frontec.se/~dast/curl/>


cheers,

-- 
iain, aka koschei                           <http://koschei.shada.com/>
Famous last RPG words, number 502 -
     "Don't unplug it, it will just take a moment to fix."

Received on Thursday, 13 May 1999 15:45:45 UTC