- From: Mark Nottingham <mnot@mnot.net>
- Date: Fri, 17 Sep 1999 01:45:50 +1000
- To: Julien Pierre <jpierre@netscape.com>
- Cc: http-wg@hplb.hpl.hp.com
I was just paging through previous e-mail to prepare draft-01, and came across this, which I don't think I responded to.

[me]
> > If you have a PDF file that happens to be generated by a CGI script, it's a huge pain to get it to support ranges, validation and other facilities in the script itself. I've done it, and it was not pleasant.

[JP]
> What you suggest when requesting the server implementor to do this job is that the server has to regenerate the whole dynamic page just to get a few bytes. This may involve rerunning the application. Even if your CGI sends exactly the same content with the same last-modified date, so that the client still thinks it's the same content between HTTP requests, the PDF plug-in might make 10 different HTTP requests in a row with various ranges. The server executes the script 10 times, gets the full PDF content from the script, and then only sends the requested bytes. Technically, it works, but it's extremely inefficient.
>
> On the other hand, if you let the CGI itself handle the range request, then it's not so bad: the CGI will try to generate those requested bytes itself and won't waste memory or CPU trying to generate and send the entire thing.

I agree to some extent. I need to make it clearer that the draft is about defaults; in this case, if you have a more efficient way of handling range requests in the CGI, have it generate an 'Accept-Ranges: bytes' header, and the server should know that the application is capable of dealing with partial content. Best of both worlds.

Even if this isn't taken advantage of, IMHO it's much easier to scale a server (hardware) than the complete network between any possible user and the server. Regenerating the entire object to send a few bytes is inefficient on the server, but it's better than leaving it in the publisher's hands, since in most cases it won't be done at all.

I'd very much like a survey to be done of Webmasters, CGI scripters, etc., to ask them where they think the responsibility for handling these features currently lies.

> Ranges work best with direct access content (eg: static files) or with cacheable content; with dynamic content, the server typically only has sequential access to the content and does not cache it.

I know what you're getting at, but I really want to redefine what people think of as dynamic content, as well as cacheability.

Dynamic content, to me, is defined by dependence on either the identity (however derived) of the current user, or some other external source of content entropy that causes two hits to the same request at the same time to generate different content. That's it; it has nothing to do with the presence of the string cgi-bin, a query string or anything else.

In my mind, cacheability is two very separate properties: an assigned TTL (through Expires, Cache-Control, or assumed through Last-Modified) and the ability to validate through a conditional request. In the future, it may expand to include such methods as the ability to delta-encode.

> That's not to say the HTTP server shouldn't do anything. I think it makes sense for it to transparently do chunking on CGI and plug-ins output, and the way we are doing it now works very nicely (though I wish we could use our own browser to test this feature :)).

*grin* I know that feeling...

--
Mark Nottingham
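(Editorial sketch, for illustration only: the "best of both worlds" arrangement described above, written as a modern Python CGI. Nothing here is from the original thread; the `generate_pdf` helper and the simplified Range parsing are hypothetical, and suffix and multi-part ranges are omitted.)

```python
#!/usr/bin/env python3
# Hypothetical sketch: a CGI that advertises 'Accept-Ranges: bytes' itself and
# serves only the requested bytes, so the server need not buffer the full
# output and slice it just to satisfy a partial request.
import os
import sys


def generate_pdf() -> bytes:
    # Placeholder: build or fetch the full PDF body.
    return b"%PDF-1.4 ... (document bytes) ..."


def parse_simple_range(value: str, length: int):
    """Parse a single 'bytes=start-end' Range header; return (start, end) or None.
    Suffix ranges ('bytes=-N') and multi-part ranges are left out for brevity."""
    if not value.startswith("bytes=") or "," in value:
        return None
    start_s, sep, end_s = value[len("bytes="):].partition("-")
    if not sep or not start_s.isdigit():
        return None
    start = int(start_s)
    end = int(end_s) if end_s.isdigit() else length - 1
    if start >= length or start > end:
        return None
    return start, min(end, length - 1)


body = generate_pdf()
rng = parse_simple_range(os.environ.get("HTTP_RANGE", ""), len(body))

if rng:
    start, end = rng
    part = body[start:end + 1]
    headers = (
        "Status: 206 Partial Content\r\n"
        "Content-Type: application/pdf\r\n"
        "Accept-Ranges: bytes\r\n"
        f"Content-Range: bytes {start}-{end}/{len(body)}\r\n"
        f"Content-Length: {len(part)}\r\n\r\n"
    )
    sys.stdout.buffer.write(headers.encode("ascii") + part)
else:
    headers = (
        "Status: 200 OK\r\n"
        "Content-Type: application/pdf\r\n"
        "Accept-Ranges: bytes\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    )
    sys.stdout.buffer.write(headers.encode("ascii") + body)
```

A server that sees the script's own 'Accept-Ranges: bytes' header can then pass Range requests straight through instead of buffering the full output and slicing it itself.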
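(A second editorial sketch, under the same assumptions, for the validation half of cacheability: the CGI emits Last-Modified and answers If-Modified-Since with 304, so a cache can revalidate without the script regenerating and resending the whole body. The LAST_MODIFIED value and the placeholder content are hypothetical.)

```python
#!/usr/bin/env python3
# Hypothetical sketch: a CGI that supports validation via If-Modified-Since,
# returning 304 Not Modified when the cached copy is still current.
import os
import sys
from email.utils import formatdate, parsedate_to_datetime

# Placeholder: the time the underlying data last changed (epoch seconds).
LAST_MODIFIED = 937500000


def http_date(ts: float) -> str:
    # RFC-style HTTP date in GMT.
    return formatdate(ts, usegmt=True)


ims = os.environ.get("HTTP_IF_MODIFIED_SINCE")
if ims:
    try:
        if parsedate_to_datetime(ims).timestamp() >= LAST_MODIFIED:
            sys.stdout.write(
                "Status: 304 Not Modified\r\n"
                f"Last-Modified: {http_date(LAST_MODIFIED)}\r\n\r\n"
            )
            sys.exit(0)
    except (TypeError, ValueError):
        pass  # Unparseable date: fall through and send the full response.

body = b"%PDF-1.4 ... (document bytes) ..."  # placeholder content
headers = (
    "Status: 200 OK\r\n"
    "Content-Type: application/pdf\r\n"
    f"Last-Modified: {http_date(LAST_MODIFIED)}\r\n"
    f"Content-Length: {len(body)}\r\n\r\n"
)
sys.stdout.buffer.write(headers.encode("ascii") + body)
```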
Received on Saturday, 18 September 1999 11:31:27 UTC