W3C home > Mailing lists > Public > ietf-http-wg-old@w3.org > September to December 1999

Re: Server-side roles in the HTTP

From: Julien Pierre <jpierre@netscape.com>
Date: Wed, 01 Sep 1999 15:38:59 -0700
Message-ID: <37CDAB03.A9DC323E@netscape.com>
To: Mark Nottingham <mnot@mnot.net>
CC: http-wg@hplb.hpl.hp.com
Hi,

Mark Nottingham wrote:

> Good point; I'm fairly ignorant of the inner workings of servlets as well as their newer cousins like JSP. They very well may fall out of scope for this
> proposal. However, I think that would be a shame; even people with the knowledge to use these facilities rarely implement anything beyond the very
> basics of the protocol.
>
> There's nothing in the draft (although this isn't articulated well) that says mechanisms can't be established to override these server-level responsibilities.

When you have MUST requirements, that seems to preclude them somewhat.


> What I'm trying to do is give the people I called 'publishers' the assurance that, if they use a server and generator compliant with this draft, they won't
> need to worry about details of protocol implementation -- not that they do (that's the root of the problem).

> Whether the server or the 'content generators' is the best place to do that is an interesting question; my feeling is that it's simpler and more reliable to do
> it in the server.

It's true that most applications that generate content don't deal with HTTP/1.1 features like range requests, so it makes sense to have a server mechanism
for handling them transparently.

But it should still be possible to defer responsibility for these features to the application, if only to maintain compatibility with current
applications. Sometimes that is actually the intended effect: you may want the application to respond differently to byte ranges, or to other range types,
than the server implementor might expect.

> It does become a nightmare if it is applied to everything that can be tied into a server. It may be prudent to draw a line at APIs, or perhaps there needs to
> be a further distinction in the draft -
> * content generators (CGI, PHP, ASP, etc) - have a 'shallow', high-level view into the server, server is required to handle all protocol features (as in current
> draft)
> * APIs et al (Servlets, ISAPI, NSAPI, etc) - low-level 'deep' view, api user has library of protocol feature calls available
>
> Would this be more practical?

Yes, I think it makes sense to make this distinction.

> > As far as your draft, I think there should be fewer "MUST" requirements.
>
> It depends on how you place the draft. If you think of it as a peer document to 2616 (you must be compliant with this draft to be considered HTTP
> compliant), you're right - it's far too strict. However, if you consider it as an additional, higher level of compliance, the MUSTs are important.

Only if they truly apply to all resources in a generic way, and I don't think any requirement in your draft did.

> I do think that some of the SHOULDs (especially to do with buffering and *perhaps* synthetic validation) might be better as MAYs.

Actually, buffering is one of the main advantages of 1.1, and I believe it should remain a SHOULD. But I agree that synthetic validation should be a MAY.
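As a rough illustration of why synthetic validation is costly (this is my own sketch, not anything from the draft; the helper names are invented):

```python
import hashlib

def synthetic_etag(body):
    # Hash the fully generated entity body to manufacture a strong
    # validator; note this forces the server to buffer the whole
    # response before the headers can even be sent.
    return '"%s"' % hashlib.md5(body).hexdigest()

def not_modified(request_headers, body):
    # True when the client's cached copy (If-None-Match) still matches,
    # so the server could answer 304 without resending the body.
    return request_headers.get("If-None-Match") == synthetic_etag(body)
```

The catch is visible in the first function: the entire dynamic response must exist before the validator does, which is exactly the kind of burden that argues for a MAY rather than a MUST.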

> > For instance, the partial content requirement won't necessarily fly with all
> > types of content generators. According to RFC 2616 section 14.5, a server is not
> > required to serve ranges for all resources - it can send "Accept-Ranges: none". That
> > doesn't preclude it from supporting ranges on other resources, such as
> > static files. I like this flexibility.
>
> But from a protocol/network/user perspective, does it make any sense that a feature is available for one resource but not another, based on the
> underlying technology?

The problem is that the underlying server technology is not monolithic. There isn't a clear line between the daemon and what you call content generators.
If you look at the way that large web sites are implemented today, it can be pretty complex. You always have an HTTP daemon on the front-end, then possibly a
plug-in running in the daemon process, which in turn may communicate with an application server on a remote machine, itself accessing corporate data on other
servers.

In such a case, I would argue that the content generator is the application server. But the HTTP daemon cannot handle all aspects of the protocol in such a
complex environment; instead, the plug-in would have to deal with some of them.

>If you have a PDF file that happens to be generated by a CGI script, it's a huge pain to get it to support ranges, validation and
>other facilities in the script itself. I've done it, and it was not pleasant.

What you suggest, when requiring the server implementor to do this job, is that the server has to regenerate the whole dynamic page just to send a few bytes. This
may involve rerunning the application. Even if your CGI sends exactly the same content with the same last-modified date, so that the client still thinks it's the
same content between HTTP requests, the PDF plug-in might make 10 different HTTP requests in a row with various ranges. The server then executes the script 10
times, gets the full PDF content from the script each time, and only sends the requested bytes. Technically it works, but it's extremely inefficient.
On the other hand, if you let the CGI itself handle the range request, it's not so bad: the CGI will generate just the requested bytes and won't waste memory
or CPU generating and sending the entire thing.
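To make this concrete, here is a minimal sketch of a CGI-style script honouring a single byte range itself (the function names and the simplified Range parsing are my own; a real script would handle multi-range and suffix-range forms too):

```python
import re

def respond(generate_slice, total_length, environ):
    # Honour a single "bytes=start-end" Range header in the script
    # itself, so only the requested slice is ever generated -- the
    # server never has to buffer and discard the rest of the entity.
    m = re.match(r"bytes=(\d+)-(\d*)$", environ.get("HTTP_RANGE", ""))
    if m:
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else total_length - 1
        head = ("Status: 206 Partial Content\r\n"
                "Content-Range: bytes %d-%d/%d\r\n\r\n"
                % (start, end, total_length))
        return head + generate_slice(start, end)
    # No Range header: generate and send the whole entity.
    return "Status: 200 OK\r\n\r\n" + generate_slice(0, total_length - 1)
```

The key point is the `generate_slice` callback: only the application knows how to produce bytes 100-199 of its output without producing bytes 0-99 first, which is why deferring this to the application can be a win.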

Ranges work best with direct-access content (e.g. static files) or with cacheable content; with dynamic content, the server typically only has sequential access to
the content and does not cache it.

That's not to say the HTTP server shouldn't do anything. I think it makes sense for it to transparently apply chunking to CGI and plug-in output, and the way we
are doing it now works very nicely (though I wish we could use our own browser to test this feature :)).

> A lot of the feedback that I've gotten has been in this vein - that not every object is going to take advantage of all of these 'extra' services. But, if they
> aren't available by default, chances are they'll never be thought of, much less handled.

Some features are the responsibility of the application; not all of them belong to the server.

> The other way that I'm working on attacking this problem is creating a series of wrapper scripts / libraries that will take care of as much of this stuff as
> possible automatically. Unfortunately, it's very much an uphill battle (for instance, if I include a content-length in Netscape or Apache CGI, it will make the
> connection persistent; no such luck with other servers), so this would only be a temporary measure.

This won't be an issue in the long term at least for NES servers; with our post-4.0 release, you won't need to do any of this to get the full benefits of persistent
connections and chunking.
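As a hedged sketch of what such a wrapper amounts to (the helper name is invented here, and this is not Mark's actual library): buffer the body, emit a Content-Length, and servers that honour it can keep the connection open.

```python
def with_content_length(body, content_type="text/html"):
    # Buffer the generated body so a Content-Length header can be
    # emitted up front; servers that see it (e.g. Netscape, Apache)
    # can then keep the connection persistent for CGI responses.
    # (len() counts bytes correctly only for single-byte encodings.)
    return ("Content-Type: %s\r\n"
            "Content-Length: %d\r\n"
            "\r\n%s" % (content_type, len(body), body))
```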

--
for a good time, try kill -9 -1




Received on Wednesday, 1 September 1999 23:39:31 EDT

This archive was generated by hypermail pre-2.1.9 : Wednesday, 24 September 2003 06:33:33 EDT