Problems with content negotiation (was: Re: Preemptive and reactive content negotiation)

Brian Behlendorf:
>On Wed, 6 Sep 1995, Koen Holtman wrote:
>> I think preemptive content negotiation on *all* types a browser
>> supports is too costly already.
>
>Correct - but you don't need to do it for all, just the ones you would 
>prefer to see.  If a browser *can* do HTML 3.0, it says "I can do HTML 
>3.0", the server might come back and say "sorry (406), I only have HTML
>2.0 and a PDF version of this file", so the browser goes "okay, I'll take 
>the HTML 2.0".

Your example does not exactly fit the situation I had in mind, so let
me try to give a more precise scenario.

- My browser supports 100 MIME types, 
   - including text/html and text/x-dvi 
        (.dvi files are handled with an external viewer)
   - but excluding text/x-tex.

- I don't want to send a huge accept header from my browser, so I have
  selected 3 mime types to include.  I will send text/html in the
  Accept header, but not text/x-dvi.

- Server bla.com has, under URL http://bla.com/report, 
    1. a text/x-tex file with quality factor 1
    2. a text/x-dvi file with quality factor 0.9
    3. a text/html file with quality factor 0.2
  (file 3 is a barely adequate version of file 1 made with a
  tex-to-html converter)

Now, I will naturally want content negotiation to give me file 2, not
file 3, and certainly not 1.  Is there a guarantee that I will
get the correct file, 2, with my accept header not including
text/x-dvi?

In other words, can I construct my Accept header in such a way that
bla.com is guaranteed to give me a 300 or 406 response, so that my
browser can select the text/x-dvi file?

I've done some more reading on the negotiation sections of the spec,
so let me try to answer this question myself.

Say that my browser sends the following Accept header:

 Accept: text/html, image/gif, image/jpeg;q=0.7, */*;q=0.9

Based on my reading of the spec, the server would, with this header in
a GET request for http://bla.com/report, calculate the following
quality table:

  variant I have   quality(Q)
 1  text/x-tex      0.9
 2  text/x-dvi      0.81
 3  text/html       0.2
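
A rough sketch in Python of how I arrive at this table.  It is not
taken from the spec: the helper names (parse_accept, client_q,
quality_table) are made up for illustration, and Q is simplified to
the server's quality factor times the client's q for the most specific
matching media range, ignoring any other factors the draft defines.

```python
def parse_accept(header):
    """Parse an Accept header value into {media_range: q}."""
    ranges = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # q defaults to 1 when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges[parts[0]] = q
    return ranges


def client_q(media_type, ranges):
    """Most specific match wins: type/subtype, then type/*, then */*."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0  # not acceptable at all


def quality_table(variants, accept):
    """Simplified Q: server quality factor times client q (assumption)."""
    ranges = parse_accept(accept)
    return {mt: round(qs * client_q(mt, ranges), 3)
            for mt, qs in variants.items()}


# The three variants behind http://bla.com/report and their quality factors.
VARIANTS = {"text/x-tex": 1.0, "text/x-dvi": 0.9, "text/html": 0.2}

table = quality_table(
    VARIANTS, "text/html, image/gif, image/jpeg;q=0.7, */*;q=0.9")
best = max(table, key=table.get)
print(table, best)
```

Under these assumptions the variant with the maximal Q is text/x-tex,
which only matched through the */* range.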

The spec says (quoting from draft-ietf-http-v10-spec-01.txt):

   The variants with a maximal value for the Q function represent the
   preferred representation(s) of the entity; those with a Q values
   less than the maximal value are therefore excluded from further
   consideration.

This would make text/x-tex, with Q=0.9, the preferred representation,
with text/x-dvi and text/html being excluded from consideration.  But
this would make the server send me text/x-tex, and my browser
can't even handle text/x-tex!!

So the spec has a big problem here.  Basically, if I send a `*' in an
accept header, I run the risk of getting a MIME format that is not
supported by my client.

To solve this problem, we can put an extra column in the variants
table:

  variant I have  quality(Q)  supported by client?
 1  text/x-tex     0.9         maybe  (derived from */*)
 2  text/x-dvi     0.81        maybe  (derived from */*)
 3  text/html      0.2         yes    (derived from text/html)

and add a new rule that if the maximum Q has a `maybe', then the
server should send a 300 response (do reactive content negotiation).

Only in a situation

   variant I have         quality(Q)  supported by client?
    text/html               0.5          yes   (derived from text/html)
    application/postscript  0.5          maybe (derived from */*)
    application/msword      0.4          maybe (derived from */*)

where there is a definite `yes' on one of the maximum-Q variants, it
would be allowed for a server to send that variant directly instead of
doing reactive negotiation.
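
The proposed rule could be sketched in Python roughly as follows.
This is only an illustration of the idea: the function name decide and
the (status, media type) return value are my own invention, and the
rows are simply the two tables above typed in by hand.

```python
def decide(rows):
    """Apply the proposed rule to a variants table.

    rows: list of (media_type, Q, support), where support is "yes" if the
    match came from an explicitly listed type and "maybe" if it came from
    a wildcard such as */*.
    Returns (status, media_type); media_type is None for 300 and 406.
    """
    top = max(q for _, q, _ in rows)
    if top <= 0:
        return (406, None)          # nothing acceptable at all
    for media_type, q, support in rows:
        if q == top and support == "yes":
            return (200, media_type)  # a definite `yes' at maximum Q
    return (300, None)              # maximum Q is only a `maybe'


first_table = [
    ("text/x-tex", 0.9, "maybe"),
    ("text/x-dvi", 0.81, "maybe"),
    ("text/html", 0.2, "yes"),
]
second_table = [
    ("text/html", 0.5, "yes"),
    ("application/postscript", 0.5, "maybe"),
    ("application/msword", 0.4, "maybe"),
]

print(decide(first_table))   # reactive negotiation
print(decide(second_table))  # text/html may be sent directly
```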

Now, you may say: there is no need to make the above fix to the spec:
if putting `*' in the Accept header causes problems, just never put
`*' in the accept header, use

 Accept: text/html, image/gif, image/jpeg;q=0.7

instead.  But this yields the quality table

  variant I have   quality(Q)
 1  text/x-tex      0.0
 2  text/x-dvi      0.0
 3  text/html       0.2 ,

and the server would send me the crummy text/html version instead of a
300 response allowing my browser to choose the better text/x-dvi.

So *-free accept headers only prevent me from getting something
unsupported, they do not ensure that the negotiation process gives me
the best variant.


A third option would be to send

 Accept: x-this_is_a_mime_type/x-nobody_ever_uses

forcing a 406 response (reactive negotiation) on every URI.  But
reactive negotiation is costly: it requires an extra request-response
round trip.  Doing it on every URI is out of the question.


A fourth option would be to send a huge, *-free, Accept header listing
all 100 mime types my browser supports.  This will ensure that I
always get the best response, but, like the third option, doing this
will have a severe performance impact.


>  The browser should decide where that tradeoff point is 
>for itself - the tradeoff between being conservative in what types you 
>declare you can accept, and the overhead of getting a 406.

Unless I have made a mistake above, it seems that the spec does not
allow any gradual shift in tradeoff points.  You have four points in
the tradeoff space:

- send small Accept with *: risk getting responses with mime types you
  can't handle
- send small Accept without *: risk getting responses with inferior quality
- send small Accept with unused mime type: no risk, but slow because of 
  an extra request-response round trip
- send huge Accept without *: no risk, but slow because of huge
  request message size.

None of these is very attractive.  This problem can be, and should
be, fixed.
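
To make the four points concrete, here is a small Python sketch that
runs each kind of Accept header against the bla.com variants.  The
assumptions are mine: Q is simplified to quality factor times client
q, the server answers 406 when every variant scores 0, and the 98
extra types in the huge header are placeholder names, not real MIME
types.

```python
def q_for(media_type, ranges):
    """Client q for the most specific matching range, or 0 if none."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0


def respond(variants, ranges):
    """Return the media type sent, or 406 when every variant scores 0."""
    table = {mt: qs * q_for(mt, ranges) for mt, qs in variants.items()}
    best = max(table, key=table.get)
    return best if table[best] > 0 else 406


VARIANTS = {"text/x-tex": 1.0, "text/x-dvi": 0.9, "text/html": 0.2}

small = {"text/html": 1.0, "image/gif": 1.0, "image/jpeg": 0.7}
small_with_star = dict(small, **{"*/*": 0.9})
unused_type = {"x-this_is_a_mime_type/x-nobody_ever_uses": 1.0}
# 100 supported types: text/html, text/x-dvi and 98 placeholders.
huge = {t: 1.0 for t in ["text/html", "text/x-dvi"]
        + ["application/x-placeholder-%d" % i for i in range(98)]}

outcomes = {
    "small with *": respond(VARIANTS, small_with_star),
    "small without *": respond(VARIANTS, small),
    "unused type": respond(VARIANTS, unused_type),
    "huge without *": respond(VARIANTS, huge),
}
for name, result in outcomes.items():
    print(name, "->", result)

# Rough size comparison of the header values (q parameters omitted).
small_len = len(", ".join(small))
huge_len = len(", ".join(huge))
print(small_len, huge_len)
```

Only the unused-type header and the huge header avoid both a wrong
type and an inferior variant, and each pays for it, with an extra
round trip in one case and a much longer request in the other.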

Note that this negotiation problem is orthogonal to the `Accept-*
header bloat' problem discussed in the `Color content negotiation'
thread.

>        Brian

Koen.

Received on Saturday, 9 September 1995 08:25:20 UTC