Re: Cache control in trailers?

There are special problems with scanning for viruses.  Truncating is not 
reliably safe.  This is because truncated content may still contain 
enough data to be dangerous, and browsers don't delete downloads on 
truncation (none of them did last time I checked anyway).  So if you 
want your virus to go through a scanning intermediary, just pad it out 
first so the payload after truncation is still virulent.

Browsers probably should fix their treatment of truncated content.  E.g. 
a connection close during a chunked download without the terminating 
0\r\n\r\n should raise some kind of flag?  Same with a connection close 
under C-L before the advertised number of bytes has arrived.  The 
pathological case is HTTP/1.0 mode with no C-L, where there's no way to 
indicate an abort.
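
To make the three framing cases concrete, here is a rough Python sketch of 
the check a client could run when the connection closes.  The function 
names and the tri-state return value are my own for illustration, not from 
any browser or library, and the chunk parser is deliberately minimal.

```python
from typing import Optional


def chunked_complete(raw: bytes) -> bool:
    """True only if the chunked body reached its last-chunk and final CRLF."""
    pos = 0
    while True:
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            return False  # chunk-size line was cut off
        # Ignore any chunk extensions after ';'
        size_token = raw[pos:eol].split(b";", 1)[0].strip()
        try:
            size = int(size_token, 16)
        except ValueError:
            return False
        if size < 0:
            return False
        pos = eol + 2
        if size == 0:
            # Last chunk seen: skip optional trailer fields up to the blank line
            while True:
                eol = raw.find(b"\r\n", pos)
                if eol == -1:
                    return False  # closed before the terminating blank line
                if eol == pos:
                    return True
                pos = eol + 2
        # Chunk data must be fully present and followed by CRLF
        if raw[pos + size:pos + size + 2] != b"\r\n":
            return False
        pos += size + 2


def body_complete(received: bytes, *, chunked: bool = False,
                  content_length: Optional[int] = None) -> Optional[bool]:
    """True/False when truncation is detectable, None when it is not."""
    if chunked:
        return chunked_complete(received)
    if content_length is not None:
        return len(received) >= content_length
    # Close-delimited body (HTTP/1.0 style, no C-L): an abort is
    # indistinguishable from a normal end of message.
    return None
```

A False result is the point at which a client could raise the flag (and 
arguably delete the partial download); None is the pathological case.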

Timeouts are the other big problem.  Client agents tend to time out if 
they send a request, and don't get a response in a timeframe they deem 
reasonable.  Scanning intermediaries have to pass something back through 
to prevent clients just retrying.  Spooling and scanning can take a LONG 
time, e.g. large archives, which need unpacking and scanning of 
individual contained files.  This was the impetus behind my I-D for 
using 1xx codes to indicate scanning progress.

There hasn't really been any support in this working group for properly 
supporting applications like content inspection.  HTTP really doesn't 
lend itself to this.  It could, of course.  Some kind of directive in 
trailers could be an interesting approach but, as you say, it would 
require the client to defer processing until the final directive.

I'd suggest using another 2xx code to indicate that processing should be 
deferred.  And support for this would need to be indicated by the 
client.
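
A minimal Python sketch of what deferred processing could look like on the 
client side, purely as an illustration.  The status value 299 is an 
arbitrary placeholder I picked for the example (no such code is defined for 
this purpose), and the class and method names are hypothetical.

```python
from typing import List

# Hypothetical placeholder for "another 2xx code"; not a defined status code.
DEFER_STATUS = 299


class DeferredBody:
    """Holds body data back until the final result is known."""

    def __init__(self, status: int) -> None:
        self.deferred = status == DEFER_STATUS
        self._held: List[bytes] = []      # buffered, not yet processed
        self.released: List[bytes] = []   # safe for the application to process

    def feed(self, chunk: bytes) -> None:
        # An ordinary 2xx streams straight through; the deferred code buffers.
        if self.deferred:
            self._held.append(chunk)
        else:
            self.released.append(chunk)

    def finish(self, final_ok: bool) -> bytes:
        # On a good final result release what was held; otherwise discard it.
        if self.deferred:
            if final_ok:
                self.released.extend(self._held)
            self._held.clear()
        return b"".join(self.released)
```

The intermediary can then stream unscanned data immediately, which keeps 
the client from timing out, while the client only acts on the body once 
finish() reports a good final result.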


Adrien

------ Original Message ------
From: "Wesley Oliver" <wesley.olis@gmail.com>
To: "Adrien de Croy" <adrien@qbik.com>
Cc: "Roy T. Fielding" <fielding@gbiv.com>; "HTTP Working Group" 
<ietf-http-wg@w3.org>
Sent: 6/02/2021 9:52:35 pm
Subject: Re: Cache control in trailers?

>Hi,
>
>I'd be keen to get through the previous list specs; we just have load 
>shedding for the next few hours, so I'll have to wait patiently.
>
>For a virus, I would imagine the body would be truncated to signal 
>failure, as no one wants a virus to be saved. It may be good to 
>implement a different error status code here, or a class of codes, for 
>programs that snoop the traffic, so one can easily differentiate them. 
>Maybe, like old Norton did when it injected JS, they could inject an 
>identifier, which would make it easier for an error page or anyone else 
>to debug what's gone wrong and where.
>Recommend ways of handling the content based on that status code.
>
>Waiting for the final status means that pages would be blocking: the 
>client wouldn't be able to process the page and start downloading 
>content, depending on the definition, because anyone could choose not 
>to obey and start processing the body anyway. I feel that, depending on 
>the use case, one may like to indicate different behaviours.
>
>One option is to allow the body to be processed incrementally, just not 
>rendered. Another is for progressively loaded pages, which one wants 
>rendered, and on error there are two ways to handle it. One is to 
>continue displaying the current page in its progressive, incomplete 
>form with something like a popup reporting the failure, where OK goes 
>to the error page; or maybe the browser shows a split frame or popup 
>error, so the error page loads and displays as an overlay, which one 
>may want top-left, top-right, bottom-left or bottom-right.
>The other is a split, like a resizable frameset window, which could be 
>closed.
>Something like a retry button could be offered, or an auto retry 
>attempted.
>
>I gather the ways in which the body should be handled would need 
>different HTTP response header modes, for behaviours like the above, 
>which can evolve over time as people think of new ways to do things.
>
>The more important part would be to ensure that the cache drops the 
>body and knows that the content needs to be revalidated. So communicate 
>to the cache whether the body content is kept, so it can display 
>whatever it had while revalidating, if wanted. I am thinking the way in 
>which the cache gets handled needs to be communicated in the error: set 
>the cache status to must-revalidate, or maybe an adjusted, shorter 
>failed-cache time.
>
>Maybe when there is a signal of failure, what is to happen to the cache 
>should be left programmable by a response header, which decides, maybe 
>under crazy load or whatever, that it must continue to fail for 5 
>minutes instead of the normal 1 hour of caching, or maybe a few weeks. 
>I think allowing an HTTP response header to specify a cache content 
>expiry behaviour override may be a lot more flexible in future, as the 
>ways in which we use and improve things change. Instead of pinning it 
>down to must-revalidate, think of having a default behaviour of 
>must-revalidate if there is no HTTP response header for onerror, or an 
>HTTP status code map of cache behaviours.
>I feel there should be a default behaviour and the ability to still 
>override it dynamically, as new and alternative use cases come with 
>time.
>
>Just my perspective on things.
>
>Kind Regards,
>
>Wesley Oliver
>
>On Fri, 05 Feb 2021, 23:16 Adrien de Croy, <adrien@qbik.com> wrote:
>>
>>
>>------ Original Message ------
>>From: "Roy T. Fielding" <fielding@gbiv.com>
>>
>> >Personally, I think end-status is the easiest and most reusable solution,
>> >for any number of features that might need to know if something broke.
>> >However, Willy is right that saying must-revalidate up front and then
>> >softening that at the end would be the safer choice where completion is
>> >more important than default performance. I suggest that choice needs to
>> >be resource-specific.
>> >
>> >
>>The "Something broke" could also apply to a scanning intermediary.  E.g.
>>some message body was found to contain a virus.
>>
>>If an intermediary could signal that the final status may be different,
>>and rely on clients to obey that, then it could safely stream unscanned
>>data to the client, and indicate the result at the end, knowing the
>>client will discard the message body.
>>
>>This would avoid all manner of hacks which don't work very well but are
>>required to stop client agents from e.g. timing out due to lack of data
>>(as intermediary spools and scans).
>>
>>It might be necessary if defining a new field which indicates a final
>>result, to indicate to authors that it might be a good idea not to
>>render a message body until the final status is received (to avoid
>>susceptibility to exploits).
>>
>>Adrien
>>
>>
>>
>> >
>>
>>

Received on Tuesday, 9 February 2021 22:59:45 UTC