Re: New Draft: draft-ohanlon-transport-info-header from Piers O'Hanlon on 2019-12-05 (ietf-http-wg@w3.org from October to December 2019)

From: Piers O'Hanlon <piers.ohanlon@bbc.co.uk>
Date: Thu, 5 Dec 2019 12:22:42 +0000
To: Patrick McManus <mcmanus@ducksong.com>
CC: Lucas Pardue <lucaspardue.24.7@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <643C0098-5C8A-4068-AB36-9FF587408261@bbc.co.uk>

> On 2 Dec 2019, at 13:50, Patrick McManus <mcmanus@ducksong.com> wrote:
> 
> 
> 
> On Tue, Nov 26, 2019 at 12:12 PM Piers O'Hanlon <piers.ohanlon@bbc.co.uk> wrote:
> 
> Sure - I guess I didn’t put it very well - I was trying to say that the Transport-Info header would only be transferred between that CDN edge and the client - it’s a hop-by-hop heade
> 
> h2 doesn't have hop to hop headers. it has per stream frames though.
>  
Yes, I didn’t mention "hop-by-hop" headers in the draft as they aren’t in H2/3, but with many CDN deployments there are still multiple hops between the origin and the client.

> r - for the last hop.
> 
> how does one know they are the last hop?
> 
The idea is that the header would be deployed by CDNs at the last server entity that peers, at a transport level, with the client. These edge servers facing the client are typically known/configurable by the CDN operators. The server that inserts the metrics starts the parameter list with an ID. In the draft we say that the header should only be inserted by an edge node, but we might need to add that existing Transport-Info headers may be removed, though there was some interest during the HTTPBIS session that it might be useful to retain info on other hops. The client would need to trust that the header was from the last hop - it could perform some sanity checks (e.g. compare with other measurements) to see if the information is useful before relying upon it.
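
To make the sanity-check idea concrete, here is a rough sketch of what a client might do. The header syntax used here (an ID followed by ";"-separated key=value parameters), the "rtt" parameter in milliseconds, and the tolerance factor are all assumptions for illustration, not the draft's definitive format.

```python
from typing import Dict, Tuple


def parse_transport_info(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a simplified 'id; key=value; ...' header value (assumed format)."""
    parts = [p.strip() for p in value.split(";") if p.strip()]
    if not parts:
        raise ValueError("empty header value")
    # The inserting server starts the parameter list with its ID.
    entity_id = parts[0].strip('"')
    params: Dict[str, str] = {}
    for part in parts[1:]:
        key, sep, val = part.partition("=")
        if sep:
            params[key.strip()] = val.strip().strip('"')
    return entity_id, params


def rtt_is_plausible(header_rtt_ms: float, measured_rtt_ms: float,
                     tolerance: float = 3.0) -> bool:
    """Accept the header RTT only if it is within a factor of `tolerance`
    of the client's own measurement (hypothetical threshold)."""
    if header_rtt_ms <= 0 or measured_rtt_ms <= 0:
        return False
    ratio = header_rtt_ms / measured_rtt_ms
    return 1.0 / tolerance <= ratio <= tolerance
```

A client that measured roughly 80 ms itself would accept a reported rtt=100 but discard rtt=1000, which may indicate the header came from a hop further upstream.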

> what have you have described the side channel isn’t clear to me?
> > 
> > 
> > obviously the HTTP protocol is aware of this sharing - indeed it initiates it! but the ramifications of it are generally opaque to the semantics of HTTP exchanges (roughly expressed by headers and messages). So perhaps what you want is better carried in the protocol (e.g. as a frame).. if you are proposing exposing it to content it needs to be scrutinized for the same reason that it is useful. This can be subtle - see CRIME and BREACH for example.
> > 
> Yes I had considered that frame based transport might be better but what is also required is to extract the transport/flow information for that stream - but if one could do that then it would potentially be ok to transport it in a header as the issue is that there is access to shared state.
> 
> there is a significant difference between information available to the protocol implementation and information in the semantic header layer which is available to javascript. The latter is held to a higher level of isolation due to the security model of javascript.
>  
Sure - it’s also down to client type: a non-browser client may have access to the whole stack, so it could potentially get the tcp_info from the client socket and use the TCP metrics directly (apart from server-only metrics such as snd_cwnd). So the utility of the Transport-Info header type metrics would be less in that situation, although there are apps with differing levels of access to network state, so frame-level info could be useful. I guess RTT can be obtained using PING frames as mentioned in the H2 rfc7540#section-10.8 but the TCP measured (s)RTT would be better, and the server cwnd would be useful.

> Also for H2/3 since this is an issue about transport rate the fact padding is used on each connection
> 
> padding is not really a common thing at scale and there is some controversy over how helpful it is. 
> 
> means that there is always going to be some uncertainty introduced by padding in such an attack. In such an attack, where one flow tries to learn about another flow, it would need to calculate its own portion of the flow so it could subtract it from the total flow rate provided by the header, but it’s going to be limited by not knowing about the padding.
> 
> 
> I think your best argument is going to be that this side channel isn't very meaningful for what you are reporting - but arguing that there isn't a side channel is not going to work out well long term. The attacker is always better motivated and more clever than your defense. Here's a fun one: https://www.usenix.org/node/217606 from ietf 106
Agreed - There are some impressive side channel attacks out there. I wasn't trying to say there is no side channel but just that similar information may be gleaned by other means (e.g. measuring the incoming traffic), which means that this header shouldn't significantly add to the threat model. There is also already shared state between sessions that share an origin (e.g. cookies, and other headers). We will add further details to the security considerations to address this. But I think we need to lay out which metrics we’ll support and probably keep it limited. We could possibly have a mechanism for extension but that would need further consideration.

>  
> Another potential mitigation technique would be to add some noise to the measurements which would make it harder for collusion to occur with coordinated attacks. The choice of the level of noise could be a function of the number of flows the server knows are sharing the connection, or maybe just some randomness so that two simultaneous requests don’t obtain the same result - although in practice they’re not going to obtain the same result as the connection parameters are constantly changing, although such random noise could mitigate attempts to compare trends.
>
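
The noise idea quoted above could be sketched roughly as follows. The scaling rule (relative noise growing with the number of flows sharing the connection, up to a cap) and all parameter values are assumptions for illustration only; the draft does not specify any of this.

```python
import random


def noisy_metric(value: float, sharing_flows: int, rng: random.Random,
                 base_fraction: float = 0.02,
                 max_fraction: float = 0.2) -> float:
    """Perturb a reported metric by uniform relative noise.

    The noise range grows with the number of flows the server knows
    are sharing the connection, capped at max_fraction (hypothetical).
    """
    fraction = min(base_fraction * max(sharing_flows, 1), max_fraction)
    return value * (1.0 + rng.uniform(-fraction, fraction))
```

With these example values, a cwnd of 100 reported on a connection shared by five flows would land somewhere between 90 and 110, and two simultaneous requests would draw independent perturbations.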
Received on Thursday, 5 December 2019 16:56:04 UTC