- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Wed, 5 Feb 2020 16:10:30 +0100
- To: "Divakaran, Sudeep" <sudeep.divakaran@intel.com>, "'public-networks-ig@w3.org'" <public-networks-ig@w3.org>
Hi,

The minutes of our call today on Lessons from the Network Information API WICG are available at:
https://www.w3.org/2020/02/05-web-networks-minutes.html
copied as text below, and linked from https://www.w3.org/wiki/Networks#Meetings

Dom

Web & Networks IG: Lessons from Network Information API WICG

05 February 2020

[2]Agenda. [3]IRC log.

[2] https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/0003.html
[3] https://www.w3.org/2020/02/05-web-networks-irc

Attendees

Present
  cpn, Dan_Druta, Dario_Sabella, dom, Doug_Eng, Eric_Siow, Jonas_Svennebring, Jordi_Gimenez, Louay_Bassbouss, Piers_O_Hanlon, sudeep, Tarun_Bansal

Regrets
  -

Chair
  DanD, Song, Sudeep

Scribe
  dom

Contents

* [4]Meeting minutes

Meeting minutes

[5]Slides: Network Quality Estimation In Chrome

[5] https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/att-0003/Network_Quality_Estimation_in_Chrome.pdf

Sudeep: today's session is an important one for the IG
… in the past, we've covered a lot about MEC, CDNs, and network prediction
… today we have folks from Google's Chrome team who implemented some APIs around networking
… we're glad to have our guest speaker Tarun Bansal from the Chrome team to give us insights about the APIs implemented in the networking space: how they are used, how useful they are, and what lessons to draw from them

Tarun: I work on the Chrome team at Google and will talk about network quality estimation in Chrome
… the talk is divided in two parts: use cases, and then technical details about how it works
… my focus in the Chrome team is on networking and web page loading
… I focus on the tail end of performance: very slow connections, e.g. 3G
… about 20% of page loads happen on 3G-like connections, which feel very slow, e.g. 20s before first content
… videos would also take a lot of buffering time in these circumstances
… the 3G share varies from market to market; e.g. 5% in the US, but up to 40% in developing countries
… We have a service that provides continuous estimates of network quality, covering RTT and bandwidth
… we estimate network quality across all paths, not specific to a single web server
… this focuses on the common hop from browser to network carrier
… [this work got help from lots of folks, esp. Ben, Ilya, Yoav]
… Before looking at the use cases, we need to understand how browsers load Web pages and why Web page loads are slow on slow connections
… First, it is very challenging to optimize the performance of Web pages - it takes a lot of resources
… Web pages typically load plenty of resources before showing any content (e.g. CSS, JS, images, ...)
… Not all of these resources are equally important - some have no UX impact (e.g. tracking, below-the-fold content)
… loading everything in parallel works fine on a fast connection, but on slow connections it slows everything down
… an optimal web page load should keep the network pipe full, and a lower-priority resource should not slow down a higher-priority resource
… e.g. loading a below-the-fold image should not slow down what's needed to show the top of the page
… or JS for an ad shouldn't slow down the core content of the page
… this means a browser needs to understand the network capacity to optimize the loading of resources
… this is what led to the creation of this network quality estimation service
… Other uses include so-called "browser interventions", which are meant to improve the overall quality of the Web by deviating from standard behavior in specific circumstances
… in our case, e.g. when a network is very slow
… another use case is to feed back into the network stack - e.g. tuning network timeouts
… in the future, this could also be used to set an initial timeout in a smarter way (e.g. a higher timeout in poor connection contexts)
… lots of use cases for the browser vendor - what use would Web devs make of it?
… We've exposed a subset of these values to developers: an RTT estimate, a bandwidth estimate, and a rough categorization of network quality (in 4 values)
… This was released in 2016
… and is being used in around 20% of web pages across all Chrome platforms
… examples of usage:
… the Shaka player (an open source video player) uses the network quality API to adjust its buffer; Facebook does this as well
… some developers use it to inform the user that a slow connection will impact the time needed to complete an action
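[Illustrative sketch - how a page can consume the values this API exposes (effectiveType, rtt, downlink) via navigator.connection, per the WICG Network Information API draft. The TypeScript declarations are hand-written because the standard DOM typings do not include this non-standard API, and the image file names are hypothetical:]

    // Minimal hand-rolled typings for the non-standard Network Information API.
    interface NetworkInformation extends EventTarget {
      readonly effectiveType: 'slow-2g' | '2g' | '3g' | '4g'; // rough categorization
      readonly rtt: number;      // RTT estimate, in milliseconds
      readonly downlink: number; // bandwidth estimate, in Mbit/s
    }
    declare global {
      interface Navigator { readonly connection?: NetworkInformation; }
    }

    // Pick an image variant from the coarse effective connection type
    // (file names are hypothetical).
    function pickImageVariant(conn?: NetworkInformation): string {
      if (!conn) return 'photo-medium.jpg'; // API unavailable: safe default
      switch (conn.effectiveType) {
        case 'slow-2g':
        case '2g': return 'photo-low.jpg';
        case '3g': return 'photo-medium.jpg';
        default:   return 'photo-high.jpg';
      }
    }

    const conn = navigator.connection;
    if (conn) {
      console.log(`ect=${conn.effectiveType} rtt=${conn.rtt}ms downlink=${conn.downlink}Mbit/s`);
      // Re-evaluate when the browser's estimate changes.
      conn.addEventListener('change', () => {
        document.querySelector('img')?.setAttribute('src', pickImageVariant(conn));
      });
    }
    export {}; // make this file a module so `declare global` is valid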
Tarun: Now looking at the details of the implementation
… The first thing we look at is the kind of connection (e.g. WiFi)
… but that's not enough: there can be slow connections even on WiFi or 4G
… a challenge in implementing this API is making it work on all the different platforms, which expose very different sets of APIs
… We also need to make it work on devices as they are, often with very limited access to the network layer
… Typically, network quality is estimated by sending echo traffic to a server (e.g. speedtest)
… but this isn't going to work for Chrome: privacy (we don't want to send data to a server without user intent)
… we also don't want to maintain a server for this
… and we want to make the measurement available to other Chromium-based browsers
… so we're using passive estimation
… for RTT, we use 3 sources of information, depending on the platform
… the first is the HTTP layer, which Chrome controls completely
… the 2nd is the transport layer (TCP), for which some platforms provide information
… the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
… for HTTP, you measure the RTT as the time difference between request and response - this is available on all platforms, completely within the Chrome codebase
… there are limitations: the server processing time is included in the measurement
… for H2 and QUIC connections, requests are serialized on the same TCP or UDP connection, which means an HTTP request can be queued behind other requests
… which may inflate the measured RTT
… it is mostly useful as an upper bound
… for the TCP layer, we look at all the TCP sockets the browser has opened, and ask the kernel what RTT it has computed for these sockets
… then we take a median
… this is less noisy, but it still has its own limitations
… it doesn't take into account packet loss; it doesn't deal with UDP sockets (e.g. if using QUIC)
… and it's only available on some platforms - we can't do this on Windows or macOS
… this provides a lower bound RTT estimate
… The 3rd source is the QUIC/HTTP2 PING
… Servers are expected to respond immediately to an HTTP2 PING
… this is available in Chrome, and it removes some of the limitations we discussed earlier
… but not all servers support QUIC/H2, esp. in some countries
… not all servers that support QUIC/H2 support PING, despite the spec requirement
… and it can still be queued behind other packets
… So we have these 3 sources of RTT; for each source we take all the samples and aggregate them with a weighted median
… we give more weight to recent samples; compared to TCP, which uses a weighted average, we use a weighted median to eliminate outliers
… once we have these 3 values, we combine them into a single value using heuristics
… these heuristics vary from platform to platform
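[Illustrative sketch - Chrome's exact weighting scheme and cross-source heuristics are not spelled out in the talk; this shows the general recency-weighted median technique Tarun describes (recent samples weigh more, the median discards outliers). The 60-second half-life is an assumption:]

    interface Sample {
      value: number;       // e.g. an RTT observation, in ms
      timestampMs: number; // when the sample was taken
    }

    // Recency-weighted median: weight each sample by an exponential decay
    // of its age, sort by value, and return the value at which the
    // cumulative weight first reaches half of the total weight.
    function weightedMedian(samples: Sample[], nowMs: number, halfLifeMs = 60_000): number {
      if (samples.length === 0) throw new Error('no samples');
      const weighted = samples
        .map(s => ({ value: s.value, weight: 0.5 ** ((nowMs - s.timestampMs) / halfLifeMs) }))
        .sort((a, b) => a.value - b.value);
      const half = weighted.reduce((sum, s) => sum + s.weight, 0) / 2;
      let cumulative = 0;
      for (const s of weighted) {
        cumulative += s.weight;
        if (cumulative >= half) return s.value;
      }
      return weighted[weighted.length - 1].value; // not reached; satisfies the compiler
    }

    // A fresh 900ms outlier does not drag the estimate up: prints 90.
    const now = Date.now();
    console.log(weightedMedian([
      { value: 80, timestampMs: now - 30_000 },
      { value: 90, timestampMs: now - 20_000 },
      { value: 900, timestampMs: now - 1_000 },
    ], now));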
Tarun: Is that RTT enough?
… We have found that to estimate the real capacity, we also need to estimate the bandwidth
… there has been a lot of research on this, but none of it worked well for our use case
… we do not want to probe a server; we want a passive estimate
… What are the challenges in estimating bandwidth? The first one is that we don't have cooperation from the server side
… e.g. we don't know what TCP flavor the server is using, and we don't know their packet loss rates
… so we use a simple approach: we measure how many bytes we get in a given time window with well-defined properties (e.g. >128KB transferred, 5+ active requests)
… the goal being to ensure the network is not under-utilized
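[Illustrative sketch - a sketch of this passive windowed estimate. The 128KB and 5-active-request thresholds come from the talk; the data structure, units, and function names are assumptions:]

    // Passive bandwidth estimate over an observation window: count bytes
    // received while the network pipe was plausibly full.
    interface WindowObservation {
      bytesReceived: number;  // total response bytes seen in the window
      durationMs: number;     // window length
      activeRequests: number; // concurrent in-flight requests during the window
    }

    const MIN_BYTES = 128 * 1024;    // per the talk: >128KB transferred
    const MIN_ACTIVE_REQUESTS = 5;   // per the talk: 5+ active requests

    // Returns an estimate in kilobits per second, or null if the window
    // doesn't meet the "network was not under-utilized" preconditions.
    function estimateBandwidthKbps(w: WindowObservation): number | null {
      if (w.bytesReceived < MIN_BYTES) return null;            // too little traffic
      if (w.activeRequests < MIN_ACTIVE_REQUESTS) return null; // pipe may be idle
      const bits = w.bytesReceived * 8;
      return bits / w.durationMs; // bits per ms == kilobits per second
    }

    console.log(estimateBandwidthKbps({
      bytesReceived: 500 * 1024, durationMs: 2_000, activeRequests: 6,
    })); // 2048 kbps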
Tarun: with all these estimates, how do we quickly adapt to changing network conditions?
… e.g. entering a parking garage will slow down a 4G connection
… we use the strength of the wireless signal
… we also store information on well-known networks
… To summarize, there are lots of use cases for knowing network quality - not just for browsers, but also for Web developers
… but there are lots of technical challenges in doing that from the app layer without access to the kernel layer

Piers: (BBC) I heard Yoav mention at the IETF that the netinfo RTT exposure might go away for privacy reasons
… that was back at the last IETF meeting last year

Tarun: it's not clear if we should expose a continuous distribution of RTT - a coarser-grained exposure could work

Piers: so this is an ongoing discussion - can you say more about the privacy concerns?

Tarun: 2 concerns: one is fingerprinting
… we round and add noise to the values to reduce fingerprinting
… another concern is that a lot of Web developers may not know how to consume continuous values
… simplifying it makes it easier to consume
… we provide this in the Effective Connection Type - which can be easier to use to e.g. pick which image to load

Piers: we have ongoing work on TransportInfo in the IETF that is trying to help with this

Tarun: if the server can identify the network quality and send it back to the browser, the browser could use it more broadly

<piers> [6]https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md

[6] https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md

Piers: one of the use cases is adaptive video streaming; it could also be useful for small object transports (which are hard to estimate in JS)

Tarun: is it mostly for short bursts of traffic?

Piers: it's for media as well

Tarun: so would the server keep data on typical quality from a given IP address?

Piers: it would be sent with a response header (e.g. along with the media)

DanD: (AT&T) for IETF QUIC, are you considering using the spin bit that is being specified?

Tarun: we're not using it, and I don't think there are plans to use it at the moment
… QUIC itself maintains an RTT estimate, which we're using

Dom: has there been work around network quality prediction? we had a presentation from an Intel team on the topic back in September

Tarun: not at the moment - we're relying on what the OS provides

Jonas: what we're doing for network prediction is to use info coming from the network itself (e.g. load shifting across cells)
… we use this to do forward-looking prediction

Tarun: the challenge is that this isn't available at the application layer
… e.g. it wouldn't be exposed through the Android APIs
… an app wouldn't know the tower location - you can know which carrier it is, but not more than that
… there is also a lot of variation across Android flavors
… the common set is mostly signal strength and carrier identifier

Sudeep: would it be interesting for the browser to talk to interfaces into the carrier network (e.g. via MEC)?
… The carrier/operator networks may have more info about the channel conditions

Tarun: definitely yes
… Android has an API which exposes this information
… but it never took off, and most device manufacturers don't support it
… so there is a way to expose this in Android
… I'm not sure what the practical concerns were, but it never took off
… it would be super useful if it were available

Sudeep: you spoke about the RTT and bandwidth attributes that got defined in W3C
… but implementations can vary from one browser to another - is there any standardization of how these should be measured, or is this UA-dependent?

Tarun: it's specced as a "best-effort estimate" from the browser, so it's mostly up to the browser
… right now it's only available in Chromium-based browsers
… even Chromium-based implementations will vary from platform to platform

Dom: can you say more about the fact that it is not available in other browsers?

Tarun: I think it's a question of priority - we have a lot of users in developing markets, which helped drive some of the priority for us

Song: (China Mobile) I'm interested in the accuracy of the network quality monitoring
… you mentioned aggregating data from 3 sources: HTTP, TCP and QUIC
… are the weights for these 3 sources fixed, or do they vary based on the scenario?

Tarun: it's very hard to measure accuracy
… in lab studies (with controlled network conditions), the algorithm does quite well
… we also do A/B studies, but it's hard given we don't really know the ground truth
… so we measure the behavior of the consumers of the API, e.g. the overall page load performance
… we've seen 10-15% improvements when tuning the algorithm the right way

Song: when you measure the data from these 3 sources, are they exposed to the Web developer? or only the aggregated value?
… is there any chance to make the raw source data available to Web developers?

Tarun: we only provide aggregated values

Piers: how often do you update the value?

Tarun: internally, every time we send or receive a packet
… we throttle it on the Web API - when the values have changed by more than 10%
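[Illustrative sketch - the 10% throttling just described, read as relative-change hysteresis on an internally maintained estimate; the class and callback names are hypothetical:]

    // Gate a noisy internal estimate: notify consumers only when the value
    // moves more than 10% from the last reported one.
    class ThrottledEstimate {
      private lastReported: number | null = null;

      constructor(
        private readonly onChange: (value: number) => void,
        private readonly threshold = 0.10,
      ) {}

      // Called on every internal update (e.g. each packet sent or received).
      update(value: number): void {
        const last = this.lastReported;
        const changedEnough =
          last === null || Math.abs(value - last) / last > this.threshold;
        if (changedEnough) {
          this.lastReported = value;
          this.onChange(value); // e.g. fire the 'change' event on navigator.connection
        }
      }
    }

    const rttEstimate = new ThrottledEstimate(v => console.log(`report rtt=${v}ms`));
    rttEstimate.update(100); // reported (first value)
    rttEstimate.update(105); // suppressed: 5% change
    rttEstimate.update(120); // reported: 20% change from the last reported value (100)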
Piers: that's a pretty large margin for adaptation

Tarun: most developers don't care about very precise estimates
… it's pretty hard to write pages that take that kind of continuous change into account

Piers: for media, more detail is useful

Tarun: even then, you usually only have 2 or 3 resolutions to adapt to

Piers: but the timing of the adaptation might be sensitive

Piers: any plans to provide more network info?

Tarun: no other plans as of now
… we're open to it if there are other useful bits to expose

Sudeep: that's one of the topics the group is aiming to build on
… are there other APIs in this space that you think would be useful to Web developers?

Tarun: I think most developers care about a few different values
… it's not clear they would use very detailed info
… another challenge we see is around caching (e.g. different resources for different network qualities)
… you might be loading new resources because you're on a different network quality, which, if that quality is low, is counterproductive
… In general, server-side estimates are likely more accurate

Sudeep: Thank you Tarun for a very good presentation!
… Going forward, we want to look at how these APIs can and need to be improved based on Web developers' needs
… we'll follow up with a discussion
… Next week we have a presentation by Michael McCool on edge computing - how to offload computing from a browser to the edge using Web Workers et al.
… call info will be sent to the list
Received on Wednesday, 5 February 2020 15:10:38 UTC