[minutes] February 5 Teleconf

Hi,

The minutes of our call today on Lessons from Network Information API
WICG are available at:
  https://www.w3.org/2020/02/05-web-networks-minutes.html

copied as text below, and linked from
https://www.w3.org/wiki/Networks#Meetings

Dom

   Web & Networks IG: Lessons from Network Information API WICG


05 February 2020

   [2]Agenda. [3]IRC log.

      [2]
https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/0003.html
      [3] https://www.w3.org/2020/02/05-web-networks-irc

Attendees

   Present
          cpn, Dan_Druta, Dario_Sabella, dom, Doug_Eng, Eric_Siow,
          Jonas_Svennebring, Jordi_Gimenez, Louay_Bassbouss,
          Piers_O_Hanlon, sudeep, Tarun_Bansal

   Regrets
          -

   Chair
          DanD, Song, Sudeep

   Scribe
          dom

Contents

     * [4]Meeting minutes

Meeting minutes

   [5]Slides: Network Quality Estimation In Chrome

      [5]
https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/att-0003/Network_Quality_Estimation_in_Chrome.pdf

   Sudeep: today's session is an important one for the IG
   … in the past, we've covered a lot about MEC, CDN, network
   prediction
   … today we have folks from Google's Chrome team who implemented
   some APIs around networking
   … we're glad to have our guest speaker Tarun Bansal from the
   Chrome Team to give us insights about the APIs implemented in
   the networking space: how they are used, how useful they are,
   and what lessons to draw from them

   Tarun: I work on the Chrome team at Google and will talk about
   network quality estimation in Chrome
   … the talk is divided into 2 parts: use cases, and then
   technical details about how it works
   … my focus in the Chrome team is on networking and web page
   loading
   … I focus on the tail end of performance, very slow connections
   e.g. 3G
   … about 20% of page loads happen on 3G-like connections - which
   feel very slow, e.g. 20s before first content
   … videos would also take a lot of buffering time in these
   circumstances
   … the 3G share varies from market to market; e.g. 5% in the US,
   but up to 40% in some developing countries
   … We have a service that provides continuous estimates of
   network quality, covering RTT and bandwidth
   … we estimate network quality across all the paths, not
   specific to a single web server
   … this focuses on the common hop from browser to network
   carrier
   … [this work got help from lots of folk, esp. Ben, Ilya, Yoav]
   … Before looking at the use cases, we need to understand how
   browsers load Web pages and why Web pages load slowly on slow
   connections
   … First, it is very challenging to optimize the performance of
   Web pages - it takes a lot of resources
   … Web pages typically load plenty of resources before showing
   any content (e.g. css, js, images, ...)
   … Not all of these resources are equally important - some have
   no UX impact (e.g. tracking, below-the-fold content)
   … loading everything in parallel works fine in fast connection,
   but in slow connections, it slows everything down
   … an optimal web page load should keep the network pipe full,
   and a lower-priority resource should not slow down a
   higher-priority resource
   … e.g. loading a below-the-fold image should not slow down
   what's needed to show the top of the page
   … or a JS-for-ad shouldn't slow the core content of the page
   … this means a browser needs to understand the network capacity
   to optimize loading of resources
   … this is what led to the creation of this network quality
   estimation service
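
   One way a page author can apply the below-the-fold example
   above is native image lazy-loading; a minimal TypeScript
   sketch (the image URL is hypothetical):

     // Defer a below-the-fold image so it does not compete with
     // what's needed to render the top of the page.
     const img = document.createElement('img');
     img.src = '/img/below-the-fold.jpg'; // hypothetical URL
     img.loading = 'lazy'; // native lazy-loading hint
     document.body.appendChild(img);
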
   … Other uses include so-called "browser interventions", which
   are meant to improve the overall quality of the Web by
   deviating from standard behavior in specific circumstances
   … in our case, e.g. when a network is very slow
   … another use case is to feed back to the network stack - e.g.
   using network timeouts
   … in the future, this could also be used to set an initial
   timeout in a smarter way (e.g. higher timeout in poor
   connection contexts)
   … lots of use cases for the browser vendor - what use would Web
   devs make of it?
   … We've exposed a subset of these values to developers: an RTT
   estimate, a bandwidth estimate, and a rough categorization of
   network quality (one of 4 values)
   … This was released in 2016
   … and is being used in around 20% of web pages across all
   Chrome platforms
   … examples of usage:
   … the Shaka player (an open source video player) uses the
   network quality API to adjust its buffer; Facebook does this
   as well
   … some developers use it to inform the user that the slow
   connection will impact the time needed to complete an action
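
   A minimal sketch of reading these exposed values via the
   Network Information API (navigator.connection); the interface
   declaration is included because the property is not part of
   the baseline TypeScript DOM typings:

     // Minimal typing for the exposed surface described above.
     interface NetworkInformation extends EventTarget {
       readonly effectiveType: 'slow-2g' | '2g' | '3g' | '4g';
       readonly rtt: number;      // rounded RTT estimate, ms
       readonly downlink: number; // bandwidth estimate, Mbit/s
     }

     const connection = (navigator as Navigator & {
       connection?: NetworkInformation;
     }).connection;

     if (connection) {
       console.log(`effectiveType=${connection.effectiveType}`,
                   `rtt=${connection.rtt}ms`,
                   `downlink=${connection.downlink}Mbit/s`);
       // Re-run adaptation whenever a new estimate is published.
       connection.addEventListener('change', () => {
         console.log('estimate changed:', connection.effectiveType);
       });
     }
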
   … Now looking at the details of the implementation
   … The first thing we look at is the kind of connection (e.g.
   wifi)
   … but that's not enough: there can be slow connections even on
   Wifi or 4G
   … a challenge in implementing this API is making it work on
   all the different platforms, which expose very different sets
   of APIs
   … We also need to make it work on devices as they are, with
   often very limited access to the network layer
   … Typically, network quality is estimated by sending echo
   traffic to a server (e.g. speedtest)
   … but this isn't going to work for Chrome: privacy (don't want
   to send data to a server without user intent)
   … also don't want to maintain a server for this
   … we also want to make the measurement available to other
   Chromium-based browsers
   … so we're using passive estimation
   … for RTT, we use 3 sources of information based on the
   platform
   … the first is the HTTP layer which Chrome controls completely
   … the 2nd is the transport layer (TCP) for which some platforms
   provide information
   … the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
   … for HTTP, you measure the RTT as the time difference between
   request and response - this is available on all platforms,
   completely within the Chrome codebase
   … there are limitations: the server processing time is included
   in the measurement
   … for H2 and QUIC connections, the requests are serialized on
   the same TCP or UDP connection, which means an HTTP request
   can be queued behind other requests
   … which may inflate the measured RTT
   … it is mostly useful as an upper bound
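
   A rough web-layer analogue of that HTTP-level measurement,
   using the Resource Timing API; as noted, the
   request-to-first-byte delta includes server processing time,
   so it is only an upper bound:

     // Per-resource time from sending the request to the first
     // byte of the response; includes server processing time.
     for (const entry of performance.getEntriesByType('resource')) {
       const t = entry as PerformanceResourceTiming;
       if (t.requestStart > 0 && t.responseStart > 0) {
         console.log(t.name,
                     `${(t.responseStart - t.requestStart).toFixed(1)} ms`);
       }
     }
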
   … for the TCP layer, we look at all the TCP sockets the
   browser has opened, and ask the kernel what RTT it has
   computed for these sockets
   … then we take a median
   … this is less noisy, but it still has its own limitations
   … it doesn't take into account packet loss; it doesn't deal
   with UDP sockets (e.g. if using QUIC)
   … and it's only available on some platforms - we can't do this
   on Windows or MacOS
   … this provides a lower bound RTT estimate
   … The 3rd source is the QUIC/HTTP2 Ping
   … Servers are expected to respond immediately to HTTP2 PING
   … this is available in Chrome, and it removes some of the
   limitations we discussed earlier
   … but not all servers support QUIC/H2, especially in some
   countries
   … not all servers that support QUIC/H2 support PING despite the
   spec requirement
   … and it can still be queued behind other packets
   … So we have these 3 sources of RTT; we take all the samples
   from each source and aggregate them with a weighted median
   … we give more weight to the recent samples; compared to TCP,
   which uses a weighted average, we use a weighted median to
   eliminate outliers
   … once we have these 3 values, we combine them using
   heuristics into a single value
   … these heuristics will vary from platform to platform
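
   A sketch of the weighted-median aggregation described above;
   the exponential recency weighting and the half-life constant
   are illustrative assumptions, not Chrome's actual parameters:

     interface Sample { value: number; ageSeconds: number; }

     // Weighted median: sort samples by value and walk until
     // half of the total weight is covered. Recent samples get
     // more weight (assumed exponential decay; the half-life is
     // illustrative).
     function weightedMedian(samples: Sample[],
                             halfLifeSeconds = 60): number {
       const weighted = samples
         .map(s => ({
           value: s.value,
           weight: Math.pow(0.5, s.ageSeconds / halfLifeSeconds),
         }))
         .sort((a, b) => a.value - b.value);
       const total = weighted.reduce((sum, s) => sum + s.weight, 0);
       let acc = 0;
       for (const s of weighted) {
         acc += s.weight;
         if (acc >= total / 2) return s.value;
       }
       return NaN; // empty input
     }
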
   … Is that RTT enough?
   … We have found that to estimate the real capacity, we need to
   estimate the bandwidth
   … there has been a lot of research on this, but none of it
   worked well for our use case
   … we do not want to probe a server; we want a passive estimate
   … What are the challenges in estimating bandwidth? The first
   one is that we don't have cooperation from the server-side
   … e.g. we don't know what TCP flavor the server is using, we
   don't know their packet loss rates
   … so we use a simple approach: we measure how many bytes we
   get in a given time window with well-defined properties (e.g.
   >128KB large, 5+ active requests)
   … the goal being to ensure the network is not under-utilized
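
   A sketch of that passive throughput sampling; the thresholds
   follow the figures mentioned above, while the function and
   field names are illustrative:

     interface WindowStats {
       bytesReceived: number;
       windowSeconds: number;
       activeRequests: number;
     }

     const MIN_BYTES = 128 * 1024;  // >128KB, per the talk
     const MIN_ACTIVE_REQUESTS = 5; // 5+ active requests

     // Only sample when the pipe is plausibly full, so an
     // under-utilized network does not yield a low estimate.
     function bandwidthSampleKbps(w: WindowStats): number | null {
       if (w.bytesReceived < MIN_BYTES ||
           w.activeRequests < MIN_ACTIVE_REQUESTS) {
         return null; // window likely under-utilized; discard
       }
       return (w.bytesReceived * 8) / 1000 / w.windowSeconds;
     }
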
   … with all these estimates, how do we quickly adapt to
   changing network conditions?
   … e.g. entering a parking garage will slow down a 4G
   connection
   … we use the strength of the wireless signals
   … we also store information on well-known networks
   … To summarize, there are lots of use cases for knowing network
   quality - not just for browsers, also for Web developers
   … but there are lots of technical challenges in doing that
   from the app layer without access to the kernel layer

   Piers: (BBC) I heard Yoav mention in the IETF that the netinfo
   RTT exposure might go away for privacy reasons
   … that was back at the last IETF meeting last year

   Tarun: it's not clear if we should expose a continuous
   distribution of RTT - a more granular exposure could work

   Piers: so this is an ongoing discussion - can you say more
   about the privacy concerns?

   Tarun: 2 concerns: one is fingerprinting
   … we round and add noise to the values to reduce
   fingerprinting
   … another concern is that a lot of Web developers may not know
   how to consume continuous values
   … simplifying it makes it easier to consume
   … we provide this in the Effective Connection Type - which can
   be easier to use to e.g. pick which image to load
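
   A sketch of the image-selection use case just mentioned, keyed
   off the coarse effective connection type (the image URLs are
   hypothetical):

     // Pick an image variant from the coarse category rather
     // than from raw RTT/bandwidth numbers.
     function pickHeroImage(effectiveType: string): string {
       switch (effectiveType) {
         case 'slow-2g':
         case '2g':
           return '/img/hero-low.jpg';
         case '3g':
           return '/img/hero-medium.jpg';
         default: // '4g'
           return '/img/hero-high.jpg';
       }
     }
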

   Piers: we have ongoing work on TransportInfo in IETF that is
   trying to help with this

   Tarun: if the server can identify the network quality and send
   it back to the browser, the browser could use it more broadly

   <piers> [6] https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md

      [6]
https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md

   Piers: one of the use cases is adaptive video streaming; it
   could also be useful for small object transports (which are
   hard to estimate in JS)

   Tarun: is it mostly for short bursts of traffic?

   Piers: it's also for media as well

   Tarun: so would the server keep data on typical quality from a
   given IP address?

   Piers: it would be sent with a response header (e.g. along with
   the media)
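
   A sketch of the pattern being discussed: the client reading a
   server-supplied transport metrics header off a media response.
   The header name and value handling are placeholders here, not
   the draft's normative syntax:

     // Hypothetical: header name and format are assumptions.
     async function logServerEstimate(): Promise<void> {
       const response = await fetch('/media/segment42.mp4');
       const info = response.headers.get('transport-info');
       if (info) {
         console.log('server-side transport estimate:', info);
       }
     }
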

   DanD: (AT&T) for IETF QUIC, are you considering using the spin
   bit that is being specified?

   Tarun: we're not using it, and I don't think there are plans to
   use it at the moment
   … QUIC itself maintains an RTT estimate which we're using

   Dom: has there been work around network quality prediction? We
   had a presentation from an Intel team on the topic back in
   September

   Tarun: not at the moment - we're relying on what the OS
   provides

   Jonas: what we're doing for network prediction is to use info
   coming from the network itself (e.g. load shifting across
   cells)
   … we use this to do forward-looking prediction

   Tarun: the challenge is that this isn't available at the
   application layer
   … e.g. it wouldn't be exposed through the Android APIs
   … an app wouldn't know the tower location - you can know which
   carrier it is, but not more than that
   … there is also a lot of variation across Android flavors
   … the common set is mostly signal strength and carrier
   identifier

   Sudeep: would it be interesting for the browser to talk to
   interfaces to the carrier network (e.g. via MEC)?
   … The carrier/operator networks may have more info about the
   channel conditions

   Tarun: definitely yes
   … Android has an API which exposes this information
   … but it never took off, and most device manufacturers don't
   support it
   … I'm not sure what the practical concerns were
   … it would be super-useful if it were available

   Sudeep: you spoke about RTT and bandwidth, which got defined
   in W3C
   … but implementations can vary from one browser to another -
   is there any standardization of how these should be measured,
   or is this UA-dependent?

   Tarun: the spec describes it as a "best-effort estimate" from
   the browser, so it's mostly up to the browser
   … right now it's only available in Chromium-based browsers
   … even Chromium-based implementations will vary from platform
   to platform

   Dom: can you say more about the fact that it is not available
   in other browsers?

   Tarun: I think it's a question of priority - we have a lot of
   users in developing markets which helped drive some of the
   priority for us

   Song: (China Mobile) I'm interested in the accuracy of the
   network quality monitoring
   … you mention aggregating data from 3 sources: HTTP, TCP and
   QUIC
   … are the weights for these 3 sources fixed, or do they vary
   based on the scenario?

   Tarun: it's very hard to measure accuracy
   … in lab studies (with controlled network conditions), the
   estimation algorithm does quite well
   … we also do A/B studies, but it's hard given we don't really
   know the ground truth
   … so we measure the behavior of the consumer of the API, e.g.
   on the overall page load performance
   … we've seen 10-15% improvements when tuning the algorithm the
   right way

   Song: when you measure the data from these 3 sources, are they
   exposed to the Web Dev? or only the aggregated value?
   … is there any chance the raw source data could be made
   available to Web developers?

   Tarun: we only provide aggregated values

   Piers: how often do you update the value?

   Tarun: internally, every time we send or receive a packet
   … we throttle updates to the Web API - values are only
   republished when they have changed by more than 10%
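
   A sketch of that publish threshold; the 10% figure is from the
   talk, while the function itself is illustrative:

     // Only publish a new value to the Web API when it differs
     // from the last published value by more than 10%.
     function shouldPublish(last: number, next: number): boolean {
       if (last === 0) return next !== 0;
       return Math.abs(next - last) / last > 0.10;
     }
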

   Piers: that's a pretty large margin for adaptation

   Tarun: most of the developers don't care about very precise
   estimates
   … it's pretty hard to write pages that take into account that
   kind of continuous change

   Piers: for media, more details are useful

   Tarun: even then, you usually only have 2 or 3 resolutions to
   adapt to

   Piers: but the timing of the adaptation might be sensitive

   Piers: Any plans to provide more network info?

   Tarun: no other plans as of now
   … we're open to it if there are other useful bits to expose

   Sudeep: that's one of the topics the group is aiming to build
   on
   … are there other APIs in this space that you think would be
   useful to Web developers?

   Tarun: I think most developers care about a few distinct
   values
   … it's not clear they would use very detailed info
   … another challenge we see is around caching (e.g. different
   resources for different network qualities)
   … you might end up re-loading resources because the network
   quality has changed, which is counterproductive if the
   connection is slow
   … In general, server-side estimates are likely more accurate

   Sudeep: Thank you Tarun for a very good presentation!
   … Going forward, we want to look at how these APIs can and
   need to be improved based on Web developers' needs
   … we'll follow up with a discussion
   … Next week we have a presentation by Michael McCool on Edge
   computing - how to offload computing from a browser to the edge
   using Web Workers et al
   … call info will be sent to the list

Received on Wednesday, 5 February 2020 15:10:38 UTC