- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Wed, 5 Feb 2020 16:10:30 +0100
- To: "Divakaran, Sudeep" <sudeep.divakaran@intel.com>, "'public-networks-ig@w3.org'" <public-networks-ig@w3.org>
Hi,
The minutes of our call today on Lessons from Network Information API
WICG are available at:
https://www.w3.org/2020/02/05-web-networks-minutes.html
copied as text below, and linked from
https://www.w3.org/wiki/Networks#Meetings
Dom
Web & Networks IG: Lessons from Network Information API WICG
05 February 2020
[2]Agenda. [3]IRC log.
[2]
https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/0003.html
[3] https://www.w3.org/2020/02/05-web-networks-irc
Attendees
Present
cpn, Dan_Druta, Dario_Sabella, dom, Doug_Eng, Eric_Siow,
Jonas_Svennebring, Jordi_Gimenez, Louay_Bassbouss,
Piers_O_Hanlon, sudeep, Tarun_Bansal
Regrets
-
Chair
DanD, Song, Sudeep
Scribe
dom
Contents
* [4]Meeting minutes
Meeting minutes
[5]Slides: Network Quality Estimation In Chrome
[5]
https://lists.w3.org/Archives/Public/public-networks-ig/2020Jan/att-0003/Network_Quality_Estimation_in_Chrome.pdf
Sudeep: today's session is an important one for the IG
… in the past, we've covered a lot about MEC, CDN, network
prediction
… today we have folks from Google's Chrome team who implemented
some APIs around networking
… we're glad to have our guest speaker Tarun Bansal from the
Chrome Team to give us insights about the APIs implemented in
the networking space; how it is used, how useful it is, what
lessons to draw from it
Tarun: I work on the Chrome team at Google and will talk about network
quality estimation in Chrome
… the talk is divided into 2 parts: use cases, and then technical
details about how it works
… my focus in the Chrome team is on networking and web page
loading
… I focus on the tail end of performance, very slow connections
e.g. 3G
… about 20% of page loads happen on 3G-like connections - which
feels very slow, e.g. 20s before first content
… videos would also take a lot of buffering time in these
circumstances
… the 3G share varies from market to market; e.g. 5% in the US,
but up to 40% in some developing countries
… We have a service that provides continuous estimates of
network quality, covering RTT and bandwidth
… we estimate network quality across all the paths, not
specific to a single web server
… this focuses on the common hop from browser to network
carrier
… [this work got help from lots of folk, esp. Ben, Ilya, Yoav]
… Before looking at the use cases, we need to understand how
browsers load Web pages and why Web page loads are slow on slow
connections
… First, it is very challenging to optimize performance of Web
pages - takes a lot of resources
… Web pages typically load plenty of resources before showing
any content (e.g. css, js, images, ...)
… Not all of these resources are equally important - some have
no UX impact (e.g. tracking, below-the-fold content)
… loading everything in parallel works fine on fast
connections, but on slow connections it slows everything down
… an optimal web page load should keep the network pipe full,
and a lower-priority resource should not slow down a
higher-priority resource
… e.g. loading a below-the-fold image should not slow down
what's needed to show the top of the page
… or a JS-for-ad shouldn't slow the core content of the page
… this means a browser needs to understand the network capacity
to optimize loading of resources
… this is what led to the creation of this network quality
estimation service
… Other uses include so-called "browser interventions", which
are meant to improve the overall quality of the Web by
deviating from standard behavior in specific circumstances
… in our case, e.g. when a network is very slow
… another use case is to feed back to the network stack - e.g.
using network timeouts
… in the future, this could also be used to set an initial
timeout in a smarter way (e.g. higher timeout in poor
connection contexts)
… lots of use cases for the browser vendor - what use would Web
devs make of it?
… We've exposed a subset of these values to the developers: RTT
estimate, a bandwidth estimate, and a rough categorization of
network quality (in 4 values)
… This was released in 2016
… and is being used in around 20% of web pages across all
chrome platforms
… examples of usage:
… the Shaka player (an open source video player) uses the
network quality API to adjust the buffer; Facebook does this as
well
… some developers use it to inform the user that the slow
connection will impact the time needed to complete an action
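[A minimal TypeScript sketch of how a page consumes these
values through the Network Information API; navigator.connection
with effectiveType, rtt and downlink is the API as shipped in
Chromium, and the code feature-detects it since other browsers
may not expose it:]

    // Shape of the Chromium-exposed connection object (subset).
    interface NetInfo extends EventTarget {
      readonly effectiveType: 'slow-2g' | '2g' | '3g' | '4g';
      readonly rtt: number;      // rounded RTT estimate, in ms
      readonly downlink: number; // bandwidth estimate, in Mbit/s
    }

    const conn = (navigator as Navigator & { connection?: NetInfo }).connection;

    if (conn) {
      // e.g. pick low-resolution assets on slow connections
      const slow = conn.effectiveType === 'slow-2g' || conn.effectiveType === '2g';
      console.log(`effectiveType=${conn.effectiveType}`,
                  `rtt=${conn.rtt}ms downlink=${conn.downlink}Mbps`,
                  slow ? '-> use low-res assets' : '');
      // re-evaluate whenever the browser's estimate changes
      conn.addEventListener('change', () => {
        console.log('network estimate changed:', conn.effectiveType);
      });
    }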
… Now looking at the details of the implementation
… The first thing we look at is the kind of connection (e.g.
wifi)
… but that's not enough: there can be slow connections even on
Wifi or 4G
… a challenge in implementing this API is being able to make
it work on all the different platforms which expose very
different sets of APIs
… We also need to make it work on devices as they are, with
often very limited access to the network layer
… Typically, network quality is estimated by sending echo
traffic to a server (e.g. speedtest)
… but this isn't going to work for Chrome: privacy (don't want
to send data to a server without user intent)
… also don't want to maintain a server for this
… we also want to make the measurement available to other
Chromium-based browsers
… so we're using passive estimation
… for RTT, we use 3 sources of information based on the
platform
… the first is the HTTP layer which Chrome controls completely
… the 2nd is the transport layer (TCP) for which some platforms
provide information
… the 3rd is the SPDY/HTTP2 and QUIC/HTTP3 layers
… for HTTP, you measure the RTT as the time difference between
request and response - this is available on all platforms,
completely within the Chrome codebase
… there are limitations: the server processing time is included
in the measurement
… for H2 and QUIC connections, the requests are serialized on
the same TCP or UDP connection, which means the HTTP request can
be queued behind other requests
… which may inflate the measured RTT
… it is mostly useful as an upper bound
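[For illustration, a page can compute a similar sample itself
with the standard Resource Timing API; this is a sketch of the
idea, not Chrome's internal code, and as noted it over-estimates
RTT by the server processing time:]

    // Each sample spans one request/response round trip plus server
    // processing time, so it is only an upper bound on the true RTT.
    const httpRttSamples: number[] = [];
    for (const entry of performance.getEntriesByType('resource')) {
      const t = entry as PerformanceResourceTiming;
      if (t.requestStart > 0 && t.responseStart > 0) {
        httpRttSamples.push(t.responseStart - t.requestStart);
      }
    }
    console.log('HTTP-layer RTT samples (ms):', httpRttSamples);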
… for the TCP layer, we look at all the TCP sockets the browser
has opened, and ask the kernel what RTT it has computed for
these sockets
… then we take a median
… this is less noisy, but it still has its own limitations
… it doesn't take into account packet loss; it doesn't deal
with UDP sockets (e.g. if using QUIC)
… and it's only available on some platforms - we can't do this
on Windows or macOS
… this provides a lower bound RTT estimate
… The 3rd source is the QUIC/HTTP2 Ping
… Servers are expected to respond immediately to HTTP2 PING
… this is available in Chrome, and it removes some of the
limitations we discussed earlier
… but not all servers support QUIC/H2, esp in some countries
… not all servers that support QUIC/H2 support PING despite the
spec requirement
… and it can still be queued behind other packets
… So we have these 3 sources of RTT; for each source we take
all the samples, and we aggregate them with a weighted median
… we give more weight to the recent samples; compared to TCP
which uses weighted average, we use weighted median to
eliminate outliers
… once we have these 3 values, we combine them using heuristics
to a single value
… these heuristics will vary from platform to platform
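[A TypeScript sketch of such a recency-weighted median; the talk
does not give Chrome's exact weighting, so the exponential decay
by sample age below is an assumed stand-in:]

    interface RttSample {
      valueMs: number;     // one RTT observation
      ageSeconds: number;  // how old the observation is
    }

    // Median where each sample's weight decays with age: recent
    // samples dominate, and one-off outliers barely move the result.
    function weightedMedian(samples: RttSample[], halfLife = 60): number {
      const weighted = samples
        .map(s => ({ v: s.valueMs, w: Math.pow(0.5, s.ageSeconds / halfLife) }))
        .sort((a, b) => a.v - b.v);
      const total = weighted.reduce((sum, s) => sum + s.w, 0);
      let acc = 0;
      for (const s of weighted) {
        acc += s.w;
        if (acc >= total / 2) return s.v; // first value covering half the weight
      }
      return NaN; // no samples
    }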
… Is that RTT enough?
… We have found that to estimate the real capacity, we need to
estimate the bandwidth
… there has been a lot of research on this, but none of the
approaches worked well for our use case
… we do not want to probe a server; we want a passive estimate
… What are the challenges in estimating bandwidth? The first
one is that we don't have cooperation from the server-side
… e.g. we don't know what TCP flavor the server is using, we
don't know their packet loss rates
… so we use a simple approach: we measure how many bytes we get
in a given time window with well defined properties (e.g.
>128KB large, 5+ active requests)
… the goal being to ensure the network is not under-utilized
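[A sketch of that windowed measurement; the >128KB and
5-active-requests thresholds are the ones mentioned above, the
rest is illustrative:]

    const MIN_WINDOW_BYTES = 128 * 1024; // only trust windows with >128KB received
    const MIN_ACTIVE_REQUESTS = 5;       // ...and 5+ requests in flight

    // Returns a throughput estimate in kbps, or null when the window
    // does not satisfy the properties that suggest the pipe was full.
    function bandwidthEstimateKbps(bytesReceived: number,
                                   windowSeconds: number,
                                   activeRequests: number): number | null {
      if (bytesReceived < MIN_WINDOW_BYTES || activeRequests < MIN_ACTIVE_REQUESTS) {
        return null; // network may have been under-utilized; discard sample
      }
      return (bytesReceived * 8) / 1000 / windowSeconds; // bits -> kilobits per second
    }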
… with all these estimates, how do we make them adapt quickly
to changing network conditions?
… e.g. entering a parking garage will slow down a 4G connection
… we use the strength of the wireless signals
… we also store information on well-known networks
… To summarize, there are lots of use cases for knowing network
quality - not just for browsers, also for Web developers
… but there are lots of technical challenges in doing that
from the app layer without access to the kernel layer
Piers: (BBC) I heard Yoav mention in the IETF that the netinfo
RTT exposure might go away for privacy reasons
… that was back at the last IETF meeting last year
Tarun: it's not clear if we should expose a continuous
distribution of RTT - a less granular exposure could work
Piers: so this is an ongoing discussion - can you say more
about the privacy concerns?
Tarun: 2 concerns: one is fingerprinting
… we round and add noise to the values to reduce fingerprinting
… another concern is that a lot of Web developers may not know
how to consume continuous values
… simplifying it makes it easier to consume
… we provide this in the Effective Connection Type - which
can be easier to use to e.g. pick which image to load
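[A sketch of both mitigations; the bucket thresholds follow the
table in the WICG Network Information spec, while the rounding
granularity and noise below are assumptions, not Chrome's exact
scheme:]

    type EffectiveType = 'slow-2g' | '2g' | '3g' | '4g';

    // Coarse bucketing by RTT, per the spec's threshold table.
    function toEffectiveType(rttMs: number): EffectiveType {
      if (rttMs >= 2000) return 'slow-2g';
      if (rttMs >= 1400) return '2g';
      if (rttMs >= 270) return '3g';
      return '4g';
    }

    // Round and add noise before exposing the numeric value.
    function exposedRtt(rawRttMs: number): number {
      const noisy = rawRttMs * (1 + (Math.random() - 0.5) * 0.1); // +/-5% (assumed)
      return Math.round(noisy / 25) * 25; // nearest 25 ms (assumed)
    }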
Piers: we have ongoing work on TransportInfo in IETF that is
trying to help with this
Tarun: if the server can identify the network quality and send
it back to the browser, the browser could use it more broadly
<piers> [6]https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md
[6]
https://github.com/bbc/draft-ohanlon-transport-info-header/blob/master/draft-ohanlon-transport-info-header.md
Piers: one of the use cases is adaptive video streaming; could
also be useful for small object transports (which are hard to
estimate in JS)
Tarun: is this mostly for short bursts of traffic?
Piers: it's also for media as well
Tarun: so would the server keep data on typical quality from a
given IP address?
Piers: it would be sent with a response header (e.g. along with
the media)
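[A sketch of the client side of that design, assuming a server
that implements the draft linked above; the Transport-Info
header name follows the draft and may still change:]

    // Reads the server-side network estimate delivered alongside a
    // response (e.g. a media segment). Parsing is left out since the
    // draft's field syntax is still in flux.
    async function serverEstimate(url: string): Promise<string | null> {
      const response = await fetch(url);
      return response.headers.get('Transport-Info');
    }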
DanD: (AT&T) for IETF QUIC, are you considering using the spin
bit that is being specified?
Tarun: we're not using it, and I don't think there are plans to
use it at the moment
… QUIC itself maintains an RTT estimate which we're using
Dom: has there been work around network quality prediction? We
had a presentation from an Intel team on the topic back in Sep
Tarun: not at the moment - we're relying on what the OS
provides
Jonas: what we're doing for network prediction is to use info
coming from the network itself (e.g. load shifting across
cells)
… we use this to do forward-looking prediction
Tarun: the challenge is that this isn't available at the
application layer
… e.g. they wouldn't be exposed through the Android APIs
… an app wouldn't know the tower location - you can know which
carrier it is, but not more than that
… there is also a lot of variation across Android flavors
… the common set is mostly signal strength and carrier
identifier
Sudeep: would it be interesting for the browser to talk to
interfaces exposed by the carrier network (e.g. via MEC)?
… The carrier/operator networks may have more info about the
channel conditions
Tarun: definitely yes
… Android has an API which exposes this information
… but it never took off, and most device manufacturers don't
support it
… there is a way to expose this in Android
… I'm not sure what the practical concerns were, but it never
took off
… it would be super-useful if it was available
Sudeep: you spoke about RTT and bandwidth, which got defined at W3C
… but implementations can vary from one browser to another - is
there any standardization about how these would be measured, or
would this be UA dependent?
Tarun: it's specced as a "best-effort estimate" from the browser,
so it's mostly up to the browser
… right now it's only available in Chromium-based browsers
… even Chromium-based implementations will vary from platform
to platform
Dom: can you say more about the fact that it is not available
in other browsers?
Tarun: I think it's a question of priority - we have a lot of
users in developing markets which helped drive some of the
priority for us
Song: (China Mobile) I'm interested in the accuracy of the
network quality monitoring
… you mention aggregating data from 3 sources: HTTP, TCP and
QUIC
… are the weights for these 3 sources fixed, or do they vary
based on the scenario?
Tarun: it's very hard to measure accuracy
… in lab studies (with controlled network conditions), the
algorithm's accuracy is quite good
… we also do A/B studies, but it's hard given we don't really
know the ground truth
… so we measure the behavior of the consumer of the API, e.g.
on the overall page load performance
… we've seen 10-15% improvements when tuning the algorithm the
right way
Song: when you measure the data from these 3 sources, are they
exposed to Web devs, or only the aggregated value?
… is there any chance to make the raw source data available to
Web developers?
Tarun: we only provide aggregated values
Piers: how often do you update the value?
Tarun: internally, every time we send or receive a packet
… we throttle it on the Web API - when the values have changed
by more than 10%
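[A sketch of that throttling rule; the 10% figure is from the
talk, the surrounding plumbing is illustrative:]

    // Internal samples arrive on every packet; the web-exposed value
    // (and its `change` event) only updates when the new estimate
    // differs from the last reported one by more than the threshold.
    function makeThrottledReporter(report: (v: number) => void,
                                   threshold = 0.10) {
      let lastReported: number | null = null;
      return (estimate: number): void => {
        const changedEnough = lastReported === null ||
          Math.abs(estimate - lastReported) / lastReported > threshold;
        if (changedEnough) {
          lastReported = estimate;
          report(estimate);
        }
      };
    }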
Piers: that's a pretty large margin for adaptation
Tarun: most of the developers don't care about very precise
estimates
… it's pretty hard to write pages that take into account that
kind of continuous change
Piers: for media, more details are useful
Tarun: even then, you usually only have 2 or 3 resolutions to
adapt to
Piers: but the timing of the adaptation might be sensitive
Piers: Any plans to provide more network info?
Tarun: no other plans as of now
… we're open to it if there are other useful bits to expose
Sudeep: that's one of the topics the group is aiming to build
on
… are there other APIs in this space that you think would be
useful to Web developers?
Tarun: I think most developers care about a few different values
… it's not clear they would use very detailed info
… another challenge we see is around caching (e.g. different
network resources for different network quality)
… you might end up loading new resources because the network
quality changed, which is counterproductive if the connection is
slow
… In general, server-side estimates are likely more accurate
Sudeep: Thank you Tarun for a very good presentation!
… Going forward, we want to look at how these APIs can and need
to be improved based on Web developers' needs
… we'll follow up with a discussion
… Next week we have a presentation by Michael McCool on Edge
computing - how to offload computing from a browser to the edge
using Web Workers et al
… call info will be sent to the list