- From: Kazuho Oku <kazuhooku@gmail.com>
- Date: Sun, 15 Aug 2021 15:20:16 +0900
- To: Robin MARX <robin.marx@uhasselt.be>
- Cc: IETF QUIC WG <quic@ietf.org>, HTTP Working Group <ietf-http-wg@w3.org>, Jana Iyengar <jri.ietf@gmail.com>
- Message-ID: <CANatvzx_O_38nU3wyD6UCtFRfBSarT4=NO45yOQMbSOe0oCK=g@mail.gmail.com>
Hello Robin,

Thank you for your comments. My responses are inline.

On Sat, 14 Aug 2021 at 1:28, Robin MARX <robin.marx@uhasselt.be> wrote:

> Hello Kazuho,
>
> Thanks a lot for writing this up and sharing it.
> This concept has been part of the qlog docs since the very start [1], but
> has simultaneously been something that no-one has implemented yet, so it's
> good to see a concurrent proposal and an actual POC that seems to work
> quite nicely!
> I think this capability will be core to allow people to diagnose QUIC in
> the wild, as we've seen just this week with some of the problems Wix had to
> properly identify H3 vs H2 gains on their Fastly deployment [2].
>
> A couple of questions/remarks:
>
> 1) The trace stream currently starts from when the request for the
> well-known URL is received, thus missing all events before that (e.g.,
> handshake details).
> I understand this is a tradeoff (you don't want to keep full traces
> for all connections just in case they are requested), but I feel we might
> want a way to indicate at the very start (e.g., transport parameter) that a
> trace will be requested so we can request that info as well?
>

That's a good point.

While our PoC starts collecting the trace from the moment when the server
receives the request to the well-known URI, I do not think we are tied to
doing that. And I think that we probably do not need a transport
parameter (TP).

A server can allocate a fixed-size buffer for each connection, retaining
the first N events that occurred for that connection. When it receives a
request for the well-known URI, it could first send the events recorded so
far and then send the events that follow.

Recording a few events during the startup of a connection is probably
alright, as the cost of logging would be negligible compared to that of the
TLS handshake.

> 2) There are downsides to loading the trace over the same connection, some
> of which you note in the text and in the email (overhead for the "real"
> connection).
> The qlog text allows fetching of traces of connection A over
> connection B by CID, but that of course has other tradeoffs.
> I feel your method is probably best, IF we can find a solution for
> point 1). e.g., if you don't want to impact real connection, you request
> trace at start, let the connection run, and then fetch full trace at the
> end instead of streaming during.
> This does introduce some extra (resource exhaustion) attack vectors
> that need mitigations etc.
>
> 3) Relatedly, your POC seems to assume the browser will have just a single
> connection and a trace request in a second tab will auto map to that
> connection.
> That works fine for the POC, but for a real deployment that lets
> end-users fetch traces, you'd need built-in browser/client support to
> select a specific connection / fetch traces for all connections to a given
> origin.
> Not a big problem ofc, and things like Chrome's netlog export already
> do this, but still a practical hurdle.
>

Right. I would hope that it would be possible to implement this as a
browser extension at least (with the assumption being that requests from a
browser extension would be coalesced with other requests going to the same
authority).

> 4) Any reason in particular you're not streaming qlog? I assume it's
> because you don't log qlog directly but instead use a converter and that
> converter adds too much overhead to do on the fly?
> Not that it really matters or that I feel we should limit to qlog, but
> it does bring up the question of how an automated client setup (e.g., via
> WebPageTest-alike tooling) would identify what the server sends back.
> You of course already know this because you made the qlog issue for it
> [3], but good to bring it up on the list as well.
>

I would argue that there are differences between tracing a program and an
interchange format being used for analysing transport issues. What we emit
is the trace of h2o, that *can* be converted to qlog.
To give an example, we might have a call graph of functions like this:

  quic::on_quic_ack // processing of a QUIC ACK frame
  -> quic::on_quic_ack_one_pn // processing of a particular packet number being acked
  -> h3::on_buffer_shift // some bytes are removed from the send buffer
  -> proxy::on_upstream_unblock // the proxy is unblocked from reading more data from upstream
  -> proxy::read_data // proxy reads data from upstream, queued in the receive buffer
  -> h3::notify_data_ready // h3 layer is notified that there is more data to be sent
  -> quic::notify_data_ready // QUIC stack is notified that there is more data to be sent

and for the purpose of analysis, we want to emit traces that preserve these
kinds of call graphs, or to paraphrase, log events in the order in which
they happen, regardless of where they happened.

In this example, H3- and proxy-level events can happen for each PN being
ACKed. However, I do not recall if it was possible with qlog to emit an H3
event while processing an ACK frame.

I could well be wrong about how we could use qlog, but regardless, the
broader point is that we do not want our tracing capabilities to be
constrained by the limits of qlog. Emitting traces our own way preserves
the most information, with minimal effort. If necessary, we can
post-process the traces to qlog format to use the tools developed by the
community.

> 5) I wonder if this should be a completely separate document, a separate
> document part of the qlog effort, or part of the qlog documents.
> As said in 4), I don't feel this necessarily should be limited to
> qlog, but a lot of the privacy issues+mitigations inherent to exposing logs
> will be discussed for qlog and should probably be referenced for this
> approach as well.
> Currently, you seem to skirt some of this by saying it doesn't matter
> because the client is the one requesting the logs, but I don't quite agree
> that's enough.
> This could be used to ask end-users to capture a trace of a
> problematic connection and upload it for analysis. If the end-user isn't
> very technical, they might not know which info they're exposing. Even if
> they are, they probably don't want to go through the trouble of sanitizing
> the logs themselves.
> Put differently: this should probably either be restricted to expose
> no privacy-sensitive info at all (or at least discuss the issues) or allow
> explicit selection of a "privacy level" (the approach we'll probably take
> with qlog is to define multiple levels of obfuscation/omission for
> different use cases).
>
> I am very excited by the proposal and would love for some large
> deployments to offer this service.
> I feel that wouldn't just be revolutionary to many academic efforts, but
> also enable better client-aided debugging and to allow users to assess
> bottlenecks in their setups.
>
> With best regards,
> Robin
>
> [1]:
> https://datatracker.ietf.org/doc/html/draft-ietf-quic-qlog-main-schema-00#section-7.2
> [2]: https://twitter.com/alonkochba/status/1424403252284694528?s=20
> [3]: https://github.com/quicwg/qlog/issues/158
>
>
> On Fri, 13 Aug 2021 at 08:15, Kazuho Oku <kazuhooku@gmail.com> wrote:
>
>> Hello folks,
>>
>> Today Jana and I have submitted a tiny I-D called
>> draft-kazuho-httpbis-selftrace.
>>
>> The draft specifies a well-known URI to be used for providing a trace of
>> a particular HTTP/3 connection (e.g., qlog) on that same HTTP/3 connection.
>>
>> One of the biggest hurdles in analyzing HTTP/3 performance issues is
>> obtaining traces that show the symptoms. That is because clients being
>> affected by issues have to coordinate with the server operators to collect
>> the traces.
>>
>> This PR solves the problem by defining a well-known URI for serving a
>> trace to the client on the HTTP connection that the client is using.
>> When a user sees an issue, they can collect the traces themselves and
>> provide them to the server operator.
>>
>> We have already implemented the feature in h2o, and doing so was easy,
>> assuming that the underlying QUIC stack already defines callbacks for
>> collecting trace events, see lib/handler/self_trace.c of
>> https://github.com/h2o/h2o/pull/2765.
>>
>> We also have a public endpoint; to try it out, first open
>> https://ora1.kazuhooku.com/test/self-trace/video-only.html (which starts
>> streaming a video), then open
>> https://ora1.kazuhooku.com/.well-known/self-trace. While the video is
>> being served, you would see the trace flowing through the well-known URI.
>>
>> At the moment, we are using a custom JSON format for the trace, but when
>> gzip compression is applied on-the-fly, the overhead of sending a trace
>> alongside ordinary HTTP responses is less than 10%. Therefore, we tend to
>> believe that this approach would work well in practice.
>>
>> Please let us know what you think - your feedback is very welcome.
>>
>> ---------- Forwarded message ---------
>> From: <internet-drafts@ietf.org>
>> Date: Fri, 13 Aug 2021 14:53
>> Subject: New Version Notification for
>> draft-kazuho-httpbis-selftrace-00.txt
>> To: Jana Iyengar <jri.ietf@gmail.com>, Kazuho Oku <kazuhooku@gmail.com>
>>
>>
>> A new version of I-D, draft-kazuho-httpbis-selftrace-00.txt
>> has been successfully submitted by Kazuho Oku and posted to the
>> IETF repository.
>>
>> Name: draft-kazuho-httpbis-selftrace
>> Revision: 00
>> Title: Self-Tracing for HTTP
>> Document date: 2021-08-13
>> Group: Individual Submission
>> Pages: 5
>> URL:
>> https://www.ietf.org/archive/id/draft-kazuho-httpbis-selftrace-00.txt
>> Status:
>> https://datatracker.ietf.org/doc/draft-kazuho-httpbis-selftrace/
>> Htmlized:
>> https://datatracker.ietf.org/doc/html/draft-kazuho-httpbis-selftrace
>>
>>
>> Abstract:
>> This document registers a "Well-Known URI" for exposing state of an
>> HTTP connection to the peer using formats such as qlog schema [QLOG].
>>
>>
>> The IETF Secretariat
>>
>>
>> --
>> Kazuho Oku
>>
>
>
> --
>
> dr. Robin Marx
> Postdoc researcher - Web protocols
> Expertise centre for Digital Media
>
> *Cellphone *+32(0)497 72 86 94
>
> www.uhasselt.be
> Universiteit Hasselt - Campus Diepenbeek
> Agoralaan Gebouw D - B-3590 Diepenbeek
> Kantoor EDM-2.05
>
>
>

--
Kazuho Oku
Received on Sunday, 15 August 2021 06:21:43 UTC
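
The reply to point 1 above describes a server that keeps a fixed-size
buffer holding the first N events of each connection and, once the
well-known URI is requested, sends those events followed by the live ones.
A minimal Python sketch of that idea is given here. It is only an
illustration under stated assumptions: the class name ConnectionTrace, its
methods, and the event strings are hypothetical and are not taken from h2o
or from the draft.

```python
from typing import Callable, List, Optional


class ConnectionTrace:
    """Hypothetical per-connection recorder (not h2o's implementation).

    Retains at most `capacity` events from the start of the connection.
    Once a sink is attached (i.e. the trace has been requested), the
    retained events are flushed and later events are forwarded directly.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.startup: List[str] = []
        self.sink: Optional[Callable[[str], None]] = None

    def record(self, event: str) -> None:
        if self.sink is not None:
            # Trace already requested: stream the event as it happens.
            self.sink(event)
        elif len(self.startup) < self.capacity:
            # Still within the first N events: retain for a later request.
            self.startup.append(event)
        # Otherwise the buffer is full and nobody is listening: drop it.

    def attach(self, sink: Callable[[str], None]) -> None:
        # Called when the request for the well-known URI arrives.
        for event in self.startup:
            sink(event)
        self.startup = []
        self.sink = sink


# Usage: four events occur before the trace is requested, with N = 3.
trace = ConnectionTrace(capacity=3)
for name in ["initial_sent", "handshake_done", "stream_open", "ack_received"]:
    trace.record(name)

received: List[str] = []
trace.attach(received.append)
trace.record("trace_requested")
print(received)
```

In this sketch the fourth pre-request event is dropped because the buffer
is already full, so the client sees the three startup events followed by
everything recorded after the request. Memory per connection stays bounded
by N, which is the property the reply relies on.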