Re: New Version Notification for draft-kazuho-httpbis-selftrace-00.txt from Robin MARX on 2021-08-13 (ietf-http-wg@w3.org from July to September 2021)

From: Robin MARX <robin.marx@uhasselt.be>
Date: Fri, 13 Aug 2021 18:28:05 +0200
To: Kazuho Oku <kazuhooku@gmail.com>
Cc: IETF QUIC WG <quic@ietf.org>, HTTP Working Group <ietf-http-wg@w3.org>, Jana Iyengar <jri.ietf@gmail.com>
Message-ID: <CAC7UV9aVnrUfvLuMB6dFSqiVzyr5PNF_xc+nRiZve35R3xqyrw@mail.gmail.com>

Hello Kazuho,

Thanks a lot for writing this up and sharing it.
This concept has been part of the qlog docs since the very start [1], but
has simultaneously been something that no-one has implemented yet, so it's
good to see a concurrent proposal and an actual POC that seems to work
quite nicely!
I think this capability will be key to letting people diagnose QUIC in
the wild, as we've seen just this week with the problems Wix had in
properly identifying H3 vs H2 gains on their Fastly deployment [2].

A couple of questions/remarks:

1) The trace stream currently starts from when the request for the
well-known URL is received, thus missing all events before that (e.g.,
handshake details).
    I understand this is a tradeoff (you don't want to keep full traces for
all connections just in case they are requested), but I feel we might want
a way to indicate at the very start (e.g., via a transport parameter) that
a trace will be requested, so that the earlier events can be retained and
fetched as well?

2) There are downsides to loading the trace over the same connection, some
of which you note in the text and in the email (overhead for the "real"
connection). The qlog text allows fetching of traces of connection A over
connection B by CID, but that of course has other tradeoffs.
    I feel your method is probably best, IF we can find a solution for
point 1). E.g., if you don't want to impact the real connection, you
request the trace at the start, let the connection run, and then fetch the
full trace at the end instead of streaming it while the connection runs.
    This does introduce some extra (resource exhaustion) attack vectors
that need mitigations etc.
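
    To make the resource exhaustion concern in points 1) and 2) a bit more
concrete, here is a minimal sketch in Python of one possible mitigation: a
per-connection buffer with a hard byte cap. Everything in it
(BoundedTraceBuffer, max_bytes, the drop policy) is hypothetical and is not
taken from the draft or from the h2o POC. It assumes events arrive already
serialized as bytes, and it keeps the earliest events (so the handshake
survives) and drops later ones once the cap is reached.

```python
class BoundedTraceBuffer:
    """Hypothetical per-connection trace buffer with a hard byte cap."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.events = []   # serialized events, oldest first
        self.size = 0      # total bytes currently buffered
        self.dropped = 0   # events discarded because they did not fit

    def record(self, event):
        """Store one serialized event (bytes) if it still fits under the cap."""
        if self.size + len(event) > self.max_bytes:
            self.dropped += 1
            return False
        self.events.append(event)
        self.size += len(event)
        return True

    def drain(self):
        """Return all buffered events and reset the buffer, e.g. when the
        client finally fetches the trace at the end of the connection."""
        out, self.events, self.size = self.events, [], 0
        return out


buf = BoundedTraceBuffer(max_bytes=10)
buf.record(b"12345")      # fits
buf.record(b"1234567")    # would exceed the cap, so it is dropped
buf.record(b"abc")        # still fits
```

    Which events to drop is a policy choice: evicting the oldest events
instead would keep the most recent state but lose the handshake details.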

3) Relatedly, your POC seems to assume the browser will have just a single
connection and a trace request in a second tab will auto map to that
connection.
    That works fine for the POC, but for a real deployment that lets
end-users fetch traces, you'd need built-in browser/client support to
select a specific connection / fetch traces for all connections to a given
origin.
    Not a big problem of course, and things like Chrome's netlog export
already do this, but still a practical hurdle.

4) Any reason in particular you're not streaming qlog? I assume it's
because you don't log qlog directly but instead use a converter and that
converter adds too much overhead to do on the fly?
    Not that it really matters or that I feel we should limit to qlog, but
it does bring up the question of how an automated client setup (e.g., via
WebPageTest-alike tooling) would identify what the server sends back.
    You of course already know this because you made the qlog issue for it
[3], but good to bring it up on the list as well.
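
    As an illustration of the identification question in point 4), a
minimal sketch of how an automated client might guess the trace format. It
is an assumption, not something the draft defines: it treats the media
types "application/qlog+json" and "application/qlog+json-seq" as the qlog
signal, and otherwise falls back to checking the first JSON record for a
top-level "qlog_version" field. The function name identify_trace_format is
hypothetical.

```python
import json

# Assumed media types for qlog; the self-trace draft does not mandate these.
QLOG_MEDIA_TYPES = {"application/qlog+json", "application/qlog+json-seq"}


def identify_trace_format(content_type, first_record):
    """Return "qlog", "custom-json", or "unknown" for a trace response.

    content_type: value of the Content-Type response header.
    first_record: first line/record of the response body, as text.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in QLOG_MEDIA_TYPES:
        return "qlog"
    try:
        # JSON text sequences prefix each record with a 0x1E separator.
        obj = json.loads(first_record.lstrip("\x1e"))
    except ValueError:
        return "unknown"
    if isinstance(obj, dict) and "qlog_version" in obj:
        return "qlog"
    return "custom-json"
```

    Sniffing the body like this is fragile, which is part of why an
explicit signal from the server seems preferable.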

5) I wonder if this should be a completely separate document, a separate
document that is part of the qlog effort, or part of the qlog documents
themselves.
    As said in 4), I don't feel this necessarily should be limited to qlog,
but a lot of the privacy issues and mitigations inherent to exposing logs
will be discussed for qlog and should probably be referenced for this
approach as well.
    Currently, you seem to skirt some of this by saying it doesn't matter
because the client is the one requesting the logs, but I don't quite agree
that's enough.
    This could be used to ask end-users to capture a trace of a problematic
connection and upload it for analysis. If the end-user isn't very
technical, they might not know which info they're exposing. Even if they
are, they probably don't want to go through the trouble of sanitizing the
logs themselves.
    Put differently: this should probably either be restricted to expose no
privacy-sensitive info at all (or at least discuss the issues) or allow
explicit selection of a "privacy level" (the approach we'll probably take
with qlog is to define multiple levels of obfuscation/omission for
different use cases).
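
    A rough sketch of what selecting a "privacy level" could look like on
the server side. The level names, the field names, and the mapping between
them are all hypothetical; neither qlog nor the self-trace draft defines
them at this point. The idea shown is only that each sensitive field is
tagged with the lowest level at which it is omitted, and the server filters
every event before sending it.

```python
from enum import IntEnum


class PrivacyLevel(IntEnum):
    """Hypothetical levels; higher values omit more information."""
    FULL = 0         # nothing removed
    NO_PAYLOAD = 1   # raw payload bytes removed
    ANONYMIZED = 2   # addresses and connection IDs removed as well


# Hypothetical field names, mapped to the lowest level that omits them.
REMOVED_AT = {
    "raw_payload": PrivacyLevel.NO_PAYLOAD,
    "client_ip": PrivacyLevel.ANONYMIZED,
    "connection_id": PrivacyLevel.ANONYMIZED,
}


def sanitize_event(event, level):
    """Return a copy of the event without the fields omitted at this level."""
    return {
        key: value
        for key, value in event.items()
        if key not in REMOVED_AT or level < REMOVED_AT[key]
    }


event = {"name": "packet_sent", "raw_payload": "ab", "client_ip": "192.0.2.1"}
full = sanitize_event(event, PrivacyLevel.FULL)
no_payload = sanitize_event(event, PrivacyLevel.NO_PAYLOAD)
anonymized = sanitize_event(event, PrivacyLevel.ANONYMIZED)
```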

I am very excited by the proposal and would love for some large deployments
to offer this service.
I feel that wouldn't just be revolutionary for many academic efforts, but
would also enable better client-aided debugging and allow users to assess
bottlenecks in their setups.

With best regards,
Robin

[1]:
https://datatracker.ietf.org/doc/html/draft-ietf-quic-qlog-main-schema-00#section-7.2
[2]: https://twitter.com/alonkochba/status/1424403252284694528?s=20
[3]: https://github.com/quicwg/qlog/issues/158


On Fri, 13 Aug 2021 at 08:15, Kazuho Oku <kazuhooku@gmail.com> wrote:

> Hello folks,
>
> Today Jana and I have submitted a tiny I-D called
> draft-kazuho-httpbis-selftrace.
>
> The draft specifies a well-known URI to be used for providing a trace of a
> particular HTTP/3 connection (e.g., qlog) on that same HTTP/3 connection.
>
> One of the biggest hurdles in analyzing HTTP/3 performance issues is
> obtaining traces that show the symptoms. That is because clients being
> affected by issues have to coordinate with the server operators to collect
> the traces.
>
> This PR solves the problem by defining a well-known URI for serving a
> trace to the client on the HTTP connection that the client is using. When a
> user sees an issue, they can collect the traces themselves and provide it
> to the server operator.
>
> We have already implemented the feature in h2o, and doing so was easy,
> assuming that the underlying QUIC stack already defines callbacks for
> collecting trace events, see lib/handler/self_trace.c of
> https://github.com/h2o/h2o/pull/2765.
>
> We also have a public endpoint; to try it out, first open
> https://ora1.kazuhooku.com/test/self-trace/video-only.html (which starts
> streaming a video), then open
> https://ora1.kazuhooku.com/.well-known/self-trace. While the video is
> being served, you would see the trace flowing through the well-known URI.
>
> At the moment, we are using a custom JSON format for the trace, but when
> gzip compression is applied on-the-fly, the overhead of sending a trace
> alongside ordinary HTTP responses is less than 10%. Therefore, we tend to
> believe that this approach would work well in practice.
>
> Please let us know what you think - your feedback is very welcome.
>
> ---------- Forwarded message ---------
> From: <internet-drafts@ietf.org>
> Date: Fri, 13 Aug 2021 at 14:53
> Subject: New Version Notification for draft-kazuho-httpbis-selftrace-00.txt
> To: Jana Iyengar <jri.ietf@gmail.com>, Kazuho Oku <kazuhooku@gmail.com>
>
>
>
> A new version of I-D, draft-kazuho-httpbis-selftrace-00.txt
> has been successfully submitted by Kazuho Oku and posted to the
> IETF repository.
>
> Name:           draft-kazuho-httpbis-selftrace
> Revision:       00
> Title:          Self-Tracing for HTTP
> Document date:  2021-08-13
> Group:          Individual Submission
> Pages:          5
> URL:
> https://www.ietf.org/archive/id/draft-kazuho-httpbis-selftrace-00.txt
> Status:
> https://datatracker.ietf.org/doc/draft-kazuho-httpbis-selftrace/
> Htmlized:
> https://datatracker.ietf.org/doc/html/draft-kazuho-httpbis-selftrace
>
>
> Abstract:
>    This document registers a "Well-Known URI" for exposing state of an
>    HTTP connection to the peer using formats such as qlog schema [QLOG].
>
>
>
>
> The IETF Secretariat
>
>
>
>
> --
> Kazuho Oku
>


-- 

dr. Robin Marx
Postdoc researcher - Web protocols
Expertise centre for Digital Media

*Cellphone *+32(0)497 72 86 94

www.uhasselt.be
Universiteit Hasselt - Campus Diepenbeek
Agoralaan Gebouw D - B-3590 Diepenbeek
Kantoor EDM-2.05
Received on Friday, 13 August 2021 16:28:31 UTC