- From: Yoav Weiss <yoav@yoav.ws>
- Date: Thu, 18 Jun 2020 12:47:30 +0200
- To: Benjamin Kaduk <kaduk@mit.edu>
- Cc: The IESG <iesg@ietf.org>, draft-ietf-httpbis-client-hints@ietf.org, httpbis-chairs@ietf.org, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>, Mark Nottingham <mnot@mnot.net>
- Message-ID: <CACj=BEjaVypTya2o-3Hb8r=iwxu9PDRiLmXeHLDFkoMdQobGZw@mail.gmail.com>
On Wed, Jun 17, 2020 at 8:41 PM Benjamin Kaduk <kaduk@mit.edu> wrote:

> On Wed, Jun 17, 2020 at 10:47:34AM +0200, Yoav Weiss wrote:
> > Thanks for reviewing and apologies for the delayed reply :/
> >
> > Comments addressed below and incorporated into
> > https://github.com/httpwg/http-extensions/pull/1220
> > Your review would be appreciated :)
> >
> > On Tue, May 19, 2020 at 10:56 PM Benjamin Kaduk via Datatracker
> > <noreply@ietf.org> wrote:
> >
> > > Benjamin Kaduk has entered the following ballot position for
> > > draft-ietf-httpbis-client-hints-14: No Objection
> > >
> > > When responding, please keep the subject line intact and reply to all
> > > email addresses included in the To and CC lines. (Feel free to cut this
> > > introductory paragraph, however.)
> > >
> > > Please refer to
> > > https://www.ietf.org/iesg/statement/discuss-criteria.html
> > > for more information about IESG DISCUSS and COMMENT positions.
> > >
> > > The document, along with other ballot positions, can be found here:
> > > https://datatracker.ietf.org/doc/draft-ietf-httpbis-client-hints/
> > >
> > > ----------------------------------------------------------------------
> > > COMMENT:
> > > ----------------------------------------------------------------------
> > >
> > > Section 1
> > >
> > >    There are thousands of different devices accessing the web, each with
> > >    different device capabilities and preference information. These
> > >    device capabilities include hardware and software characteristics, as
> > >    well as dynamic user and user agent preferences. Historically,
> > >
> > > nit: should "user-agent" be hyphenated?
> >
> > In web specifications it typically isn't
> > <https://infra.spec.whatwg.org/#user-agent>. RFC 7231
> > <https://tools.ietf.org/html/rfc7231> also doesn't seem to hyphenate it.
>
> [I guess I should have mentioned "compound adjective" here rather than
> below, whoops.]
>
> > >    applications that wanted to allow the server to optimize content
> > >    delivery and user experience based on such capabilities had to rely
> > >    on passive identification (e.g., by matching the User-Agent header
> > >
> > > nit: it feels like "allow the server" would be something that involves
> > > granting permission or the client sending an active signal (as proposed
> > > by this document), as opposed to just the application that "wanted the
> > > server to optimize" and had to make do with such limited signal as was
> > > already available.
> >
> > OK. Removing "allow the".
> >
> > >    field (Section 5.5.3 of [RFC7231]) against an established database of
> > >    user agent signatures), use HTTP cookies [RFC6265] and URL
> > >
> > > nit: hyphenate user-agent again, used as an adjective.
> >
> > TIL: compound adjective
> > <https://www.grammarbook.com/punctuation/hyphens.asp#:~:text=Rule%201.,is%20called%20a%20compound%20adjective.&text=When%20a%20compound%20adjective%20follows,hyphen%20is%20usually%20not%20necessary.>
> > Done!
> >
> > >    o  User agent detection cannot reliably identify all static
> > >       variables, cannot infer dynamic user agent preferences, requires
> > >       external device database, is not cache friendly, and is reliant on
> > >
> > > nit: singular/plural mismatch ("an external device database" or
> > > "external device databases")
> >
> > Done
> >
> > >    o  Cookie-based approaches are not portable across applications and
> > >       servers, impose additional client-side latency by requiring
> > >       JavaScript execution, and are not cache friendly.
> > >
> > > (I think I missed a step in why a cookie-based approach inherently
> > > requires javascript execution, though maybe it doesn't matter.)
> >
> > Essentially, if you want to dynamically set your cookies based on
> > client-side information, you need javascript to do that.
>
> Ah, I think I am starting to see, now.
> I had in my head a more simplistic
> model where "user-agent sends a bunch of headers to the server, and the
> server puts the result of its analysis in a cookie", which doesn't really
> stand up to detailed scrutiny.
>
> > >    Proactive content negotiation (Section 3.4.1 of [RFC7231]) offers an
> > >    alternative approach; user agents use specified, well-defined request
> > >    headers to advertise their capabilities and characteristics, so that
> > >
> > > Chasing the reference, it's not clear that it supports quite this strong
> > > of a statement: in addition to the explicit negotiation fields, it also
> > > allows using implicit characteristics such as client IP address and
> > > User-Agent.
> >
> > Would ending that section with the following work?
> > ", so that servers can select (or formulate) an appropriate response, based
> > on those request headers (or on other, implicit characteristics)."
>
> Yes, that would help, thanks.
>
> > > Section 2.1
> > >
> > >    access of third parties to those same header fields. Without such an
> > >    opt-in, user agents SHOULD NOT send high-entropy hints, but MAY send
> > >    low-entropy ones [CLIENT-HINTS-INFRASTRUCTURE].
> > >
> > > It looks like the reference only defines a registry for low-entropy
> > > hints, and we are inferring that any hints not listed in that table are
> > > to be treated as "high-entropy". Perhaps we could reword both
> > > directions of this directive to refer only to the registry of
> > > low-entropy hints (e.g., "SHOULD NOT send hints that are not listed in
> > > [registry]")?
> >
> > Makes sense.
> >
> > >    Implementers need to be aware of the passive fingerprinting
> > >    implications when implementing support for Client Hints, and follow
> > >    the considerations outlined in the Security Considerations
> > >    (Section 4) section of this document.
> > >
> > > side note: in some sense the Accept-CH mechanism transforms it from a
> > > passive to an active fingerprinting mechanism.
> >
> > Good point! Removed "passive" here.
> >
> > > Section 2.2
> > >
> > >    information in them. When doing so, and if the resource is
> > >    cacheable, the server MUST also generate a Vary response header field
> > >    (Section 7.1.4 of [RFC7231]) to indicate which hints can affect the
> > >    selected response and whether the selected response is appropriate
> > >    for a later request.
> > >
> > > side note: I suspect the answer I want is already present with a
> > > detailed reading of RFC 7231, but I wonder if it's worth saying
> > > something here about whether the Vary response header could/should
> > > include registered client hint header field names that were not present
> > > in the request in question.
> >
> > https://tools.ietf.org/html/rfc7231#section-7.1.4 implies that Vary can be
> > set to header names that are missing from the request. ("or lack thereof")
> > I'm not sure we should mention that explicitly here.
>
> Ah, thanks.
>
> > > Section 3.1
> > >
> > >    Based on the Accept-CH example above, which is received in response
> > >    to a user agent navigating to "https://example.com", and delivered
> > >    over a secure transport, a user agent will have to persist an
> > >    Accept-CH preference bound to "https://example.com". It will then
> > >    use it
> > >
> > > What level of requirement is implied by "will have to" here? IIUC, it's
> > > just that "if anything is persisted, it must be keyed on" but with no
> > > obligation to do any persistence. If so, perhaps a wording like "any
> > > persisted Accept-CH preference will be bound to" would be better?
> >
> > The normative requirement in the paragraph above it is SHOULD.
> > I'll modify the wording to your suggested one.
> >
> > >    for navigations to e.g. "https://example.com/foobar.html", but not to
> > >    e.g. "https://foobar.example.com/". It will similarly use the
> > >    preference for any same-origin resource requests (e.g. to
> > >
> > > nit: comma after "e.g." (throughout).
> >
> > OK
> >
> > >    "https://example.com/image.jpg") initiated by the page constructed
> > >    from the navigation's response, but not to cross-origin resource
> > >    requests (e.g. "https://thirdparty.com/resource.js"). This
> > >    preference will not extend to resource requests initiated to
> > >    "https://example.com" from other origins (e.g. from navigations to
> > >    "https://other-example.com/").
> > >
> > > Perhaps thirdparty.example and other.example, to stay within the BCP32
> > > space?
> >
> > Done
> >
> > > Section 3.2
> > >
> > >    When selecting a response based on one or more Client Hints, and if
> > >    the resource is cacheable, the server needs to generate a Vary
> > >    response header field ([RFC7234]) to indicate which hints can affect
> > >    the selected response and whether the selected response is
> > >    appropriate for a later request.
> > >
> > > Is BCP 14 language appropriate here?
> >
> > Indeed. Changed to SHOULD.
> >
> > >    Above example indicates that the cache key needs to include the
> > >    Sec-CH-Example header field.
> > >
> > > nit: please add the article "the" to make this a complete sentence.
> >
> > Yup
> >
> > > Section 4
> > >
> > > While I don't expect that I can tell the major browser vendors anything
> > > new about the privacy considerations to client hints, I do think that we
> > > should give some guidance to implementors of other HTTP clients, who may
> > > not have such extensive depth of knowledge, on the general landscape in
> > > which this mechanism is set. The subsections hereof do a great job
> > > covering a lot of relevant details and specific factors to consider;
> > > thank you!
> > > I think it may also be appropriate to have some more generic
> > > lead-in text, noting that in the worst case, merely converting a passive
> > > fingerprinting mechanism to an active fingerprinting mechanism with
> > > server opt-in does not actually provide any privacy benefit (the worst
> > > case being when all servers ask for all the data and clients accede)!
> > > While we might hope that the need to jump through an extra hoop to
> > > access fingerprinting information might dissuade some servers from
> > > asking for it, it seems imprudent to assume that it will happen, so in
> > > order to obtain real privacy benefit there need to be some additional
> > > policy controls in the client and in what hints are defined/implemented.
> > > As I mentioned already, we already have a lot of the details for how to
> > > apply such policy controls, and limitations to only define hints that
> > > expose information already available in other means; what I'd like to
> > > see is the high-level picture that ties them together.
> >
> > OK. Added something. I'd appreciate your review to see if it matches what
> > you had in mind.
> >
> > > Section 4.1
> > >
> > >    upon it. The header-based opt-in means that we can remove passive
> > >    fingerprinting vectors, such as the User-Agent string (enabling
> > >    active access to that information through User-Agent Client Hints
> > >    [4]), or otherwise expose information already available through
> > >
> > > I think this [4] is the same as [UA-CH].
> >
> > It's pointing to a specific section of UA-CH. I'm not sure if this is
> > critical.
>
> I'm not, either; let's leave it to the RFC Editor.
>
> > > Also, use of the first person ("we") is somewhat unusual in RFC style.
> >
> > Changed.
> >
> > >    Therefore, features relying on this document to define Client Hint
> > >    headers MUST NOT provide new information that is otherwise not
> > >    available to the application via other means, such as existing
> > >    request headers, HTML, CSS, or JavaScript.
> > >
> > > As written, this is a fairly weird condition. What constitutes
> > > "available to the application via other means"? Does "put up an
> > > interstitial until the user provides the information in question" count?
> >
> > Changed to "not made available to the application by the user agent"
> >
> > >    o  Entropy - Exposing highly granular data can be used to help
> > >       identify users across multiple requests to different origins.
> > >       Reducing the set of header field values that can be expressed, or
> > >       restricting them to an enumerated range where the advertised value
> > >       is close but is not an exact representation of the current value,
> > >
> > > nit: "close to" seems like it would scan better.
> >
> > Yup
> >
> > >    Different features will be positioned in different points in the
> > >    space between low-entropy, non-sensitive and static information (e.g.
> > >    user agent information), and high-entropy, sensitive and dynamic
> > >    information (e.g. geolocation). User agents need to consider the
> > >    value provided by a particular feature vs these considerations, and
> > >    MAY have different policies regarding that tradeoff on a per-feature
> > >    basis.
> > >
> > > How about on a per-origin basis (and, e.g., domain reputation)? An
> > > "entropy budget" where an origin that asks for too many distinct hints
> > > won't get all of them?
> >
> > Those are definitely policies that user agents can apply (e.g. one concrete
> > proposal that looks a lot like your "entropy budget" is
> > https://github.com/bslassey/privacy-budget)
>
> Maybe "per-feature or other fine-grained basis"? Just a thought, and I
> don't mind leaving it as-is.

Makes sense. Added!

> > > (I also wonder if a descriptive "may wish to have" is better than the
> > > normative "MAY", here.)
> >
> > Sure.
> >
> > >    o  Implementers SHOULD restrict delivery of some or all Client Hints
> > >       header fields to the opt-in origin only, unless the opt-in origin
> > >       has explicitly delegated permission to another origin to request
> > >       Client Hints header fields.
> > >
> > > Am I reading things right that this document does not define any such
> > > delegation mechanisms but is just admitting the possibility of such
> > > mechanisms being defined in the future? I'd suggest clarifying up in
> > > §2.1 with a parenthetical (akin to the "outlined below" note about the
> > > opt-in mechanism).
> >
> > Added an "(as outlined in {{CLIENT-HINTS-INFRASTRUCTURE}})" clarification
> > to 2.1
> >
> > >    Implementers SHOULD support Client Hints opt-in mechanisms and MUST
> > >    clear persisted opt-in preferences when any one of site data,
> > >    browsing history, browsing cache, cookies, or similar, are cleared.
> > >
> > > Who is the target audience for this SHOULD? If it's just "people
> > > implementing this document", it seems ineffectual, and if it's any
> > > broader scope it seems unenforceable.
> >
> > Removed the SHOULD here as it's already defined elsewhere that high entropy
> > hints require an opt-in.
> > Also changed "implementers" to "user agents".
> >
> > > Section 4.3
> > >
> > >    Research into abuse of Client Hints might look at how HTTP responses
> > >    that contain Client Hints differ from those with different values,
> > >
> > > nit: what are "responses that contain Client Hints"? We have discussed
> > > Accept-CH header fields in responses, and client hints in requests, but
> > > the only mention I recall of hints in responses was in the Vary header
> > > field, and it's not clear that that is what was intended.
> >
> > Good catch!
> > Changed to "responses to requests that contain Client Hints".
> >
> > > Section 5
> > >
> > >    While HTTP header compression schemes reduce the cost of adding HTTP
> > >    header fields, sending Client Hints to the server incurs an increase
> > >    in request byte size. Servers SHOULD take that into account when
> > >
> > > nit: I wonder if this would be more clear as:
> > >
> > > % Sending Client Hints to the server incurs an increase in request byte
> > > % size. Some of this increase can be mitigated by HTTP header
> > > % compression schemes, but each new hint will still lead to some
> > > % increased bandwidth usage. Servers SHOULD [...]
> >
> > Changed.
> >
> > > Section 7.1
> > >
> > > I'm not sure I understand why [FETCH] is listed as a normative
> > > reference.
> >
> > Moved it to be informative.
> >
> > > I find it amusing that we reference both 7231 and 7234 for Vary, though
> > > to my untrained eye the current references both seem appropriate in
> > > their respective locations.
> > >
> > > Section 7.2
> > >
> > > If [CLIENT-HINTS-INFRASTRUCTURE] is to be the source of truth for
> > > low-entropy (and, by deduction) high-entropy hints, it seems like it
> > > should be normative.
> >
> > Moved.
>
> Thanks for the updates!
> I will take a look at the github PR now.
>
> -Ben
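
For readers following the Section 3.1 exchange above, the origin scoping of a persisted Accept-CH preference can be illustrated with a rough Python sketch. This is not taken from the draft or from any user agent: the `AcceptCHStore` class and its method names are hypothetical, the comma split is a simplification of real header parsing, and the host names use the `.example` forms suggested in the review.

```python
from urllib.parse import urlsplit

# Default ports, so "https://example.com" and "https://example.com:443"
# compare as the same origin.
_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) origin tuple of a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or _DEFAULT_PORTS.get(scheme, -1)
    return (scheme, host, port)


class AcceptCHStore:
    """Hypothetical store of persisted Accept-CH preferences, keyed by origin."""

    def __init__(self):
        self._prefs = {}

    def persist(self, response_url, accept_ch):
        # Assumption: only preferences delivered over a secure transport
        # are persisted, as in the example quoted from Section 3.1.
        origin = origin_of(response_url)
        if origin[0] != "https":
            return
        # Simplified parsing: split the header value on commas.
        names = (name.strip().lower() for name in accept_ch.split(","))
        self._prefs[origin] = frozenset(n for n in names if n)

    def hints_for_navigation(self, url):
        # A navigation uses the preference bound to the destination origin.
        return self._prefs.get(origin_of(url), frozenset())

    def hints_for_subresource(self, page_url, resource_url):
        # Only same-origin requests initiated by the opted-in page get hints;
        # cross-origin requests in either direction get none.
        page, target = origin_of(page_url), origin_of(resource_url)
        if page != target:
            return frozenset()
        return self._prefs.get(target, frozenset())


store = AcceptCHStore()
store.persist("https://example.com", "Sec-CH-Example, Sec-CH-Example-2")
same_origin = store.hints_for_navigation("https://example.com/foobar.html")
subdomain = store.hints_for_navigation("https://foobar.example.com/")
```

Here `same_origin` carries both hint names, while `subdomain` is empty because `foobar.example.com` is a different origin. Any delegation to third parties would need a separate mechanism, which this sketch deliberately leaves out.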
Received on Thursday, 18 June 2020 10:48:03 UTC