- From: Carlo Cannas <notifications@github.com>
- Date: Sun, 18 Aug 2024 17:12:29 -0700
- To: whatwg/fetch <fetch@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/fetch/issues/1768@github.com>
### What is the issue with the Fetch Standard?
https://fetch.spec.whatwg.org/commit-snapshots/4cb3cf21946113c0684f04122dd95315fd10c567/#header-value-get-decode-and-split
The algorithm doesn't seem to return the same output of the first [example](https://fetch.spec.whatwg.org/commit-snapshots/4cb3cf21946113c0684f04122dd95315fd10c567/#example-header-list-get-decode-split).
The example says that giving `nosniff,` as input we should get a list of two strings: `"nosniff"` and an empty string `""`.
If I try to follow the steps however I get a list of a single `"nosniff"` string and no empty string.
Here's how I'm interpreting it:
> 1. Let input be the result of [isomorphic decoding](https://infra.spec.whatwg.org/#isomorphic-decode) value.
> 2. Let position be a [position variable](https://infra.spec.whatwg.org/#string-position-variable) for input, initially pointing at the start of input.
> 3. Let values be a [list](https://infra.spec.whatwg.org/#list) of [strings](https://infra.spec.whatwg.org/#string), initially empty.
> 4. Let temporaryValue be the empty string.
```
input: "nosniff,"
position: ^
values: []
temporaryValue: ""
```
> 5. While position is not past the end of input:
> 1. Append the result of [collecting a sequence of code points](https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points) that are not U+0022 (") or U+002C (,) from input, given position, to temporaryValue.
```
input: "nosniff,"
position: ^
values: []
temporaryValue: "nosniff"
```
> 5.
> 2. If position is not past the end of input, then:
> 1. ~If the [code point](https://infra.spec.whatwg.org/#code-point) at position within input is U+0022 ("), then:~
> 1. ~Append the result of [collecting an HTTP quoted string](https://fetch.spec.whatwg.org/commit-snapshots/4cb3cf21946113c0684f04122dd95315fd10c567/#collect-an-http-quoted-string) from input, given position, to temporaryValue.~
> 2. ~If position is not past the end of input, then [continue](https://infra.spec.whatwg.org/#iteration-continue).~
> 2. Otherwise:
> 1. Assert: the [code point](https://infra.spec.whatwg.org/#code-point) at position within input is U+002C (,).
> 2. Advance position by 1.
```
input: "nosniff,"
position: ^
values: []
temporaryValue: "nosniff"
```
> 5.
> 2.
> 3. Remove all [HTTP tab or space](https://fetch.spec.whatwg.org/commit-snapshots/4cb3cf21946113c0684f04122dd95315fd10c567/#http-tab-or-space) from the start and end of temporaryValue.
> 4. [Append](https://infra.spec.whatwg.org/#list-append) temporaryValue to values.
```
input: "nosniff,"
position: ^
values: ["nosniff"]
temporaryValue: ""
```
> 5.
> 2.
> 5. Set temporaryValue to the empty string.
Here position is past the end of input, the while condition is not satisfied and this returns a list with just a single element: `["nosniff"]`.
> 6. Return values.
The examples seem right to me, I think the algorithm is wrong.
I found some WPT that indirectly test this algorithm ([via `Content-Length`](https://fetch.spec.whatwg.org/#content-length-header)), Chrome and Firefox seem to follow the behavior of the example.
https://github.com/web-platform-tests/wpt/blob/merge_pr_47660/fetch/content-length/resources/content-lengths.json#L66-L69
https://wpt.fyi/results/fetch/content-length/parsing.window.html
I.e.: a header value with a comma as the last character is split into two values, as in the example. Since the header is `Content-Length` and the spec mandates to abort if the split values don't match, the test passes if [the request](https://wpt.live/fetch/content-length/resources/content-length.py?length=Content-Length%3A%2030%2C) results in a network error.
This should be the relevant Chrome implementation, it looks like it internally stores every header splitted
https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_response_headers.cc;l=967-970;drc=82dff63dbf9db05e9274e11d9128af7b9f51ceaa;bpv=0;bpt=1
https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_util.cc;l=968;drc=82dff63dbf9db05e9274e11d9128af7b9f51ceaa
Firefox on the other hand doesn't seem to do that, the `Content-Length` check appears to be implemented using a simpler split by just commas, ignoring quotes. It makes sense: a quote char inside `Content-Length` already makes it invalid so there's no need to care about quotes while splitting.
https://searchfox.org/mozilla-central/rev/53e68046298557fae0c922230b595bb6689bf587/netwerk/protocol/http/nsHttpHeaderArray.cpp#173-189
https://searchfox.org/mozilla-central/rev/53e68046298557fae0c922230b595bb6689bf587/netwerk/protocol/http/nsHttpHeaderArray.h#308
--
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/fetch/issues/1768
You are receiving this because you are subscribed to this thread.
Message ID: <whatwg/fetch/issues/1768@github.com>
Received on Monday, 19 August 2024 00:12:33 UTC