[webauthn] UTF-8 decode should not be required for response.clientDataJSON and cData (#2100) from philomathic_life via GitHub on 2024-07-18 (public-webauthn@w3.org from July 2024)

From: philomathic_life via GitHub <sysbot+gh@w3.org>
Date: Thu, 18 Jul 2024 23:39:26 +0000
To: public-webauthn@w3.org
Message-ID: <issues.opened-2417569449-1721345964-sysbot+gh@w3.org>

zacknewman has just created a new issue for https://github.com/w3c/webauthn:

== UTF-8 decode should not be required for response.clientDataJSON and cData ==
Currently the spec states:

> Let JSONtext be the result of running [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) on the value of response.[clientDataJSON](https://www.w3.org/TR/webauthn-3/#dom-authenticatorresponse-clientdatajson).
>
>Note: Using any implementation of [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) is acceptable as long as it yields the same result as that yielded by the [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for [step 5 in Registering a New Credential](https://www.w3.org/TR/webauthn-3/#sctn-registering-a-new-credential) and

>Let JSONtext be the result of running [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) on the value of cData.
>
>Note: Using any implementation of [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) is acceptable as long as it yields the same result as that yielded by the [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for [step 8 in Verifying an Authentication Assertion](https://www.w3.org/TR/webauthn-3/#sctn-verifying-assertion).

This seems _slightly_ too strict. While the notes call out stripping a BOM, they also state "yields the _same_ result …" (emphasis added); however, [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) also requires decoding in the `"replacement"` error mode.

According to [the serialization of the `CollectedClientData`](https://www.w3.org/TR/webauthn-3/#clientdatajson-serialization), a client that follows the algorithm cannot generate invalid UTF-8. This means that RPs should only have to worry about stripping a BOM, _not_ about replacing invalid UTF-8 byte sequences with the "replacement character" (i.e., U+FFFD): the existence of invalid UTF-8 implies that the serialization algorithm mandated by the spec has not been adhered to.
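
To make the distinction concrete, here is a minimal Python sketch of the two behaviours. The function names `decode_lossy` and `decode_strict` are hypothetical and come from neither spec; `decode_lossy` only approximates [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) using Python's built-in codec, which is assumed here to behave closely enough for illustration.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def _strip_bom(data: bytes) -> bytes:
    """Drop a single leading UTF-8 byte order mark, if present."""
    return data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data


def decode_lossy(data: bytes) -> str:
    """Approximation of WHATWG "UTF-8 decode" (hypothetical helper).

    Strips one leading BOM and substitutes U+FFFD for invalid byte
    sequences instead of failing.
    """
    return _strip_bom(data).decode("utf-8", errors="replace")


def decode_strict(data: bytes) -> str:
    """Stricter alternative (hypothetical helper).

    Strips one leading BOM but raises UnicodeDecodeError on invalid
    UTF-8, on the assumption that a conforming client never emits it.
    """
    return _strip_bom(data).decode("utf-8", errors="strict")


if __name__ == "__main__":
    well_formed = UTF8_BOM + b'{"type":"webauthn.get"}'
    # Both decoders agree on well-formed input.
    assert decode_lossy(well_formed) == decode_strict(well_formed)

    malformed = b'{"type":"webauthn.get\xff"}'
    # The lossy decoder silently inserts U+FFFD ...
    assert "\ufffd" in decode_lossy(malformed)
    # ... while the strict decoder rejects the input outright.
    try:
        decode_strict(malformed)
    except UnicodeDecodeError:
        pass
```

For any `clientDataJSON` produced by the serialization algorithm, the two functions return the same string, so an RP that uses the strict variant would differ only on input that already violates the spec.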

I think this should be clarified.

Please view or discuss this issue at https://github.com/w3c/webauthn/issues/2100 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Thursday, 18 July 2024 23:39:26 UTC