Re: [whatwg/url] Refusing a mix of numeric-only and BIDI domains (#543)

These domains are considered invalid because they don't meet the criteria from [RFC 5893 Section 2](https://tools.ietf.org/html/rfc5893#section-2). Specifically, the label "163" fails criterion 1, which requires the first character of a label to have a Bidi property of L, R, or AL. The digits 0-9 have a Bidi property of EN (European Number) according to [DerivedBidiClass.txt](https://www.unicode.org/Public/13.0.0/ucd/extracted/DerivedBidiClass.txt): `0030..0039    ; EN # Nd  [10] DIGIT ZERO..DIGIT NINE`.
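As a rough illustration of that first criterion, here is a minimal Python sketch using the standard `unicodedata` module. The helper names `first_char_bidi_class` and `meets_criterion_1` are made up for this example, and it only looks at criterion 1, not the rest of the Bidi Rule.

```python
import unicodedata

# Bidi classes allowed for the first character of a label
# (criterion 1 of the Bidi Rule in RFC 5893 Section 2).
ALLOWED_FIRST_CLASSES = {"L", "R", "AL"}


def first_char_bidi_class(label: str) -> str:
    """Return the Bidi class of the label's first character (hypothetical helper)."""
    return unicodedata.bidirectional(label[0])


def meets_criterion_1(label: str) -> bool:
    """Check only criterion 1; the other Bidi Rule criteria are not implemented."""
    return first_char_bidi_class(label) in ALLOWED_FIRST_CLASSES


if __name__ == "__main__":
    for label in ("163", "example"):
        print(label, first_char_bidi_class(label), meets_criterion_1(label))
```

With this sketch, "163" reports class EN for its first character and fails the check, while a label starting with a Latin letter passes.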

The [domain to ASCII](https://url.spec.whatwg.org/#concept-domain-to-ascii) algorithm sets the `CheckBidi` option to true, so Step 2 produces a failure value when a label does not meet the above criterion. Step 3 rejects that failure value, the host parser returns failure, and the URL parser aborts.
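The failure chain can be pictured with a toy Python sketch. This is not the UTS #46 `Unicode ToASCII` algorithm: `domain_to_ascii` and `parse_host` are hypothetical stand-ins, only criterion 1 of the Bidi Rule is checked, and the check is applied only when the domain contains a right-to-left character (Bidi class R, AL, or AN), which is my reading of when the rule applies.

```python
import unicodedata
from typing import Optional

RTL_CLASSES = {"R", "AL", "AN"}           # classes that make a domain a Bidi domain
FIRST_CHAR_CLASSES = {"L", "R", "AL"}     # criterion 1 of the Bidi Rule


def domain_to_ascii(domain: str, check_bidi: bool = True) -> Optional[str]:
    """Toy stand-in for domain to ASCII; returns None to signal failure."""
    labels = domain.split(".")
    if check_bidi:
        is_bidi_domain = any(
            unicodedata.bidirectional(ch) in RTL_CLASSES
            for label in labels
            for ch in label
        )
        if is_bidi_domain:
            for label in labels:
                # Only criterion 1 is checked in this sketch.
                if label and unicodedata.bidirectional(label[0]) not in FIRST_CHAR_CLASSES:
                    return None
    # Punycode-encode non-ASCII labels; ASCII labels pass through unchanged.
    encoded = []
    for label in labels:
        if label.isascii():
            encoded.append(label)
        else:
            encoded.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(encoded)


def parse_host(domain: str) -> Optional[str]:
    """Toy host parser: a failed domain_to_ascii propagates as failure (None)."""
    return domain_to_ascii(domain, check_bidi=True)


if __name__ == "__main__":
    arabic_label = "\u0645\u062b\u0627\u0644"  # an Arabic-script label
    print(parse_host("163." + arabic_label))   # failure propagates as None
    print(parse_host("example.com"))
```

Calling `domain_to_ascii` with `check_bidi=False` in this sketch lets the same mixed domain through, which mirrors what turning off `CheckBidi` would do.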

[RFC 3490 Section 4.1](https://ietf.org/rfc/rfc3490.html#section-4.1) states:
> If any step of the ToASCII operation fails on any label in a domain name, that domain name MUST NOT be used as an internationalized domain name.

So, the URL spec is doing the right thing here. As far as I can tell, the only two ways to make these domains valid under this spec would be to set the `CheckBidi` option to false or to make the options for the `Unicode ToASCII` steps user configurable.


Received on Sunday, 13 September 2020 19:47:40 UTC