Re: [whatwg/url] Refusing a mix of numeric-only and BIDI domains (#543)

I think @TRowbotham has the correct analysis here and indeed it very much depends on how CheckBidi is used.

To simplify from OP:

* `xn----9mcjf9b4dbm09f` is fine in all browsers.
* `1.xn----9mcjf9b4dbm09f` errors (though it does not error in Gecko, presumably because it has CheckBidi set to false; it also doesn't error in Chromium, presumably due to an erroneous ASCII fast path).

However, https://www.rfc-editor.org/rfc/rfc5893.html#section-2 (which UTS46 invokes) also says:

> In a domain name consisting of only LDH labels (as defined in the
> Definitions document [[RFC5890](https://www.rfc-editor.org/rfc/rfc5890)]) and labels that satisfy the rule,
> the requirements of [Section 3](https://www.rfc-editor.org/rfc/rfc5893.html#section-3) are satisfied as long as a label
> that starts with an ASCII digit does not come after a
> right-to-left label.

But that seems contradictory, as a label that starts with an ASCII digit can never fulfill the Bidi Rule: ASCII digits do not have one of the Bidi properties the rule requires for the first character (they have EN according to https://unicode.org/reports/tr9/):

> The first character must be a character with Bidi property L, R,
> or AL.  If it has the R or AL property, it is an RTL label; if it
> has the L property, it is an LTR label.

I'm not sure what to make of this.
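
For reference, the first-character condition quoted above can be checked mechanically. This is only a minimal sketch in Python, assuming the Bidi classes reported by the standard `unicodedata` module agree with UAX #9; the helper name `first_char_direction` is hypothetical and not taken from any spec or implementation. It looks at the first character only, not the rest of the Bidi Rule.

```python
import unicodedata
from typing import Optional


def first_char_direction(label: str) -> Optional[str]:
    """Apply only the first-character condition of the Bidi Rule.

    Hypothetical helper for illustration. Returns "RTL" when the first
    character has Bidi class R or AL, "LTR" when it has class L, and
    None when the condition is not met (including an empty label).
    """
    if not label:
        return None
    bidi_class = unicodedata.bidirectional(label[0])
    if bidi_class in ("R", "AL"):
        return "RTL"
    if bidi_class == "L":
        return "LTR"
    return None


print(unicodedata.bidirectional("1"))  # EN (European Number)
print(first_char_direction("1"))       # None: a digit-leading label fails
print(first_char_direction("a1"))      # LTR
print(first_char_direction("\u05d0"))  # RTL (HEBREW LETTER ALEF, class R)
```

Under these assumptions, the label `1` fails the first-character condition on its own, which is the tension with the Section 2 text that explicitly allows digit-leading labels in some positions.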

I would appreciate input from @achristensen07 @valenting @markusicu @macchiati @alvestrand. I would be somewhat inclined to set CheckBidi to false given that it matches most implementations, is more likely to match deployed content, and the bidi requirements appear contradictory, but I'm open to suggestions.

Message ID: <whatwg/url/issues/543/1377311486@github.com>

Received on Tuesday, 10 January 2023 13:48:38 UTC