Re: [whatwg/url] IPv4 host parser + site definition seems potentially dangerous. (#560)

> @MattMenke2 currently the host parser is never invoked with the empty string (note the assert in step 4) and I'm pretty sure that therefore the IPv4 parser isn't either. But also, if it was the empty string, why would it not return that and return 0 instead? (Note that the number parser empty string case can only be reached due to input starting with 0x, 0X, or 0.)

@annevk: The IPv4 number parser (not the IPv4 parser) seems to be invoked on the empty string in some cases. For example, "...." is split into 5 parts, the final empty one is removed, and then the IPv4 number parser is invoked on each of the 4 remaining empty strings. It is specified to return 0 for the empty string. No browser actually maps "...." to "0.0.0.0.", but that seems to be what the spec requires.
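
To make the "...." case concrete, here is a minimal Python sketch of the reading described above. It is not the spec's algorithm text: the names `split_parts` and `parse_ipv4_number` are hypothetical, and the sketch deliberately leaves out any other checks the spec may perform between splitting and number parsing.

```python
# Hypothetical sketch of the reading described in this comment.
DIGITS = {8: "01234567", 10: "0123456789", 16: "0123456789abcdefABCDEF"}

def parse_ipv4_number(part):
    """Return an int, or None for failure (assumed number-parser behaviour)."""
    radix = 10
    if len(part) >= 2 and part[:2] in ("0x", "0X"):
        part, radix = part[2:], 16
    elif len(part) >= 2 and part[0] == "0":
        part, radix = part[1:], 8
    if part == "":
        return 0  # the empty-string case under discussion
    if any(c not in DIGITS[radix] for c in part):
        return None
    return int(part, radix)

def split_parts(host):
    """Split on '.', dropping one trailing empty part if there is more than one part."""
    parts = host.split(".")
    if parts[-1] == "" and len(parts) > 1:
        parts.pop()
    return parts

parts = split_parts("....")                      # four empty strings remain
numbers = [parse_ipv4_number(p) for p in parts]  # each one parses to 0
```

Under these assumptions every part of "...." parses to 0, which is how the host would end up as an all-zero address.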

> I don't think your suggestion works as we run the IPv4 parser on all domains. Your change would make all domains return failure basically.

I'm not seeing why that would be. My step 4 would "return input" on hostnames that don't end in a number, regardless of the number of components. The IPv4 parser is confusingly written, but its "return input" seems intended to mean that host parsing continues as a non-IPv4 host. (The host parser ignores the return value if it's not an IPv4 address, though it's not clear that "returning input" means returning something that is not an IPv4 address.)

> The minimum change is probably that whenever we return _input_ now from the IPv4 parser, we first check if _input_'s last label can be parsed as a number (using the IPv4 number parser or something more strict that doesn't do 0x/0X/0 prefixes, though presumably larger than 255 would still be failure) and if so, we return failure instead.

This is exactly what my change to step 4 of the IPv4 parser does, no?
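
For illustration, the "last label is a number" check could look roughly like the Python sketch below. It assumes the stricter variant mentioned in the quote (ASCII decimal digits only, no 0x/0X/0 prefixes) and treats an empty last label as not numeric. The names `last_label_is_number` and `classify` are hypothetical and come from neither the spec nor my proposed text.

```python
# Hypothetical sketch: decide whether a host must be treated as an IPv4
# candidate (IPv4 address or failure) or may continue as a domain.
def last_label_is_number(host):
    labels = host.split(".")
    # Ignore a single trailing empty label, e.g. "example.com."
    if labels[-1] == "" and len(labels) > 1:
        labels.pop()
    last = labels[-1]
    # Strict variant (assumption): ASCII decimal digits only.
    return last != "" and all(c in "0123456789" for c in last)

def classify(host):
    # Hosts ending in a number never fall back to being a domain.
    return "ipv4-or-failure" if last_label_is_number(host) else "domain"
```

With this check, `classify("example.com")` gives `"domain"`, while `classify("1.2.3.4.5")` gives `"ipv4-or-failure"` no matter how many components the host has.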

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/560#issuecomment-736871382

Received on Tuesday, 1 December 2020 22:54:46 UTC