Re: [whatwg] URL: spec review - basic_parser

On Sat, Oct 11, 2014 at 7:24 PM, Sam Ruby <rubys@intertwingly.net> wrote:
> On 10/10/2014 08:19 PM, Sam Ruby wrote:
>> 2) https://url.spec.whatwg.org/#concept-basic-url-parser
>>     I'm interpreting "terminate this algorithm" and "return failure" to
>>     mean the same thing, and I'm interpreting "parse error" as "set
>>     parse error flag and continue".
>
> I'm inclined to submit a pull request standardizing on "terminate this
> algorithm" and "set parse error".

I'm not sure what you mean here. Returning failure is important.
However, in override mode returning failure is not important so the
algorithm is simply terminated. And parse error error would be more
like flag a parse error, or append a parse error to a list of parse
errors. It depends a bit on whether the parser decides to halt on the
first one or not.


>> 3) https://url.spec.whatwg.org/#authority-state
>>     a) Did you really mean prepend in Step 1.1?

Yes.


>>     b) Step 1.3.3 seems problematic.  I interpret this prose to mean "if
>>        any character in buffer is a "%" and the first two characters
>>        after the pointer position in input aren't hex characters".
>>        Specifically, it appears to be comparing a possibly
>>        non-contiguous set of characters.

Ah yes. It needs to check the two code points after code point in
buffer. That seems like a bug.


>> 4) https://url.spec.whatwg.org/#file-host-state
>>     Step 1.3.2 returns failure without setting parse_error.  Is this
>>     correct?
>>
>> 5) https://url.spec.whatwg.org/#host-state
>>     Step 1.2.2 also returns failure without setting parse_error.

This is indeed inconsistent. I must at some point have thought that
returning failure without reporting a parse error was fine (as failure
was indicated) or the other way around. Reporting a parse error before
returning is probably best.


> I'm inclined to submit a pull request changing these to to set parse error
> before failing.

Thanks.


>> 6) https://url.spec.whatwg.org/#relative-path-state
>>     If input contains a path but no query or fragment, the last part of
>>     the path will be accumulated into buffer, but that buffer will never
>>     be added to the path
>
> Looks like I got confused by the prose in the spec.  I've submitted a pull
> request that makes this point clearer:
>
>   https://github.com/whatwg/url/pull/4
>
> There were a number of places where things weren't clear to me; after I
> complete my technical review and testing verification, I'll go back and
> identify more.  Meanwhile, here is an example:
>
>   https://github.com/whatwg/url/pull/5

Feel free to only modify url.src.html in PRs.


Did you have a look at the open bugs by the way? There's a chance the
parsing algorithm will get rewritten at some point to be a bit more
functional and less state driven.


-- 
https://annevankesteren.nl/

Received on Sunday, 12 October 2014 08:18:54 UTC