Re: [whatwg/url] Provide a succinct grammar for valid URL strings (#479)

@becarpenter This standard does not contain anything grammar-like for “parsing-valid” URLs. 

So even if someone wanted to write a grammar-driven parser, there would not be a grammar to base it on (unless they are willing to use mine).

For context: The parser accepts URL strings that are not valid according to the standard, which is what leads to the separate concepts of _valid_ and _parsing-valid_.

I have seen the assumption that a grammar for _parsing-valid_ URLs cannot be written because error correction cannot be expressed within a grammar.

That is not the case. There are only minor differences between valid URLs and parsing-valid URLs. 
The latter support a larger base alphabet within the components. Furthermore, drive letters get special treatment (e.g. `file://c:/` is parsed as `file:///c:/`), and invalid percent escape sequences and backslashes are tolerated in parsing-valid URLs. The role of backslashes is scheme-dependent, though. Credentials are considered invalid, but you could enforce that at the semantic level.
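
To illustrate how such tolerance can live inside a grammar rather than in error-correcting code, here is a minimal sketch of a path rule written as a regular expression. It is a toy and not the grammar I mentioned above: the function name `path_grammar`, the reduced character set, and the `tolerant` flag are all assumptions made for the example, and only paths are covered.

```python
import re

# Special schemes as listed in the URL standard.
SPECIAL_SCHEMES = {"ftp", "file", "http", "https", "ws", "wss"}


def path_grammar(scheme: str, tolerant: bool) -> re.Pattern:
    """Build a toy path rule (hypothetical helper, heavily simplified).

    tolerant=False approximates a "valid" path, tolerant=True a
    "parsing-valid" one. The differences are confined to three rules.
    """
    special = scheme in SPECIAL_SCHEMES

    # Separator rule: special schemes also accept "\" when tolerant.
    sep = r"[/\\]" if (tolerant and special) else r"/"

    # Character rule: a deliberately small alphabet for the example.
    chars = r"A-Za-z0-9._~-"
    if tolerant and not special:
        # In non-special schemes a backslash is just an ordinary character.
        chars = r"\\" + chars

    # Percent rule: strict requires two hex digits after "%";
    # tolerant additionally accepts a "%" that does not start an escape.
    pct = r"%[0-9A-Fa-f]{2}"
    if tolerant:
        pct += r"|%(?![0-9A-Fa-f]{2})"

    segment = rf"(?:[{chars}]|{pct})*"
    return re.compile(rf"(?:{sep}{segment})*")


def accepts(scheme: str, path: str, tolerant: bool) -> bool:
    """True if the whole path is derivable from the toy rule."""
    return path_grammar(scheme, tolerant).fullmatch(path) is not None
```

With these rules, `accepts("http", r"/a\b", tolerant=True)` and `accepts("http", "/%zz", tolerant=True)` both hold, while the same inputs are rejected with `tolerant=False`. Nothing here is error correction in the procedural sense; the tolerant language is simply described by slightly wider rules.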

⚠️ Special URLs that do not have an authority are also considered invalid URLs, but it is essential to enforce that during resolution instead of at the grammatical level, because they _can_ be used as a relative reference.
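
A rough sketch of what enforcing this during resolution could look like, using Python's `urllib.parse` as a stand-in. Note the assumptions: `urllib.parse` is not an implementation of the URL standard, the helper `resolve_checked` is made up for this example, and `file` is exempted on the assumption that an empty host is acceptable there. The sketch simply raises in the failing case; an actual parser may handle that case differently.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

# Special schemes as listed in the URL standard.
SPECIAL_SCHEMES = {"ftp", "file", "http", "https", "ws", "wss"}


def resolve_checked(reference: str, base: Optional[str] = None) -> str:
    """Resolve first, then check for an authority (hypothetical helper).

    The syntax step accepts "http:foo"; the authority requirement is
    applied only to the result of resolution.
    """
    resolved = urljoin(base, reference) if base else reference
    parts = urlsplit(resolved)
    needs_authority = parts.scheme in SPECIAL_SCHEMES and parts.scheme != "file"
    if needs_authority and not parts.netloc:
        raise ValueError(f"special URL without an authority: {resolved!r}")
    return resolved


# Used as a relative reference, the authority comes from the base URL.
print(resolve_checked("http:foo", "http://example.com/a/b"))

# Without a base there is nothing to take the authority from.
try:
    resolve_checked("http:foo")
except ValueError as error:
    print("rejected:", error)
```

Had the grammar itself rejected `http:foo`, the first call could never have succeeded, which is why the check belongs after resolution.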

(Also relevant to #704)

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/479#issuecomment-1239060644

Message ID: <whatwg/url/issues/479/1239060644@github.com>

Received on Wednesday, 7 September 2022 08:15:12 UTC