Re: [whatwg/url] Restore formal grammar (#416)

sjamaan commented on this pull request.



> +
+   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
+
+   query         = *( pchar / "/" / "?" )
+
+   fragment      = *( pchar / "/" / "?" )
+
+   pct-encoded   = "%" HEXDIG HEXDIG
+
+   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
+   reserved      = gen-delims / sub-delims
+   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
+   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
+                 / "*" / "+" / "," / ";" / "="
+</pre>
+

Regarding the preformatted text, I wouldn't mind polishing it to make it more readable. Would that involve adding BNF tags to Bikeshed, or are there suitable options already?

Testing a grammar would be kind of tricky. Usually one would pick a parser generator and transform the BNF as needed to fit that particular parser. I have several implementations of the RFC using various parser generators in [my project's repository](https://bugs.call-cc.org/browser/project/release/5/uri-generic/trunk/alternatives), but if you inspect them, they all involve some modifications. Funnily enough, the [one based on SRE](https://bugs.call-cc.org/browser/project/release/5/uri-generic/trunk/alternatives/uri-generic.irregex.scm#L503) (a composable form of regular expressions) is the only one without *any* changes from the BNF, IIRC.
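
To give a rough idea of what a rule-by-rule transcription into composable regular expressions can look like, here is a minimal Python sketch covering only the rules quoted above. It is not the SRE code linked above; the constant names and the `is_query` / `is_fragment` helpers are made up for illustration, and it assumes ABNF string literals are case-insensitive, so `HEXDIG` also accepts lowercase `a`-`f`.

```python
import re

# Each constant mirrors one ABNF rule from the quoted grammar.
# All names here are hypothetical and chosen only for this sketch.
HEXDIG = r"[0-9A-Fa-f]"              # ABNF literals are case-insensitive
UNRESERVED = r"[A-Za-z0-9._~-]"      # ALPHA / DIGIT / "-" / "." / "_" / "~"
SUB_DELIMS = r"[!$&'()*+,;=]"        # "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
PCT_ENCODED = "%" + HEXDIG + HEXDIG  # "%" HEXDIG HEXDIG

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = "(?:" + "|".join([UNRESERVED, PCT_ENCODED, SUB_DELIMS, "[:@]"]) + ")"

# query = *( pchar / "/" / "?" ), and fragment has the same definition
QUERY = "(?:" + PCHAR + "|[/?])*"
FRAGMENT = QUERY

_QUERY_RE = re.compile(QUERY)
_FRAGMENT_RE = re.compile(FRAGMENT)


def is_query(text: str) -> bool:
    """Return True if the whole string matches the `query` rule."""
    return _QUERY_RE.fullmatch(text) is not None


def is_fragment(text: str) -> bool:
    """Return True if the whole string matches the `fragment` rule."""
    return _FRAGMENT_RE.fullmatch(text) is not None
```

Because every rule stays a named fragment, a test suite could feed strings to each rule separately (for example `is_query("a=b&c=d")` versus `is_query("a b")`) and compare the answers with what a URL parser accepts.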

We could include that SRE-based implementation in WPT. If only JavaScript is accepted as an implementation language, I can take a look at the ecosystem to find a good parser generator and try my hand at transcoding the BNF to it. I'm not a good JavaScript coder, though!

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/pull/416#discussion_r218318573

Received on Tuesday, 18 September 2018 07:05:49 UTC