Re: [whatwg/url] Restore formal grammar (#416)

GPHemsley requested changes on this pull request.

This is fine as a point of discussion, but I don't think it's ready yet to be included in the specification proper.

> @@ -2346,6 +2347,105 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti
 </ol>
 
 
+<h4 id=grammar>Formal grammar for minimally acceptable URLs</h4>
+
+<p>It is the intention that the above parsing algorithm accepts a
+superset of the language defined by the following grammar.  The
+grammar is only defined for UTF-8.  The <var>encoding override</var>
+option is not supported.  Resolving the URL against a <a>base URL</a>
+or raising an error if the URL is relative should be done after
+parsing.</p>
+

All text within the specification is considered normative. However, this paragraph reads more like a note to me than a normative requirement.

> @@ -2346,6 +2347,105 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti
 </ol>
 
 
+<h4 id=grammar>Formal grammar for minimally acceptable URLs</h4>
+
+<p>It is the intention that the above parsing algorithm accepts a
+superset of the language defined by the following grammar.  The
+grammar is only defined for UTF-8.  The <var>encoding override</var>
+option is not supported.  Resolving the URL against a <a>base URL</a>
+or raising an error if the URL is relative should be done after
+parsing.</p>
+
+<p>This specification uses the Augmented Backus-Naur Form (ABNF)
+notation of STD 68 [[!STD68]], including the following core ABNF
+syntax rules defined by that specification: ALPHA (letters), CR
+(carriage return), DIGIT (decimal digits), DQUOTE (double quote),
+HEXDIG (hexadecimal digits), LF (line feed), and SP (space).</p>
+

I think terms defined in other standards should generally be linked to their definitions in those standards.

> +
+   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
+
+   query         = *( pchar / "/" / "?" )
+
+   fragment      = *( pchar / "/" / "?" )
+
+   pct-encoded   = "%" HEXDIG HEXDIG
+
+   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
+   reserved      = gen-delims / sub-delims
+   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
+   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
+                 / "*" / "+" / "," / ";" / "="
+</pre>
+

I think making this useful takes more than dropping a large block of preformatted text into the spec. Also, I think wpt would need to be updated to test this grammar separately from the parser algorithm.
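
To make the testing point concrete, here is a minimal sketch of what checking the grammar on its own (rather than through the parser) might look like. It transcribes only the `query` production quoted above into a Python regular expression. The names `QUERY_RE` and `matches_query` are hypothetical, and this is not how wpt structures its tests; it is just an illustration under the assumption that the ABNF is read with the usual STD 68 semantics (ALPHA covers both cases, HEXDIG is case-insensitive).

```python
import re

# Character sets transcribed from the ABNF quoted above.
# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = r"A-Za-z0-9\-._~"
# sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
SUB_DELIMS = r"!$&'()*+,;="
# pct-encoded = "%" HEXDIG HEXDIG (ABNF string literals are case-insensitive)
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})"

# query = *( pchar / "/" / "?" )
# (the quoted grammar gives fragment the same definition)
QUERY_RE = re.compile(rf"(?:{PCHAR}|[/?])*")


def matches_query(s: str) -> bool:
    """Return True if s is in the language of the `query` production.

    Hypothetical helper: it checks the grammar only and says nothing
    about what the parsing algorithm accepts.
    """
    return QUERY_RE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["a=1&b=%20", "path/to?x", "", "a b", "%2", "x#y"]:
        print(repr(candidate), matches_query(candidate))
```

Since the proposed text says the parser is intended to accept a superset of the grammar, a string rejected here is not necessarily one the parser rejects. Tests for the grammar would therefore need their own expectations instead of reusing the parser's.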

https://github.com/whatwg/url/pull/416#pullrequestreview-156197717

Received on Tuesday, 18 September 2018 04:09:23 UTC