Re: [whatwg/url] Named validation errors (#502)

@domenic commented on this pull request.

+1 for better consistency in naming the errors, if possible. Some suggestions, none of which I'm 100% confident in:

- All the file: URL specific ones could be prefixed with `file-`.
- All the authority/credentials ones could be prefixed with `authority-`.
- Choose between "unexpected" (e.g. unexpected-C0-control-or-space), "invalid" (e.g. invalid-URL-code-point, invalid-scheme-start), and "forbidden" (e.g. forbidden-host-code-point).

I also wonder about redundancy for some of these. Isn't invalid-URL-code-point a superset of unexpected-C0-control-or-space and unexpected-ASCII-tab-or-newline? Similarly isn't unexpected-at-sign a superset of unexpected-credentials-without-host and missing-solidus-before-authority? Maybe the extra specificity is useful, but I'm not so sure...
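
As a rough check of the first superset question, here is a minimal sketch at the code-point level only. `isURLCodePoint` is a hypothetical helper written for this illustration, not something from the PR, and it simplifies the non-ASCII range by ignoring surrogates and noncharacters. It says nothing about which parser states would actually report which error.

```javascript
// Hypothetical helper: ASCII part of the URL Standard's "URL code points".
const ASCII_URL_PUNCTUATION = "!$&'()*+,-./:;=?@_~";

function isURLCodePoint(cp) {
  const ch = String.fromCodePoint(cp);
  if (/[A-Za-z0-9]/.test(ch)) return true;
  if (cp < 0x80) return ASCII_URL_PUNCTUATION.includes(ch);
  // Simplification: surrogates and noncharacters are not excluded here.
  return cp >= 0xa0 && cp <= 0x10fffd;
}

// C0 control or space: U+0000 through U+0020.
const c0ControlOrSpace = Array.from({ length: 0x21 }, (_, i) => i);
// ASCII tab or newline: U+0009, U+000A, U+000D.
const asciiTabOrNewline = [0x09, 0x0a, 0x0d];

// If none of these is a URL code point, then every code point flagged by the
// two "unexpected-" errors would also count as an invalid URL code point.
const anyValid = [...c0ControlOrSpace, ...asciiTabOrNewline].some(isURLCodePoint);
console.log(anyValid); // false
```

So on code points alone the overlap is real; whether the more specific names are still worth keeping is the open question.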

> @@ -88,6 +88,334 @@ valid input. User agents, especially conformance checkers, are encouraged to rep
  unclear to other developers.
 </div>
 
+<table>
+ <thead>
+  <tr>
+   <th>Error type
+   <th>Error description
+   <th>Failure
+ <tbody>
+  <tr>
+   <th colspan=3 scope=rowgroup><a href=#idna>IDNA</a>
+  <tr>
+   <td><dfn id=validation-error-domain-to-ascii>domain-to-ASCII</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToASCII>Unicode toASCII</a> records an error while processing

```suggestion
    <p>The result of <a abstract-op lt=ToASCII>Unicode ToASCII</a> records an error while processing
```

> + <tbody>
+  <tr>
+   <th colspan=3 scope=rowgroup><a href=#idna>IDNA</a>
+  <tr>
+   <td><dfn id=validation-error-domain-to-ascii>domain-to-ASCII</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToASCII>Unicode toASCII</a> records an error while processing
+    the input domain.
+    <p class=note>[[!UTS46]] conformance does not require the reporting of precise errors, only that
+    an error has occurred. If the [[!UTS46]] implementation reports precise error codes, user agents
+    are encouraged pass those codes along.
+   <td>✅
+  <tr>
+   <td><dfn>domain-to-ASCII-empty</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToASCII>Unicode toASCII</a> returns an empty string. This

```suggestion
    <p>The result of <a abstract-op lt=ToASCII>Unicode ToASCII</a> returns an empty string. This
```

> +   <td>✅
+  <tr>
+   <td><dfn>domain-to-ASCII-empty</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToASCII>Unicode toASCII</a> returns an empty string. This
+    could have been caused by:
+    <ul>
+     <li>Input consists of all ignorable code points.
+     <li>Input is the string "<code>xn--</code>".
+     <li>Input is the empty string and the <i>VerifyDnsLength</i> parameter is false.
+    </ul>
+   <td>✅
+  <tr>
+   <td><dfn>domain-to-Unicode</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToUnicode>Unicode toUnicode</a> returns an error while

```suggestion
    <p>The result of <a abstract-op lt=ToUnicode>Unicode ToUnicode</a> returns an error while
```

> +    <p>The input to the <a>URL parser</a> contains a leading or trailing <a>C0 control or space</a>. The
+    URL parser subsequently strips any matching code points.
+    <p class=example id=example-unexpected-c0-control-or-space>"<code> https://example.org </code>"
+   <td>❌
+  <tr>
+   <td><dfn>unexpected-ASCII-tab-or-newline</dfn>
+   <td>
+    <p>The input to the URL parser contains <a>ASCII tab or newlines</a>. The URL parser
+    subsequently strips any matching code points.
+    <p class=example id=example-unexpected-ascii-tab-or-newline>"<code>ht<br>tps://example.org</code>"
+   <td>❌
+  <tr>
+   <td><dfn>invalid-scheme-start</dfn>
+   <td>
+    <p>The first code point of a URL's <a for=url>scheme</a> is not an <a>ASCII alpha</a>.
+    <p class=example id=example-invalid-scheme-start>"3ttps://example.org"

```suggestion
    <p class=example id=example-invalid-scheme-start>"<code>3ttps://example.org</code>"
```
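
For reference, a small sketch of how the three inputs in this hunk behave through the `URL` constructor. `tryParse` is a hypothetical helper for this illustration. The constructor does not expose validation errors, so this only distinguishes "parses" from "fails"; it also assumes the `<br>` in the tab-or-newline example stands for a literal newline.

```javascript
// Hypothetical helper: serialized URL on success, null on parse failure.
function tryParse(input, base) {
  try {
    return new URL(input, base).href;
  } catch (e) {
    return null;
  }
}

// Leading/trailing space is stripped; parsing still succeeds.
const stripped = tryParse(" https://example.org ");
// The embedded newline is removed; parsing still succeeds.
const joined = tryParse("ht\ntps://example.org");
// Scheme cannot start with a digit, and there is no base URL to fall back on.
const badStart = tryParse("3ttps://example.org");

console.log({ stripped, joined, badStart });
```

The first two inputs serialize to the same URL, and the third fails outright when no base URL is given.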

> +    <div class=example id=example-missing-scheme-non-relative-url>
+     <p>Input's <a for=url>scheme</a> is missing and no <a>base URL</a> is given:
+     <pre><code class=lang-javascript>
+let url = new URL("💩");</code></pre>
+     <p>Input's <a for=url>scheme</a> is missing, but the <a>base URL</a> has an
+     <a for=url>opaque path</a>.
+     <pre><code class=lang-javascript>
+let url = new URL("💩", "mailto:user@example.org");</code></pre>
+    </div>
+   <td>✅
+  <tr>
+   <td><dfn>relative-URL-missing-beginning-solidus</dfn>
+   <td>
+    <p>The input is a <a>relative-URL String</a> that does not begin with U+002F (/).
+    <pre class=example id=example-relative-url-missing-beginning-solidus><code class="lang-javascript">
+let url = new URL("foo.html", "https://example.org/");</code></pre>

Why the heck would this be an error? Does this mean `<a href="foo.html">` is an error?
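
To make the concern concrete, a minimal sketch using the same input as the example row. It only shows that parsing succeeds; the `URL` constructor gives no way to observe whether a validation error was recorded along the way.

```javascript
// Same input as the example row: a relative reference without a leading "/".
const url = new URL("foo.html", "https://example.org/");

// Resolves against the base like any ordinary relative link.
console.log(url.href); // "https://example.org/foo.html"
```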

> +  <tr>
+   <th>Error type
+   <th>Error description
+   <th>Failure
+ <tbody>
+  <tr>
+   <th colspan=3 scope=rowgroup><a href=#idna>IDNA</a>
+  <tr>
+   <td><dfn id=validation-error-domain-to-ascii>domain-to-ASCII</dfn>
+   <td>
+    <p>The result of <a abstract-op lt=ToASCII>Unicode toASCII</a> records an error while processing
+    the input domain.
+    <p class=note>[[!UTS46]] conformance does not require the reporting of precise errors, only that
+    an error has occurred. If the [[!UTS46]] implementation reports precise error codes, user agents
+    are encouraged pass those codes along.
+   <td>✅

Although it may have been my suggestion, I think the green checkmarks / red Xs are confusing, because they feel too much like "success" / "failure".

Maybe "Yes" / "&middot;" similar to https://html.spec.whatwg.org/#linkTypes would work better.

> +   <td><dfn>unexpected-Windows-drive-letter</dfn>
+   <td>
+    <p>The input is a <a>relative-URL string</a> that <a>starts with a Windows drive letter</a> and
+    the <a>base URL</a>'s <a for=url>scheme</a> is "<code>file</code>".
+    <pre class=example id=example-unexpected-windows-drive-letter><code class=lang-javascript>
+let url = new URL("/c:/path/to/file", "file:///c:/");</code></pre>
+   <td>❌
+  <tr>
+   <td><dfn>unexpected-Windows-drive-letter-host</dfn>
+   <td>
+    <p>The file URL's host is a Windows drive letter.
+    <p class=example id=example-unexpected-windows-drive-letter-host>"<code>file://c:</code>"
+   <td>❌
+ <tbody>
+  <tr>
+   <th colspan=3 scope=rowgroup>URL parsing and <a>opaque-host parser</a>

I don't really understand why this is split out from the above section.

> +    <a>ASCII alpha</a>, and either no <a>base URL</a> was provided or the <a>base URL</a> cannot be
+    used as a <a>base URL</a> because it has an <a for=url>opaque path</a>.
+    <div class=example id=example-missing-scheme-non-relative-url>
+     <p>Input's <a for=url>scheme</a> is missing and no <a>base URL</a> is given:
+     <pre><code class=lang-javascript>
+let url = new URL("💩");</code></pre>
+     <p>Input's <a for=url>scheme</a> is missing, but the <a>base URL</a> has an
+     <a for=url>opaque path</a>.
+     <pre><code class=lang-javascript>
+let url = new URL("💩", "mailto:user@example.org");</code></pre>
+    </div>
+   <td>✅
+  <tr>
+   <td><dfn>relative-URL-missing-beginning-solidus</dfn>
+   <td>
+    <p>The input is a <a>relative-URL String</a> that does not begin with U+002F (/).

```suggestion
    <p>The input is a <a>relative-URL string</a> that does not begin with U+002F (/).
```

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/pull/502#pullrequestreview-1266754409

Message ID: <whatwg/url/pull/502/review/1266754409@github.com>

Received on Tuesday, 24 January 2023 04:18:22 UTC