Re: [w3c/manifest] Add id member to manifest (#988) from Matt Giuca on 2021-07-16 (public-webapps-github@w3.org from July 2021)

From: Matt Giuca <notifications@github.com>
Date: Fri, 16 Jul 2021 00:18:23 -0700
To: w3c/manifest <manifest@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3c/manifest/pull/988/review/708027905@github.com>
@mgiuca requested changes on this pull request.

Very good, overall. Everything's in the right place and largely conveys the meaning I think you intended. So don't be alarmed by the high volume of comments. These are details :)

Now I've made a lot of specific phrasing suggestions. Feel free to reword them. They are by no means perfect or canonical, just the way that I would personally phrase things. Ask if you have questions.

Thanks, it's exciting to see this finally coming together!

> @@ -201,6 +201,8 @@ <h2>
         </li>
         <li>[=manifest/theme_color=]
         </li>
+        <li>[=manifest/id=]

Keep this list in alphabetical order (after `icons`).

> @@ -808,6 +811,53 @@ <h3>
           </p>
         </section>
       </section>
+      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.

I would rename this definition from "id" to "identity". (That is, the member itself stays as `id`, but the concept that we'll define here and use throughout, is called the "identity".)

Everywhere you say "`|id|`" you would update it to "`|identity|`".

Rationale: Prefer unabbreviated words as the name of concepts in specs (similar to how we always say "application" rather than "app").

> @@ -808,6 +811,53 @@ <h3>
           </p>
         </section>
       </section>
+      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.
+        </p>
+        <p>The |id| is used  by user agents uniquely identify the application

nit: Two spaces between "used" and "by".
nit: "used by user agents **to** uniquely..."

> @@ -808,6 +811,53 @@ <h3>
           </p>
         </section>
       </section>
+      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.
+        </p>
+        <p>The |id| is used  by user agents uniquely identify the application
+          globally. When |id| is changed, it's recognized as a new app and

"When **an application's** |id| (or |identity|) is changed..."

Actually, I'd probably edit this a bit further:

"When the user agent sees a manifest with an |identity| that does not correspond to an already-installed application, it SHOULD treat that manifest as a description of a distinct application, even if it is served from the same URL as that of another application."

(What do you think of this description? Does that match how you think of identity?)

> @@ -808,6 +811,53 @@ <h3>
           </p>
         </section>
       </section>
+      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.
+        </p>
+        <p>The |id| is used  by user agents uniquely identify the application
+          globally. When |id| is changed, it's recognized as a new app and
+          can be installed separately. When id is not changed, it can be used

nit: Linkify |id| (or |identity|).

This sentence is a bit hard to understand, since "not changed" isn't a thing that triggers the UA to perform a search. How about:

"When the user agent sees a manifest with an |identity| matching that of an already-installed application, it SHOULD be used as a signal that this manifest is a replacement for the already-installed application's manifest, and not a distinct application, even if it is served from a different URL than the one seen previously."

(Note: All of my suggested sentences are just suggestions; feel free to reword them as you like.)

> +      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.
+        </p>
+        <p>The |id| is used  by user agents uniquely identify the application
+          globally. When |id| is changed, it's recognized as a new app and
+          can be installed separately. When id is not changed, it can be used
+          to find installed application to perform manifest update.
+        </p>
+        <p>
+          The |id| can also be used by third party entities such as PWA stores

nit: "third-party"; however, drop this word because we are a spec, there is no "first party" and "third party". There are only conforming implementations of the standard. A PWA store would be a conforming implementation.

nit: replace "can" with "MAY", a [word with formal spec meaning](https://datatracker.ietf.org/doc/html/rfc2119)

We do not use the term "PWA" in specs; it's more of a marketing term than a technical concept.

So you can reword this to just:

"The |identity| MAY be used by a service that collects lists of web applications to uniquely identify applications."

> +          globally. When |id| is changed, it's recognized as a new app and
+          can be installed separately. When id is not changed, it can be used
+          to find installed application to perform manifest update.
+        </p>
+        <p>
+          The |id| can also be used by third party entities such as PWA stores
+          to uniquely identify and reference web apps
+        </p>
+        <p>
+          To <dfn>process the `id` member</dfn>, given [=object=]
+          |json:JSON|, [=ordered map=] |manifest:ordered map|.:
+        </p>
+        <ol class="algorithm">
+          <li>If the type of |json|["id"] is not [=string=], return.
+          </li>
+          <li>Let |id:URL| be the result of [=URL Parser|parsing=]

Concatenating strings to build URLs is not a good practice. (e.g. it may allow for malicious `id` to escape the URL, though in this case I don't quite see how, in principle we should use the parser infrastructure rather than string concatenations).

nit: Use "_manifest_["start_url"]" to get the start URL, you can't just say "start URL" since that "variable" isn't defined in this "function".
nit: Link the concept [origin](https://url.spec.whatwg.org/#concept-url-origin).

So you would do this by going:

* Let _id_ be the result of [parsing](https://url.spec.whatwg.org/#concept-url-parser) _json_["id"] with _manifest_["start_url"]'s [origin](https://url.spec.whatwg.org/#concept-url-origin) as the base URL.

Note that this subtly changes the semantics of what we're doing here: this allows you to escape the origin by supplying an absolute URL as the `id`. So you need to add a same origin check, after the failure check:

* If _id_ is not [same origin](https://html.spec.whatwg.org/multipage/origin.html#same-origin) as _manifest_["start_url"], return.

Despite having to do that extra check, I think this is better semantics -- in the spirit of the `id` being a URL (not just a string). It means that if you supply an absolute URL that is same origin, it does what you expect. And if you supply an absolute URL that is different origin, it fails. Whereas your previous approach just tacks it on.

To summarize the difference between your code and my proposal (assuming start URL is `https://example.com/index`):

| `id` | Your code | My proposal |
| -- | -- | -- |
| `foo` | `https://example.com/foo` | `https://example.com/foo` |
| `/foo` | `https://example.com/foo` | `https://example.com/foo` |
| `https://example.com/foo` | `https://example.com/https%3A/example.com/foo` | `https://example.com/foo` |
| `https://other.org/foo` | `https://example.com/https%3A/other.org/foo` | `https://example.com/index` |

Of course, this subtle change means that Chrome's implementation needs to change to match too. But I think my proposal makes more sense - at least to me. What do you think?

> +          The |id| can also be used by third party entities such as PWA stores
+          to uniquely identify and reference web apps
+        </p>
+        <p>
+          To <dfn>process the `id` member</dfn>, given [=object=]
+          |json:JSON|, [=ordered map=] |manifest:ordered map|.:
+        </p>
+        <ol class="algorithm">
+          <li>If the type of |json|["id"] is not [=string=], return.
+          </li>
+          <li>Let |id:URL| be the result of [=URL Parser|parsing=]
+          (<var>start URL's origin</var> + "/" + |json|["id"]).
+          </li>
+          <li>If |id| is failure, return.
+          </li>
+          <li>Otherwise, set |manifest|["id"] to |start URL|.

This logic is wrong. This is the success condition: you need to set _manifest_["id"] to _id_ (not start URL).

But you want _manifest_["id"] to be set to start URL if `id` is invalid or missing. Easiest way to do that (since there are lots of early-return statements) is to just "Set _manifest_["id"] to _manifest_["start_url"]" as the first step of the algorithm. Then if you early-return at any point, it will take its default value. If you get all the way to the end, it will be updated with _id_. (This is the same technique used in the "[process the `scope` member](https://www.w3.org/TR/appmanifest/#dfn-process-the-scope-member)" algorithm.)

> +          |json:JSON|, [=ordered map=] |manifest:ordered map|.:
+        </p>
+        <ol class="algorithm">
+          <li>If the type of |json|["id"] is not [=string=], return.
+          </li>
+          <li>Let |id:URL| be the result of [=URL Parser|parsing=]
+          (<var>start URL's origin</var> + "/" + |json|["id"]).
+          </li>
+          <li>If |id| is failure, return.
+          </li>
+          <li>Otherwise, set |manifest|["id"] to |start URL|.
+          </li>
+        </ol>
+        <aside class="example">
+          <p>
+            For example, if the |start URL| is

Maybe this could be replaced with a table, similar to the one I provided in my comment above? It would be easier to read, and you could provide a few more cases.

> +          </li>
+          <li>Let |id:URL| be the result of [=URL Parser|parsing=]
+          (<var>start URL's origin</var> + "/" + |json|["id"]).
+          </li>
+          <li>If |id| is failure, return.
+          </li>
+          <li>Otherwise, set |manifest|["id"] to |start URL|.
+          </li>
+        </ol>
+        <aside class="example">
+          <p>
+            For example, if the |start URL| is
+            <samp>https://example.com/resources/start</samp>,
+            and the value of [=manifest/id=] is <samp>"1"</samp>, then the
+            processed result is <samp>https://example.com/1</samp>. If the
+            value of [=manifest/id=] is <samp>""</samp>, then the processed

"is "" **or "/"**"

> +          </li>
+          <li>If |id| is failure, return.
+          </li>
+          <li>Otherwise, set |manifest|["id"] to |start URL|.
+          </li>
+        </ol>
+        <aside class="example">
+          <p>
+            For example, if the |start URL| is
+            <samp>https://example.com/resources/start</samp>,
+            and the value of [=manifest/id=] is <samp>"1"</samp>, then the
+            processed result is <samp>https://example.com/1</samp>. If the
+            value of [=manifest/id=] is <samp>""</samp>, then the processed
+            result is <samp>https://example.com/</samp>. If [=manifest/id=]
+            not specified, then the processed result is
+            <samp>https://example.com/start</samp>.

This should be `https://example.com/resources/start`.


> @@ -1209,6 +1262,13 @@ <h3>
             Updating the manifest
           </h3>
           <aside class="issue" data-number="446"></aside>
+          <p>
+            User agents MAY re-fetch and update a manifest for an installed web

"MAY **, at any time,**"

Just to be clear that there is no specific circumstance that should trigger the update. It's allowed to happen whenever the UA wants.

> @@ -1209,6 +1262,13 @@ <h3>
             Updating the manifest
           </h3>
           <aside class="issue" data-number="446"></aside>
+          <p>
+            User agents MAY re-fetch and update a manifest for an installed web
+            application and apply updated manifest to its <a>application

This is a bit weird, mechanically. An "application context" is a transient thing (it corresponds to a browser window). An installed web application can have zero or more application contexts at any time, so there is no "its application context".

Also, "application context" doesn't technically belong to an "installed web application"; it is a browsing context that has a manifest applied to it. But the manifest is the thing that's being replaced. So let's see if we can get away with just assuming application contexts belong to installed web applications? ¯\\_(ツ)_/¯

"and apply the updated manifest to any current and future _application contexts_ associated with the application."

> @@ -1209,6 +1262,13 @@ <h3>
             Updating the manifest
           </h3>
           <aside class="issue" data-number="446"></aside>
+          <p>
+            User agents MAY re-fetch and update a manifest for an installed web
+            application and apply updated manifest to its <a>application
+             context</a>. When a manifest is processed, user agents can

Comparing it to what?

Also: "can determine" is non-committal as to what level of requirement this is for user agents. In general, avoid using the word "can" since it isn't an [RFC2119 term](https://datatracker.ietf.org/doc/html/rfc2119) and therefore has no formal meaning. Use "MAY", "MUST", etc.

I think MUST is appropriate here, since the whole point of id is that we don't want to update an app with a manifest with a different ID, even if the manifest has the same URL as before. However, that would be the only MUST that we're introducing in this PR, which may mean we're requiring UAs to change their existing behaviour. So let's start with a MUST and then run it by other browser vendors.

Perhaps rephrase as: "When the user agent fetches a manifest as part of an update check, it MUST compare the fetched processed manifest's `id` with the _identity_ of the app being updated. If it is not equal, the user agent MUST NOT update the installed application's manifest."

If necessary, we could weaken that to a "SHOULD" and "SHOULD NOT", which would mean existing browsers are still compliant without having to change their behaviour.

> @@ -1209,6 +1262,13 @@ <h3>
             Updating the manifest
           </h3>
           <aside class="issue" data-number="446"></aside>
+          <p>
+            User agents MAY re-fetch and update a manifest for an installed web
+            application and apply updated manifest to its <a>application
+             context</a>. When a manifest is processed, user agents can
+            determine whether it's an installed application that can be updated
+            or a new application by comparing application's [=manifest/id=].

Yeah I think leaving it intentionally vague is good. We don't want to impose any details on _how_ the update should work.

> @@ -808,6 +811,53 @@ <h3>
           </p>
         </section>
       </section>
+      <section>
+        <h3>
+          `id` member
+        </h3>
+        <p>
+          The [=manifest's=] <code><dfn data-export="" data-dfn-for=
+          "manifest">id</dfn></code> member is a <a>string</a> that
+          represents the <dfn>id</dfn> for the application.

Add "The _identity_ takes the form of a URL, which is _same origin_ as the _start URL_."

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/w3c/manifest/pull/988#pullrequestreview-708027905
Received on Friday, 16 July 2021 07:18:38 UTC