Re: [whatwg/storage] Clarify storage infrastructure (#86)

@domenic commented on this pull request.



> +
+<h3 id=storage-units>Storage units</h3>
+
+<p>A <dfn>storage type</dfn> is "<code>storage</code>" or "<code>session-storage</code>".
+
+<p>A <dfn>storage key</dfn> is an <a for=/>origin</a>. [[HTML]]
+
+<p class=XXX>This is expected to change, see
+<a href="https://privacycg.github.io/storage-partitioning/">Client-Side Storage Partitioning</a>.
+
+<p>A <dfn id=site-storage-unit>storage unit</dfn> is the least granular unit of storage, other than
+the user agent. It holds a <dfn for="storage unit">map</dfn>, which is a <a for=/>map</a> of
+<a for=/>ASCII strings</a> to <a>abstract storage buckets</a>.
+
+<p>A <dfn>storage map</dfn> is a <a for=/>map</a> of <a>storage keys</a> to <a>storage units</a>. It
+is initially empty.

I find the maps here really, really confusing.

- One type name is just "map", owned by a storage unit.

- One type name is "storage map", owned by the user agent (but you have to go much further to find that) and a browsing session.

- One type name is just "map" again, owned by an abstract storage bucket.

- One type name is just "map" again, but it's an Infra map this time, instead of a separate `<dfn>`. It's owned by an area.

- One type name is "storage proxy map". It seems this is owned by an area?

---

Suggestions:

- Give each distinct type a distinct name. E.g. maybe a storage unit holds a "bucket map"?
- Expand on the meanings of each type using informative prose. E.g. "A storage unit holds a bucket map, which is a map of ASCII strings representing bucket names to abstract storage buckets".
- Give the two "storage map" fields distinct names. E.g. "A user agent holds a persistent storage map, which is a storage map" + "A browsing session holds a session storage map, which is a storage map".
- Reconsider where concepts are defined. The "storage units" section has a lot of concept definitions that do not seem immediately connected to a storage unit.
  - This may necessitate using terms before they are defined. That's OK. In fact, defining the hierarchy top-to-bottom would be the most readable to me. E.g. if the first section was "storage maps" (containing the definitions for "storage key", "storage map", "user agent's storage map", and "browsing session's storage map"), then it would refer to the storage unit concept. I'd want to find out what that is, which I would very naturally do by reading on to the next section. Then that next section would reference "storage bucket", which I'd read on to the next section to learn about... etc.
  - The spec as a whole might benefit from a "Model" section that gives the whole thing from top to bottom. Similar to the hierarchy outlined in https://github.com/whatwg/storage/pull/86#issuecomment-623214780 (but with more detail).
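
A minimal sketch of what such a top-to-bottom model could look like, written as Python dataclasses. Every class and field name here (`bucket_map`, `area_map`, `data_map`, `persistent_storage_map`, `session_storage_map`) is hypothetical and simply follows the renaming suggestions above; none of them are names from the PR.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageArea:
    # The plain Infra map that holds the actual stored data (hypothetical name).
    data_map: Dict[str, str] = field(default_factory=dict)


@dataclass
class StorageBucket:
    # "area map": storage identifier -> area (hypothetical name).
    area_map: Dict[str, StorageArea] = field(default_factory=dict)


@dataclass
class StorageUnit:
    # "bucket map": bucket name -> bucket (hypothetical name).
    bucket_map: Dict[str, StorageBucket] = field(default_factory=dict)


# A "storage map" maps storage keys (origins, for now) to storage units.
StorageMap = Dict[str, StorageUnit]


@dataclass
class UserAgent:
    persistent_storage_map: StorageMap = field(default_factory=dict)


@dataclass
class BrowsingSession:
    session_storage_map: StorageMap = field(default_factory=dict)


# Walking the hierarchy from the top down, one distinct name per level.
ua = UserAgent()
unit = StorageUnit()
unit.bucket_map["default"] = StorageBucket()
unit.bucket_map["default"].area_map["exampleApi"] = StorageArea()
ua.persistent_storage_map["https://example.com"] = unit

area = (
    ua.persistent_storage_map["https://example.com"]
    .bucket_map["default"]
    .area_map["exampleApi"]
)
area.data_map["k"] = "v"
```

With one name per level, the chain of lookups reads the same way the prose would: storage map, then bucket map, then area map, then the data itself.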

> -<p>This standard primarily concerns itself with <dfn export id=site-storage>storage</dfn>.
+<p>This standard primarily concerns itself with storage.
+
+
+<h3 id=storage-units>Storage units</h3>
+
+<p>A <dfn>storage type</dfn> is "<code>storage</code>" or "<code>session-storage</code>".
+
+<p>A <dfn>storage key</dfn> is an <a for=/>origin</a>. [[HTML]]
+
+<p class=XXX>This is expected to change, see
+<a href="https://privacycg.github.io/storage-partitioning/">Client-Side Storage Partitioning</a>.
+
+<p>A <dfn id=site-storage-unit>storage unit</dfn> is the least granular unit of storage, other than
+the user agent. It holds a <dfn for="storage unit">map</dfn>, which is a <a for=/>map</a> of
+<a for=/>ASCII strings</a> to <a>abstract storage buckets</a>.

Why ASCII?

Are these strings user-defined or UA-defined? Where do they come from in general?

> + <li><p>If <var>key</var> is an <a>opaque origin</a>, then return failure.
+
+ <li><p>If the user has disabled storage, then return failure.
+
+ <li>
+  <p>If <var>map</var>[<var>key</var>] does not <a for=map>exist</a>, then:
+
+  <ol>
+   <li><p>Let <var>unit</var> be a new <a>storage unit</a>.
+
+   <li>
+    <p>Set <var>unit</var>'s <a for="storage unit">map</a>["<code>default</code>"] to the result of
+    <a>create a storage bucket</a> with <var>type</var>.
+
+    <p class="note">For now "<code>default</code>" is all that exists. See
+    <a href="https://github.com/whatwg/storage/issues/2">issue #2</a>.

This note could move to the definition of "bucket map", as part of the general comment above about explaining that concept more when defining it.
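
For orientation, a rough Python sketch of the quoted steps, including how the "default" bucket gets populated. The endpoint registry, its identifiers, the use of `None` for both an opaque origin and for failure, and the final "store and return the unit" steps are all assumptions made for illustration; the quoted hunk does not show them.

```python
from typing import Dict, Optional, Set

# Hypothetical registry: storage identifier -> set of storage types.
REGISTERED_ENDPOINTS: Dict[str, Set[str]] = {
    "exampleLocalApi": {"storage"},
    "exampleSessionApi": {"session-storage"},
}


def create_storage_bucket(storage_type: str) -> Dict[str, dict]:
    """Return a bucket: one new, empty area per endpoint of the given type."""
    return {
        identifier: {}
        for identifier, types in REGISTERED_ENDPOINTS.items()
        if storage_type in types
    }


def obtain_storage_unit(
    storage_map: Dict[str, dict],
    key: Optional[str],
    storage_type: str,
    storage_disabled: bool = False,
) -> Optional[dict]:
    """Return the unit for `key`, creating it on first use. None means failure."""
    if key is None:  # None stands in for an opaque origin in this sketch
        return None
    if storage_disabled:
        return None
    if key not in storage_map:
        # For now "default" is the only bucket name that exists.
        unit = {"default": create_storage_bucket(storage_type)}
        storage_map[key] = unit  # assumed step, not shown in the quoted hunk
    return storage_map[key]


storage_map: Dict[str, dict] = {}
unit = obtain_storage_unit(storage_map, "https://example.com", "storage")
```

Seen this way, the "default" note describes the bucket map as a whole rather than this one algorithm step, which is why it seems to belong with the definition.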

> +<a>top-level browsing context</a>, except that it survives the <a>top-level browsing context</a>
+being replaced due to a cross-origin opener policy.
+
+
+<h3 id=storage-endpoints>Storage endpoints</h3>
+
+<p>A <dfn export>storage endpoint</dfn> is a storage or session storage API that uses the
+infrastructure defined by this standard to keep track of its storage needs.
+
+<p>A <a>storage endpoint</a> has an <dfn for="storage endpoint">identifier</dfn>, which is a
+<a>storage identifier</a>.
+
+<p>A <a>storage endpoint</a> also has <dfn for="storage endpoint">types</dfn>, which is a
+<a for=/>set</a> of <a>storage types</a>.
+
+<p>A <dfn>storage identifier</dfn> is an <a for=/>ASCII string</a>.

Why ASCII? Is there something that plans on serializing this into places only ASCII is allowed?

In general, it would be good to state where these strings are used. Are they spec-internal? (I think no; I think they show up in quota APIs?)

> +
+<p class=XXX>Browsing session is yet to be formally defined. For all intents and purposes it is a
+<a>top-level browsing context</a>, except that it survives the <a>top-level browsing context</a>
+being replaced due to a cross-origin opener policy.
+
+
+<h3 id=storage-endpoints>Storage endpoints</h3>
+
+<p>A <dfn export>storage endpoint</dfn> is a storage or session storage API that uses the
+infrastructure defined by this standard to keep track of its storage needs.
+
+<p>A <a>storage endpoint</a> has an <dfn for="storage endpoint">identifier</dfn>, which is a
+<a>storage identifier</a>.
+
+<p>A <a>storage endpoint</a> also has <dfn for="storage endpoint">types</dfn>, which is a
+<a for=/>set</a> of <a>storage types</a>.

Why a set?

>  
-<p>Each <a for=/>origin</a> has an associated <a>storage unit</a>. A <a>storage unit</a> contains a
-single <dfn export id=bucket oldids=box>bucket</dfn>. [[HTML]]
+<p>How an <a>abstract storage bucket</a>'s <a for="abstract storage bucket">area</a>'s
+<a for="abstract storage bucket/area">map</a> is stored is <a>implementation-defined</a>. How it is
+made available across <a>agent</a> or even <a>agent cluster</a> boundaries is
+<a>implementation-defined</a>.

This is a bit confusing. Upon re-reading it's clear, but at first I misread it as "_Whether_ it is made available across agent or even agent cluster boundaries is implementation-defined".

I'd rephrase this section as something like

> An area's map is where the actual data meant to be stored lives. User agents are expected to store this data, and make it available across agents and agent clusters, in an implementation-defined manner, so that when this specification (or other specifications?) accesses the contents of the map, they are available.

>  
-<h3 id=buckets oldids=boxes>Buckets</h3>
+<p>A <dfn id=bucket oldids=box>storage bucket</dfn> is an <a>abstract storage bucket</a> for storage
+APIs.

This is confusing because "session storage APIs" sounds like a subset of "storage APIs". I think it'd be good to have two distinct terms. (Maybe changing the `"storage"` and `"session-storage"` types as well.)

> +
+ <li><p><a for=set>For each</a> <var>endpoint</var> of <a>registered storage endpoints</a> whose
+ <a for="storage endpoint">types</a> <a for=set>contain</a> <var>type</var>, set <var>bucket</var>'s
+ <a for="storage bucket">map</a>[<var>endpoint</var>'s <a for="storage endpoint">identifier</a>] to
+ a new <a for="storage bucket">area</a>.
+
+ <li><p>Return <var>bucket</var>.
+</ol>
+
+
+<h3 id=storage-proxy-maps>Storage proxy maps</h3>
+
+<p>A <dfn>storage proxy map</dfn> is equivalent to a <a for=/>map</a>, except that all operations
+are instead performed on its <dfn for="storage proxy map">backing map</dfn>.
+
+<p class="note">This allows for the <a for="storage proxy map">backing map</a> to be replaced.

When is replacing the map necessary? It'd be good to state that to motivate this somewhat-strange architecture.
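
To make the question concrete, here is a minimal Python sketch of the indirection as I read the definition: every operation on the proxy is forwarded to the backing map, so swapping the backing map changes what existing holders of the proxy see. The class name and the swap at the end are illustrative assumptions; the PR does not say when a replacement actually happens.

```python
class StorageProxyMap:
    """Map-like object that performs every operation on its backing map."""

    def __init__(self, backing_map: dict):
        self.backing_map = backing_map

    def __getitem__(self, key):
        return self.backing_map[key]

    def __setitem__(self, key, value):
        self.backing_map[key] = value

    def __delitem__(self, key):
        del self.backing_map[key]

    def __contains__(self, key):
        return key in self.backing_map

    def __len__(self):
        return len(self.backing_map)

    def __iter__(self):
        return iter(self.backing_map)


original = {"theme": "dark"}
proxy = StorageProxyMap(original)
proxy["lang"] = "en"  # written through to `original`

# Replace the backing map. Code that kept a reference to `proxy` now
# observes the new, empty map without being handed a new object.
proxy.backing_map = {}
```

If that is the intended benefit, a sentence naming the situation that triggers the replacement would be enough to motivate the design.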

> + <li><p>Let <var>unit</var> be the result of running <a>obtain a storage unit</a>, with
+ <var>map</var>, <var>environment</var>, and <var>type</var>.
+
+ <li><p>If <var>unit</var> is failure, then return failure.
+
+ <li><p>Let <var>bucket</var> be <var>unit</var>'s
+ <a for="storage unit">map</a>["<code>default</code>"].
+
+ <li><p>Let <var>area</var> be <var>bucket</var>'s
+ <a for="storage bucket">map</a>[<var>identifier</var>].
+
+ <li><p>Let <var>proxyMap</var> be a new <a>storage proxy map</a> whose
+ <a for="storage proxy map">backing map</a> is <var>area</var>'s
+ <a for="storage bucket/area">map</a>.
+
+ <li><p>Append a reference to <var>proxyMap</var> to <var>area</var>'s

Link append?

"a reference to" seems redundant and a bit confusing?

> + <a for="storage unit">map</a>["<code>default</code>"].
+
+ <li><p>Let <var>area</var> be <var>bucket</var>'s
+ <a for="storage bucket">map</a>[<var>identifier</var>].
+
+ <li><p>Let <var>proxyMap</var> be a new <a>storage proxy map</a> whose
+ <a for="storage proxy map">backing map</a> is <var>area</var>'s
+ <a for="storage bucket/area">map</a>.
+
+ <li><p>Append a reference to <var>proxyMap</var> to <var>area</var>'s
+ <a for="storage bucket/area">proxy map reference set</a>.
+
+ <li><p>Return <var>proxyMap</var>.
+</ol>
+
+<p>To <dfn export>obtain a storage bucket area map</dfn>, given an

The above-mentioned intro/model section I suggested should prominently point out that these are the entry points specifications will usually use.
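
As a rough illustration of what a caller of such an entry point gets back, here is a Python sketch of the quoted steps. It is simplified: the unit is passed in already obtained (with `None` standing for failure), the proxy map reference set is modeled as a list, and all names (`Area`, `data`, `proxy_maps`, `"exampleApi"`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class StorageProxyMap:
    """Forwards reads and writes to a replaceable backing map."""

    def __init__(self, backing_map: dict):
        self.backing_map = backing_map

    def __getitem__(self, key):
        return self.backing_map[key]

    def __setitem__(self, key, value):
        self.backing_map[key] = value


@dataclass
class Area:
    data: dict = field(default_factory=dict)
    # Stand-in for the "proxy map reference set".
    proxy_maps: List[StorageProxyMap] = field(default_factory=list)


Bucket = Dict[str, Area]
Unit = Dict[str, Bucket]


def obtain_storage_bucket_area_map(
    unit: Optional[Unit], identifier: str
) -> Optional[StorageProxyMap]:
    """Return a proxy over the endpoint's area map, or None on failure."""
    if unit is None:
        return None
    bucket = unit["default"]
    area = bucket[identifier]
    proxy_map = StorageProxyMap(area.data)
    area.proxy_maps.append(proxy_map)  # the area keeps track of the proxy
    return proxy_map


unit: Unit = {"default": {"exampleApi": Area()}}
proxy = obtain_storage_bucket_area_map(unit, "exampleApi")
proxy["a"] = "1"
```

A specification using the infrastructure would only ever touch the returned proxy, which is what makes these algorithms the natural thing to highlight in a model section.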

View this review on GitHub:
https://github.com/whatwg/storage/pull/86#pullrequestreview-405120181

Received on Monday, 4 May 2020 16:10:20 UTC