Re: [w3c/FileAPI] Rewrite a lot of the internals of Blobs to make reading more explicit. (#154)

@pwnall commented on this pull request.

Posting my current comments, since @inexorabletash made it to the end before me. Happy to take a second look after revisions, if you'd like me to.

>  
-2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]].
+1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]].
+
+1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]].
+
+</div>
+
+Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of

The storage API used to persist |serialized| needs a way to

>  
-2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]].
+1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]].
+
+1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]].
+
+</div>
+
+Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of
+modifying the read algorithms for deserialized blobs. I.e. a Blob that was

`I.e.` -> `E.g.,` or `For example`,

(I think the IndexedDB sentence here is an example of the above, not an elaboration.)

>  
-1. Set |serialized|.\[[SnapshotState]] to |value|'s [=snapshot state=].
+A {{Blob}} has an associated <dfn export for=Blob>\[[data]]</dfn> internal slot,
+a [=blob data description=].

internal slot, which holds/is a [=blob data description=] ?

(I'm trying to avoid a reparse. Without the extra words, my first parse of the sentence is that there is an internal slot and a blob data description.)

>  
-2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]].
+1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]].
+
+1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]].
+
+</div>
+
+Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of
+modifying the read algorithms for deserialized blobs. I.e. a Blob that was
+deserialized from IndexedDB should start throwing in its read steps after
+clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize

`Somehow` is redundant

>  
-2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]].
+1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]].
+
+1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]].
+
+</div>
+
+Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of
+modifying the read algorithms for deserialized blobs. I.e. a Blob that was
+deserialized from IndexedDB should start throwing in its read steps after
+clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize
+pass along a hook from the storage API to here?

`?` -> `.`

> @@ -269,88 +308,146 @@ which runs these steps:
 1. Let |stream| be the result of [=construct a ReadableStream object|constructing=] a
    {{ReadableStream}} object.
 1. Run the following steps [=in parallel=]:
-  1. While not all bytes of |blob| have been read:
-    1. Let |bytes| be the [=byte sequence=] that results from reading a [=chunk=] from |blob|.
-    1. If a [=file read error=] occured while reading |bytes|, [$ReadableStream/error$]
-       |stream| with a [=failure reason=] and abort these steps.
+  1. Let |blob data| be |blob|.[=[[data]]=].
+  1. Let |read state| be the result of calling |blob data|'s [=read initialization algorithm=]
+    given |blob data|'s [=snapshot state=] and 0.
+    If that threw an exception, [$ReadableStream/error$] |stream| with that exception
+    and abort these steps.
+  1. While true:
+    1. Let |read result| be the result of calling |blob data|'s [=read algorithm=] given |read state|.
+    1. If that threw an exception (a [=file read error=]),

`that` -> `reading` or `the previous step` ?

>  
-    Issue: We need to specify more concretely what reading from a Blob actually does,
+    Issue(144): We need to specify more concretely what reading from a Blob actually does,

`We need to` and `actually` seem redundant

>      1. [=ReadableStream/Enqueue=] a `Uint8Array` object wrapping an `ArrayBuffer`
-       containing |bytes| into |stream|.
-       If that threw an exception, [$ReadableStream/error$] |stream| with that exception
-       and abort these steps.
+      containing |read result| into |stream|.
+      If that threw an exception, [$ReadableStream/error$] |stream| with that exception

Erroring the stream was a separate step above, but it's merged into the same step here. This isn't necessarily a problem, I'm pointing it out to ensure the inconsistency is intentional.

>  
 </div>
 
-### Constructor Parameters ### {#constructorParams}
+Conceptually a {{Blob}} represents a snapshot of some amount of data, and

`some amount of data` -> `a sequence of bytes`

My understanding is that Blobs may use different representations, but the data ultimately resolves to a sequence of bytes.

>  
-<div algorithm="blob-constructor">
-The {{Blob()}} constructor can be invoked with zero or more parameters.
-When the {{Blob()}} constructor is invoked,
-user agents must run the following steps:
+The data represented by a {{Blob}} is described by a <dfn export>blob data description</dfn>,
+consisting of some representation of the data, combined with a set of algorithms to

`some representation of the data` -> `a representation of a sequence of bytes` ?

>  
-1. Let |bytes| be the result of [=processing blob parts=] given {{blobParts}} and {{Blob/Blob(blobParts, options)/options}}.
+A [=blob data description=] has an associated <dfn export for="blob data">snapshot state</dfn>.
+This is a [=map=] that represents the data stored in the blob.

Are we using a map here because the snapshot state uses different representations for byte sequence-backed blobs vs multipart blobs?

> -
-  <dt id="dfn-BlobPropertyBagMembers">An *optional* {{BlobPropertyBag}}
-  <dd>which takes these optional members:
-    * <dfn id="dfn-BPtype" dict-member for="BlobPropertyBag">type</dfn>,
-      the ASCII-encoded string in lower case representing the media type of the {{Blob}}.
-      Normative conditions for this member are provided in the [[#constructorBlob]].
-    * <dfn dict-member for="BlobPropertyBag">endings</dfn>,
-      an enum which can take the values {{"transparent"}} or {{"native"}}.
-      By default this is set to {{"transparent"}}. If set to {{"native"}},
-      [=convert line endings to native|line endings will be converted to native=]
-      in any {{USVString}} elements in {{blobParts}}.
-</dl>
+### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs}
+
+The [=snapshot state=] for a byte sequence backed blob contains a
+<dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=].

(Disclaimer: I'm not versed in spec-ese) I would have expected `(map) entry` instead of `member` here.

`The [=snapshot state=] ... has a "data" entry that stores a [=byte sequence=].` ?

> +
+<div algorithm="create bytes blob">
+To <dfn lt="creating blob data from bytes">create blob data from bytes</dfn>
+given |bytes| (a [=byte sequence=]), run the following steps:
+
+1. Let |blob data| be a new [=blob data description=].
+1. Set |blob data|.[=snapshot state=][<code><a for="bytes blob snapshot state">"data"</a></code>] to |bytes|.
+
+1. Set |blob data|.[=read initialization algorithm=] to the [=bytes blob read initialization steps=].
+1. Set |blob data|.[=read algorithm=] to the [=bytes blob read steps=].
+
+1. Return |blob data|.
+
+</div>
+
+A <dfn>bytes blob read state</dfn> is a [=struct=] conssting of

`conssting` -> `consisting` (typo)

or `a [=struct=] that has a single <dfn for="bytes blob read state">bytes</dfn> member, which is a [=byte sequence=]`

> +given |read state| (a [=bytes blob read state=]) are:
+
+1. Let |result| be |read state|.[=bytes blob read state/bytes=].
+1. Set |read state|.[=bytes blob read state/bytes=] to an empty [=byte sequence=].
+1. If |result| is empty, return [=end of blob=].
+1. Return |result|.
+
+</div>
+
+### Multipart blobs ### {#multipart-blobs}
+
+Blobs created by the {{Blob}} and {{File}} constructors are made up of multiple parts,
+where each part could be a blob itself.
+
+The [=snapshot state=] for a multipart blob contains a
+<dfn for="multipart blob snapshot state"><code>"parts"</code></dfn> member,

same comment as above regarding `member` vs `(map) entry`

>  
+1. For each |element| in |blobParts|:
   1. If |element| is a {{USVString}}, run the following substeps:
 
     1. Let |s| be |element|.

(not introduced by this PR)

I think it would be nice to replace `|s|` with `|str|`. This would avoid [=UTF-8 encoding=] |s| which reads like it could be a possessive pronoun.

> -
-  <dt id="dfn-BlobPropertyBagMembers">An *optional* {{BlobPropertyBag}}
-  <dd>which takes these optional members:
-    * <dfn id="dfn-BPtype" dict-member for="BlobPropertyBag">type</dfn>,
-      the ASCII-encoded string in lower case representing the media type of the {{Blob}}.
-      Normative conditions for this member are provided in the [[#constructorBlob]].
-    * <dfn dict-member for="BlobPropertyBag">endings</dfn>,
-      an enum which can take the values {{"transparent"}} or {{"native"}}.
-      By default this is set to {{"transparent"}}. If set to {{"native"}},
-      [=convert line endings to native|line endings will be converted to native=]
-      in any {{USVString}} elements in {{blobParts}}.
-</dl>
+### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs}
+
+The [=snapshot state=] for a byte sequence backed blob contains a
+<dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=].

WDYT about renaming the `"data"` map entry to `"bytes"` to avoid the naming conflict with the `[[data]]` internal slot?

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/w3c/FileAPI/pull/154#pullrequestreview-417324644

Received on Thursday, 28 May 2020 00:14:26 UTC