- From: Victor Costan <notifications@github.com>
- Date: Wed, 27 May 2020 17:14:13 -0700
- To: w3c/FileAPI <FileAPI@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <w3c/FileAPI/pull/154/review/417324644@github.com>
@pwnall commented on this pull request. Posting my current comments, since @inexorabletash made it to the end before me. Happy to take a second look after revisions, if you'd like me to. > -2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]]. +1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]]. + +1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]]. + +</div> + +Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of The storage API used to persist |serialized| needs a way to > -2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]]. +1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]]. + +1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]]. + +</div> + +Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of +modifying the read algorithms for deserialized blobs. I.e. a Blob that was `I.e.` -> `E.g.,` or `For example`, (I think the IndexedDB sentence here is an example of the above, not an elaboration.) > -1. Set |serialized|.\[[SnapshotState]] to |value|'s [=snapshot state=]. +A {{Blob}} has an associated <dfn export for=Blob>\[[data]]</dfn> internal slot, +a [=blob data description=]. internal slot, which holds/is a [=blob data description=] ? (I'm trying to avoid a reparse. Without the extra words, my first parse of the sentence is that there is an internal slot and a blob data description.) > -2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]]. +1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]]. + +1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]]. + +</div> + +Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of +modifying the read algorithms for deserialized blobs. I.e. a Blob that was +deserialized from IndexedDB should start throwing in its read steps after +clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize `Somehow` is redundant > -2. Set |value|'s underlying byte sequence to |serialized|.\[[ByteSequence]]. +1. Set |value|.[=[[data]]=] to |serialized|.\[[BlobData]]. + +1. Set |value|.[=[[type]]=] to |serialized|.\[[Type]]. + +</div> + +Issue(w3c/webappsec-clear-site-data#49): The actual storage API |serialized| was persisted in will need a way of +modifying the read algorithms for deserialized blobs. I.e. a Blob that was +deserialized from IndexedDB should start throwing in its read steps after +clear-site-data clears all IndexedDB data. Somehow let StructuredDeserialize +pass along a hook from the storage API to here? `?` -> `.` > @@ -269,88 +308,146 @@ which runs these steps: 1. Let |stream| be the result of [=construct a ReadableStream object|constructing=] a {{ReadableStream}} object. 1. Run the following steps [=in parallel=]: - 1. While not all bytes of |blob| have been read: - 1. Let |bytes| be the [=byte sequence=] that results from reading a [=chunk=] from |blob|. - 1. If a [=file read error=] occured while reading |bytes|, [$ReadableStream/error$] - |stream| with a [=failure reason=] and abort these steps. + 1. Let |blob data| be |blob|.[=[[data]]=]. + 1. Let |read state| be the result of calling |blob data|'s [=read initialization algorithm=] + given |blob data|'s [=snapshot state=] and 0. + If that threw an exception, [$ReadableStream/error$] |stream| with that exception + and abort these steps. + 1. While true: + 1. Let |read result| be the result of calling |blob data|'s [=read algorithm=] given |read state|. + 1. If that threw an exception (a [=file read error=]), `that` -> `reading` or `the previous step` ? > - Issue: We need to specify more concretely what reading from a Blob actually does, + Issue(144): We need to specify more concretely what reading from a Blob actually does, `We need to` and `actually` seem redundant > 1. [=ReadableStream/Enqueue=] a `Uint8Array` object wrapping an `ArrayBuffer` - containing |bytes| into |stream|. - If that threw an exception, [$ReadableStream/error$] |stream| with that exception - and abort these steps. + containing |read result| into |stream|. + If that threw an exception, [$ReadableStream/error$] |stream| with that exception Erroring the stream was a separate step above, but it's merged into the same step here. This isn't necessarily a problem, I'm pointing it out to ensure the inconsistency is intentional. > </div> -### Constructor Parameters ### {#constructorParams} +Conceptually a {{Blob}} represents a snapshot of some amount of data, and `some amount of data` -> `a sequence of bytes` My understanding is that Blobs may use different representations, but the data ultimately resolves to a sequence of bytes. > -<div algorithm="blob-constructor"> -The {{Blob()}} constructor can be invoked with zero or more parameters. -When the {{Blob()}} constructor is invoked, -user agents must run the following steps: +The data represented by a {{Blob}} is described by a <dfn export>blob data description</dfn>, +consisting of some representation of the data, combined with a set of algorithms to `some representation of the data` -> `a representation of a sequence of bytes` ? > -1. Let |bytes| be the result of [=processing blob parts=] given {{blobParts}} and {{Blob/Blob(blobParts, options)/options}}. +A [=blob data description=] has an associated <dfn export for="blob data">snapshot state</dfn>. +This is a [=map=] that represents the data stored in the blob. Are we using a map here because the snapshot state uses different representations for byte sequence-backed blobs vs multipart blobs? > - - <dt id="dfn-BlobPropertyBagMembers">An *optional* {{BlobPropertyBag}} - <dd>which takes these optional members: - * <dfn id="dfn-BPtype" dict-member for="BlobPropertyBag">type</dfn>, - the ASCII-encoded string in lower case representing the media type of the {{Blob}}. - Normative conditions for this member are provided in the [[#constructorBlob]]. - * <dfn dict-member for="BlobPropertyBag">endings</dfn>, - an enum which can take the values {{"transparent"}} or {{"native"}}. - By default this is set to {{"transparent"}}. If set to {{"native"}}, - [=convert line endings to native|line endings will be converted to native=] - in any {{USVString}} elements in {{blobParts}}. -</dl> +### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs} + +The [=snapshot state=] for a byte sequence backed blob contains a +<dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=]. (Disclaimer: I'm not versed in spec-ese) I would have expected `(map) entry` instead of `member` here. `The [=snapshot state=] ... has a "data" entry that stores a [=byte sequence=].` ? > + +<div algorithm="create bytes blob"> +To <dfn lt="creating blob data from bytes">create blob data from bytes</dfn> +given |bytes| (a [=byte sequence=]), run the following steps: + +1. Let |blob data| be a new [=blob data description=]. +1. Set |blob data|.[=snapshot state=][<code><a for="bytes blob snapshot state">"data"</a></code>] to |bytes|. + +1. Set |blob data|.[=read initialization algorithm=] to the [=bytes blob read initialization steps=]. +1. Set |blob data|.[=read algorithm=] to the [=bytes blob read steps=]. + +1. Return |blob data|. + +</div> + +A <dfn>bytes blob read state</dfn> is a [=struct=] conssting of `conssting` -> `consisting` (typo) or `a [=struct=] that has a single <dfn for="bytes blob read state">bytes</dfn> member, which is a [=byte sequence=]` > +given |read state| (a [=bytes blob read state=]) are: + +1. Let |result| be |read state|.[=bytes blob read state/bytes=]. +1. Set |read state|.[=bytes blob read state/bytes=] to an empty [=byte sequence=]. +1. If |result| is empty, return [=end of blob=]. +1. Return |result|. + +</div> + +### Multipart blobs ### {#multipart-blobs} + +Blobs created by the {{Blob}} and {{File}} constructors are made up of multiple parts, +where each part could be a blob itself. + +The [=snapshot state=] for a multipart blob contains a +<dfn for="multipart blob snapshot state"><code>"parts"</code></dfn> member, same comment as above regarding `member` vs `(map) entry` > +1. For each |element| in |blobParts|: 1. If |element| is a {{USVString}}, run the following substeps: 1. Let |s| be |element|. (not introduced by this PR) I think it would be nice to replace `|s|` with `|str|`. This would avoid [=UTF-8 encoding=] |s| which reads like it could be a possessive pronoun. > - - <dt id="dfn-BlobPropertyBagMembers">An *optional* {{BlobPropertyBag}} - <dd>which takes these optional members: - * <dfn id="dfn-BPtype" dict-member for="BlobPropertyBag">type</dfn>, - the ASCII-encoded string in lower case representing the media type of the {{Blob}}. - Normative conditions for this member are provided in the [[#constructorBlob]]. - * <dfn dict-member for="BlobPropertyBag">endings</dfn>, - an enum which can take the values {{"transparent"}} or {{"native"}}. - By default this is set to {{"transparent"}}. If set to {{"native"}}, - [=convert line endings to native|line endings will be converted to native=] - in any {{USVString}} elements in {{blobParts}}. -</dl> +### Byte Sequence backed blobs ### {#byte-sequence-backed-blobs} + +The [=snapshot state=] for a byte sequence backed blob contains a +<dfn for="bytes blob snapshot state"><code>"data"</code></dfn> member, a [=byte sequence=]. WDYT about renaming the `"data"` map entry to `"bytes"` to avoid the naming conflict with the `[[data]]` internal slot? -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/w3c/FileAPI/pull/154#pullrequestreview-417324644
Received on Thursday, 28 May 2020 00:14:26 UTC