- From: Joshua Bell <notifications@github.com>
- Date: Wed, 27 May 2020 16:29:06 -0700
- To: w3c/FileAPI <FileAPI@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <w3c/FileAPI/pull/154/review/419541114@github.com>
@inexorabletash approved this pull request.
I probably didn't catch everything, but I made it to the end!
>
<div algorithm="process-blob-parts">
-To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> given a sequence of {{BlobPart}}'s |parts|
-and {{BlobPropertyBag}} |options|,
+To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn>
+given a sequence of {{BlobPart}}'s |blobParts| and {{BlobPropertyBag}} |options|,
Nit: `{{BlobPart}}s` (plural s, not a possessive s)
>
<div algorithm="process-blob-parts">
-To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> given a sequence of {{BlobPart}}'s |parts|
-and {{BlobPropertyBag}} |options|,
+To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn>
+given a sequence of {{BlobPart}}'s |blobParts| and {{BlobPropertyBag}} |options|,
Also, maybe list rather than sequence? (I'm hazy on when to switch from IDL terms to Infra terms)
>
-1. For each |element| in |parts|:
+1. Let |bytes| be an empty [=byte sequence=].
This algorithm concatenates adjacent non-blob parts rather than just ending up with more parts. Although this probably matches implementations, it doesn't seem to be observable and makes the algorithm more complicated. Is there a good reason for it?
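To make the observability question concrete, here is a minimal TypeScript sketch, not the PR's algorithm. The `Part` type and the functions `keepSeparate`, `coalesce`, and `readAll` are hypothetical names introduced for illustration, and byte sequences are modelled as plain `number[]` arrays. It suggests that merging adjacent non-blob parts changes only the number of stored parts, not the bytes a reader sees.

```typescript
// Hypothetical model: a part is either raw bytes (string/BufferSource input,
// already converted) or the bytes of a nested Blob.
type Part = { kind: "bytes" | "blob"; bytes: number[] };

// Simpler variant: every input part becomes its own stored part.
function keepSeparate(parts: Part[]): number[][] {
  return parts.map((p) => p.bytes.slice());
}

// Coalescing variant: adjacent non-blob parts are merged into one stored part.
function coalesce(parts: Part[]): number[][] {
  const out: number[][] = [];
  let pending: number[] | null = null;
  for (const p of parts) {
    if (p.kind === "bytes") {
      pending = (pending === null ? [] : pending).concat(p.bytes);
    } else {
      if (pending !== null) {
        out.push(pending); // flush merged bytes before the blob part
        pending = null;
      }
      out.push(p.bytes.slice());
    }
  }
  if (pending !== null) out.push(pending);
  return out;
}

// Reading a blob yields the concatenation of its stored parts.
function readAll(chunks: number[][]): number[] {
  return chunks.reduce((acc, c) => acc.concat(c), [] as number[]);
}

const sampleParts: Part[] = [
  { kind: "bytes", bytes: [1, 2] },
  { kind: "bytes", bytes: [3] },
  { kind: "blob", bytes: [4, 5] },
  { kind: "bytes", bytes: [6] },
];
```

With `sampleParts`, `keepSeparate` stores four parts and `coalesce` stores three, but `readAll` returns the same six bytes for both.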
> @@ -361,16 +458,179 @@ run the following steps:
1. If |element| is a {{BufferSource}}, <a lt="get a copy of the buffer source">get
Not new, but would be more readable with "Otherwise, if ..."
> @@ -361,16 +458,179 @@ run the following steps:
1. If |element| is a {{BufferSource}}, <a lt="get a copy of the buffer source">get
a copy of the bytes held by the buffer source</a>, and append those bytes to |bytes|.
- 1. If |element| is a {{Blob}},
- append the bytes it represents to |bytes|.
+ 1. If |element| is a {{Blob}}:
"Otherwise, if ..."
> +1. Set |result|.[=read algorithm=] to the [=multipart blob read steps=].
+
+1. Return |result|.
+
+</div>
+
+A <dfn>multipart blob read state</dfn> is a [=struct=] consisting of:
+
+<dl dfn-for="multipart blob read state">
+: <dfn>parts</dfn>
+:: A [=queue=] of [=blob data descriptions=], representing the not yet read parts of the blob.
+: <dfn>offset</df>
+:: A number, representing the byte offset in the remaining blob parts
+ from which to start returning data.
+: <dfn>nested blob data</dfn>
+:: `undefined` or a [=blob data description=]. This is `undefined` unless otherwise specified.
I don't think we typically style undefined (or null, true, false) as code?
(This is likely not consistent across specs.)
> +
+## Constructors ## {#constructorBlob}
+
+
+<div algorithm="blob-constructor">
+The <dfn constructor for=Blob lt="Blob(blobParts, options)|Blob(blobParts)|Blob()"><code>new Blob(|blobParts|, |options|)</code></dfn> constructor steps are:
+
+1. Let |blob data| be the result of [=processing blob parts=] given |blobParts| and |options|.
+1. Set [=this=].[=[[data]]=] to |blob data|.
+
+1. Let |type| be an empty string.
+1. If the {{BlobPropertyBag/type}} member of the {{Blob/Blob(blobParts, options)/options}} argument is not the empty string,
+ run the following sub-steps:
+
+ 1. Let |type| be the {{BlobPropertyBag/type}} dictionary member.
+ If |type| contains any characters outside the range U+0020 to U+007E,
This could be a separate substep
> +
+
+<div algorithm="blob-constructor">
+The <dfn constructor for=Blob lt="Blob(blobParts, options)|Blob(blobParts)|Blob()"><code>new Blob(|blobParts|, |options|)</code></dfn> constructor steps are:
+
+1. Let |blob data| be the result of [=processing blob parts=] given |blobParts| and |options|.
+1. Set [=this=].[=[[data]]=] to |blob data|.
+
+1. Let |type| be an empty string.
+1. If the {{BlobPropertyBag/type}} member of the {{Blob/Blob(blobParts, options)/options}} argument is not the empty string,
+ run the following sub-steps:
+
+ 1. Let |type| be the {{BlobPropertyBag/type}} dictionary member.
+ If |type| contains any characters outside the range U+0020 to U+007E,
+ then set |type| to the empty string and return from these substeps.
+ 1. Convert every character in |type| to [=ASCII lowercase=].
This could be an "Otherwise,..." and then "return from these substeps" wouldn't be needed.
> - Note: Use of the {{Blob/type}} attribute informs the [=package data=] algorithm
- and determines the `Content-Type` header when [=/fetching=] [=blob URLs=].
-</dl>
+</div>
+
+The <dfn attribute for=Blob id="dfn-size">size</dfn> getter steps are to return [=this=].[=[[data]]=].[=blob data/size=].
+
+<div class="note domintro">
+: |blob| . {{Blob/type}}
+:: The ASCII-encoded string in lower case representing the media type of the {{Blob}},
+ or an empty string if the type cannot be determined.
+
+ The {{Blob/type}} attribute can be set by the web application itself through constructor invocation
+ and through the {{Blob/slice()}} call;
+
+ Note: The type of a {{Blob}} is considered a <a>parsable MIME type</a>,
Not new, but grammar here is awkward. Since this is non-normative, it can also be simplified, e.g.:
Note: The type of a Blob is considered a _parsable MIME type_ [no comma] if performing the _parse a MIME type_ algorithm on [not to] the Blob object's type (as an ASCII-encoded byte sequence) does not return failure.
> +
+1. Let |relativeStart| be 0.
+1. If |start| is not `undefined`:
+ 1. If |start| < 0, set |relativeStart| to <code>max([=this=].[=[[data]]=].[=blob data/size=] + |start|, 0)</code>.
+ 1. Otherwise, set |relativeStart| to <code>min(|start|, [=this=].[=[[data]]=].[=blob data/size=])</code>.
+
+1. Let |relativeEnd| be [=this=].[=[[data]]=].[=blob data/size=].
+1. If |end| is not `undefined`:
+ 1. If |end| < 0, set |relativeEnd| to <code>max([=this=].[=[[data]]=].[=blob data/size=] + |end|, 0)</code>.
+ 1. Otherwise, set |relativeEnd| to <code>min(|end|, [=this=].[=[[data]]=].[=blob data/size=])</code>.
+
+1. Let |span| be <code>max((relativeEnd - relativeStart), 0)</code>.
+
+1. Let |relativeContentType| be an empty string.
+1. If |contentType| is not `undefined`:
+ 1. If |contentType| does not contain any characters outside the range of U+0x0020 to U+0x007E:
There are enough occurrences of "if not ASCII, empty string. Otherwise, ASCII-lowercase it" that a separate algorithm seems like a good idea.
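A rough TypeScript sketch of what such a factored-out algorithm could look like, together with the slice range computation quoted above. The names `normalizeTypeString`, `computeSliceRange`, and `SliceRange` are hypothetical and appear in neither the spec nor the PR. `size` stands in for the blob data size, and arguments are assumed to be already converted from IDL.

```typescript
// Hypothetical helper: if the string contains anything outside
// U+0020..U+007E, return the empty string; otherwise ASCII-lowercase it.
function normalizeTypeString(type: string): string {
  for (let i = 0; i < type.length; i++) {
    const unit = type.charCodeAt(i);
    if (unit < 0x20 || unit > 0x7e) return "";
  }
  // The string is ASCII-only here, so toLowerCase() only affects A-Z.
  return type.toLowerCase();
}

interface SliceRange {
  relativeStart: number;
  span: number;
  contentType: string;
}

// Mirrors the quoted steps: negative offsets count from the end,
// and everything is clamped to [0, size].
function computeSliceRange(
  size: number,
  start?: number,
  end?: number,
  contentType?: string
): SliceRange {
  let relativeStart = 0;
  if (start !== undefined) {
    relativeStart = start < 0 ? Math.max(size + start, 0) : Math.min(start, size);
  }
  let relativeEnd = size;
  if (end !== undefined) {
    relativeEnd = end < 0 ? Math.max(size + end, 0) : Math.min(end, size);
  }
  const span = Math.max(relativeEnd - relativeStart, 0);
  const relativeContentType =
    contentType === undefined ? "" : normalizeTypeString(contentType);
  return { relativeStart, span, contentType: relativeContentType };
}
```

For a 10-byte blob, `computeSliceRange(10, -3)` gives a start of 7 and a span of 3, and `computeSliceRange(10, 8, 4)` gives a span of 0. The Blob constructor, `slice()`, and the File constructor could then all call the same helper for their type argument.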
> +
+<div algorithm="create a file">
+To <dfn export>create a file backed {{File}} object</dfn> for a given |native file|,
+run these steps:
+
+1. Let |snapshot state| be an empty [=map=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"file"</a></code>] to |native file|.
+1. Let |last modified| be the last time |native file| was modified,
+ as the number of milliseconds since the [=Unix Epoch=].
+ If this can't be determined, set |last modified| to the current date and time
+ represented as the number of milliseconds since the [=Unix Epoch=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|.
+
+1. Let |name| be the file name of |native file|, converted to a string in a
+ user agent defined manner.
+1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string),
nit: capitalize MIME
> +<div algorithm="create a file">
+To <dfn export>create a file backed {{File}} object</dfn> for a given |native file|,
+run these steps:
+
+1. Let |snapshot state| be an empty [=map=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"file"</a></code>] to |native file|.
+1. Let |last modified| be the last time |native file| was modified,
+ as the number of milliseconds since the [=Unix Epoch=].
+ If this can't be determined, set |last modified| to the current date and time
+ represented as the number of milliseconds since the [=Unix Epoch=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|.
+
+1. Let |name| be the file name of |native file|, converted to a string in a
+ user agent defined manner.
+1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string),
+ derived from |name| in a user agent defined manner, or an empty string if no type
Are UAs allowed to determine the MIME type in other ways, e.g. from content sniffing, extended attributes, etc.?
I.e., does the "derived from _name_" need to be part of the steps?
> + If this can't be determined, set |last modified| to the current date and time
+ represented as the number of milliseconds since the [=Unix Epoch=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|.
+
+1. Let |name| be the file name of |native file|, converted to a string in a
+ user agent defined manner.
+1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string),
+ derived from |name| in a user agent defined manner, or an empty string if no type
+ could be determined, taking into account the following <dfn export>file type guidelines</dfn>:
+
+ * User agents must return the {{Blob/type}} as an ASCII-encoded string in lower case,
+ such that when it is converted to a corresponding byte sequence,
+ it is a <a>parsable MIME type</a>,
+ or the empty string – 0 bytes – if the type cannot be determined.
+ * When the file is of type <code>text/plain</code>
+ user agents must NOT append a charset parameter to the <i>dictionary of parameters</i> portion of the media type [[!MIMESNIFF]].
Inconsistent capitalization of NOT / not (in next bullet)
> + represented as the number of milliseconds since the [=Unix Epoch=].
+1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|.
+
+1. Let |name| be the file name of |native file|, converted to a string in a
+ user agent defined manner.
+1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string),
+ derived from |name| in a user agent defined manner, or an empty string if no type
+ could be determined, taking into account the following <dfn export>file type guidelines</dfn>:
+
+ * User agents must return the {{Blob/type}} as an ASCII-encoded string in lower case,
+ such that when it is converted to a corresponding byte sequence,
+ it is a <a>parsable MIME type</a>,
+ or the empty string – 0 bytes – if the type cannot be determined.
+ * When the file is of type <code>text/plain</code>
+ user agents must NOT append a charset parameter to the <i>dictionary of parameters</i> portion of the media type [[!MIMESNIFF]].
+ * User agents must not attempt heuristic determination of encoding,
Since encoding is not specified for `text/plain` (per previous), this can't apply in that case. Should this be scoped to other text/* types?
> replacing any "/" character (U+002F SOLIDUS) with a ":" (U+003A COLON).
Note: Underlying OS filesystems use differing conventions for file name;
- with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte</a> sequences.
-
-3. Process {{FilePropertyBag}} dictionary argument by running the following substeps:
-
- 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string,
- let |t| be set to the {{BlobPropertyBag/type}} dictionary member.
- If |t| contains any characters outside the range U+0020 to U+007E,
- then set |t| to the empty string and return from these substeps.
- 2. Convert every character in |t| to [=ASCII lowercase=].
- 3. If the {{FilePropertyBag/lastModified}} member is provided,
- let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member.
- If it is not provided,
- set |d| to the current date and time
+ with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>.
spelling: should be "ambiguity"
>
-2. Let |n| be a new string of the same size as the {{fileName}} argument to the constructor.
- Copy every character from {{fileName}} to |n|,
+1. Let |n| be a new string of the same size as |fileName|.
+1. Copy every character from |fileName| to |n|,
Not new: use "code point" not "character" here and elsewhere.
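A small TypeScript sketch of why the terminology matters, under the assumption that strings are sequences of UTF-16 code units as in JavaScript. The function names `sanitizeFileName` and `countCodePoints` are hypothetical. Since U+002F is not a surrogate, replacing it per code unit or per code point gives the same result, but "size" and "every character" read differently depending on which unit is meant.

```typescript
// Replace every "/" (U+002F SOLIDUS) with ":" (U+003A COLON).
function sanitizeFileName(fileName: string): string {
  return fileName.replace(/\//g, ":");
}

// Count code points rather than UTF-16 code units.
function countCodePoints(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms one code point.
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) i++;
    }
    count++;
  }
  return count;
}

// "a", "/", and U+1F600 (written as a surrogate pair).
const sampleName = "a/\ud83d\ude00";
```

Here `sampleName.length` is 4 (code units) while `countCodePoints(sampleName)` is 3, and `sanitizeFileName(sampleName)` leaves the surrogate pair intact.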
> -
-3. Process {{FilePropertyBag}} dictionary argument by running the following substeps:
-
- 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string,
- let |t| be set to the {{BlobPropertyBag/type}} dictionary member.
- If |t| contains any characters outside the range U+0020 to U+007E,
- then set |t| to the empty string and return from these substeps.
- 2. Convert every character in |t| to [=ASCII lowercase=].
- 3. If the {{FilePropertyBag/lastModified}} member is provided,
- let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member.
- If it is not provided,
- set |d| to the current date and time
+ with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>.
+
+1. If |options|.{{FilePropertyBag/lastModified}} member is provided:
+ 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member.
consistency: "member" or "dictionary member"
> -
- 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string,
- let |t| be set to the {{BlobPropertyBag/type}} dictionary member.
- If |t| contains any characters outside the range U+0020 to U+007E,
- then set |t| to the empty string and return from these substeps.
- 2. Convert every character in |t| to [=ASCII lowercase=].
- 3. If the {{FilePropertyBag/lastModified}} member is provided,
- let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member.
- If it is not provided,
- set |d| to the current date and time
+ with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>.
+
+1. If |options|.{{FilePropertyBag/lastModified}} member is provided:
+ 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member.
+1. Otherwise:
+ 1. Let |d| be the current date and time
The current date and time (as milliseconds since the Unix epoch) comes up several times in the spec. Factor it out into a definition.
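As a sketch of the factoring, in TypeScript: one shared definition of "current time in milliseconds since the Unix epoch", used wherever a last-modified value is missing. `currentTimeMillis`, `resolveLastModified`, and `FilePropertyBagLike` are hypothetical names, and the IDL conversion of `lastModified` to `long long` is deliberately left out.

```typescript
// Hypothetical stand-in for the dictionary; only the member used here.
interface FilePropertyBagLike {
  lastModified?: number;
}

// Single definition of "now", equivalent to Date.now() per the quoted text.
function currentTimeMillis(): number {
  return Date.now();
}

// Use the provided member if present; otherwise fall back to the shared clock.
// The clock is injectable so the fallback can be checked deterministically.
function resolveLastModified(
  options: FilePropertyBagLike = {},
  now: () => number = currentTimeMillis
): number {
  return options.lastModified !== undefined ? options.lastModified : now();
}
```

For example, `resolveLastModified({ lastModified: 5 })` returns 5, while `resolveLastModified({}, () => 42)` returns 42 from the injected clock. The same definition could back the "can't be determined" fallback in the file-backed case.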
> -
- 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string,
- let |t| be set to the {{BlobPropertyBag/type}} dictionary member.
- If |t| contains any characters outside the range U+0020 to U+007E,
- then set |t| to the empty string and return from these substeps.
- 2. Convert every character in |t| to [=ASCII lowercase=].
- 3. If the {{FilePropertyBag/lastModified}} member is provided,
- let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member.
- If it is not provided,
- set |d| to the current date and time
+ with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>.
+
+1. If |options|.{{FilePropertyBag/lastModified}} member is provided:
+ 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member.
+1. Otherwise:
+ 1. Let |d| be the current date and time
represented as the number of milliseconds since the <a>Unix Epoch</a>
(which is the equivalent of <code>Date.now()</code> [[ECMA-262]]).
Note: Since ECMA-262 {{Date}} objects convert to <code>long long</code> values
Maybe move this into the domintro?
> -
- 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string,
- let |t| be set to the {{BlobPropertyBag/type}} dictionary member.
- If |t| contains any characters outside the range U+0020 to U+007E,
- then set |t| to the empty string and return from these substeps.
- 2. Convert every character in |t| to [=ASCII lowercase=].
- 3. If the {{FilePropertyBag/lastModified}} member is provided,
- let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member.
- If it is not provided,
- set |d| to the current date and time
+ with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>.
+
+1. If |options|.{{FilePropertyBag/lastModified}} member is provided:
+ 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member.
+1. Otherwise:
+ 1. Let |d| be the current date and time
represented as the number of milliseconds since the <a>Unix Epoch</a>
(which is the equivalent of <code>Date.now()</code> [[ECMA-262]]).
Note: Since ECMA-262 {{Date}} objects convert to <code>long long</code> values
I guess the File constructor doesn't have a domintro yet. Add one? Blob too?
That'd be a good place to call out (in giant blinking text...) that the first argument is an array. :)
--
Reply to this email directly or view it on GitHub:
https://github.com/w3c/FileAPI/pull/154#pullrequestreview-419541114
Received on Wednesday, 27 May 2020 23:29:22 UTC