- From: Joshua Bell <notifications@github.com>
- Date: Wed, 27 May 2020 16:29:06 -0700
- To: w3c/FileAPI <FileAPI@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <w3c/FileAPI/pull/154/review/419541114@github.com>
@inexorabletash approved this pull request. I probably didn't catch everything, but I made it to the end! > <div algorithm="process-blob-parts"> -To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> given a sequence of {{BlobPart}}'s |parts| -and {{BlobPropertyBag}} |options|, +To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> +given a sequence of {{BlobPart}}'s |blobParts| and {{BlobPropertyBag}} |options|, Nit: `{{BlobPart}}s` (plural s, not a possessive s) > <div algorithm="process-blob-parts"> -To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> given a sequence of {{BlobPart}}'s |parts| -and {{BlobPropertyBag}} |options|, +To <dfn lt="process blob parts|processing blob parts">process blob parts</dfn> +given a sequence of {{BlobPart}}'s |blobParts| and {{BlobPropertyBag}} |options|, Also, maybe list rather than sequence? (I'm hazy on when to switch from IDL terms to Infra terms) > -1. For each |element| in |parts|: +1. Let |bytes| be an empty [=byte sequence=]. This algorithm concatenates adjacent non-blob parts rather than just ending up with more parts. Although this probably matches implementations, it doesn't seem to be observable and makes the algorithm more complicated. Is there a good reason for it? > @@ -361,16 +458,179 @@ run the following steps: 1. If |element| is a {{BufferSource}}, <a lt="get a copy of the buffer source">get Not new, but would be more readable with "Otherwise, if ..." > @@ -361,16 +458,179 @@ run the following steps: 1. If |element| is a {{BufferSource}}, <a lt="get a copy of the buffer source">get a copy of the bytes held by the buffer source</a>, and append those bytes to |bytes|. - 1. If |element| is a {{Blob}}, - append the bytes it represents to |bytes|. + 1. If |element| is a {{Blob}}: "Otherwise, if ..." > +1. Set |result|.[=read algorithm=] to the [=multipart blob read steps=]. + +1. Return |result|. + +</div> + +A <dfn>multipart blob read state</dfn> is a [=struct=] consisting of: + +<dl dfn-for="multipart blob read state"> +: <dfn>parts</dfn> +:: A [=queue=] of [=blob data descriptions=], representing the not yet read parts of the blob. +: <dfn>offset</df> +:: A number, representing the byte offset in the remaining blob parts + from which to start returning data. +: <dfn>nested blob data</dfn> +:: `undefined` or a [=blob data description=]. This is `undefined` unless otherwise specified. I don't think we typically style undefined (or null, true, false) as code ? (Likely not consistent across specs) > + +## Constructors ## {#constructorBlob} + + +<div algorithm="blob-constructor"> +The <dfn constructor for=Blob lt="Blob(blobParts, options)|Blob(blobParts)|Blob()"><code>new Blob(|blobParts|, |options|)</code></dfn> constructor steps are: + +1. Let |blob data| be the result of [=processing blob parts=] given |blobParts| and |options|. +1. Set [=this=].[=[[data]]=] to |blob data|. + +1. Let |type| be an empty string. +1. If the {{BlobPropertyBag/type}} member of the {{Blob/Blob(blobParts, options)/options}} argument is not the empty string, + run the following sub-steps: + + 1. Let |type| be the {{BlobPropertyBag/type}} dictionary member. + If |type| contains any characters outside the range U+0020 to U+007E, This could be a separate substep > + + +<div algorithm="blob-constructor"> +The <dfn constructor for=Blob lt="Blob(blobParts, options)|Blob(blobParts)|Blob()"><code>new Blob(|blobParts|, |options|)</code></dfn> constructor steps are: + +1. Let |blob data| be the result of [=processing blob parts=] given |blobParts| and |options|. +1. Set [=this=].[=[[data]]=] to |blob data|. + +1. Let |type| be an empty string. +1. If the {{BlobPropertyBag/type}} member of the {{Blob/Blob(blobParts, options)/options}} argument is not the empty string, + run the following sub-steps: + + 1. Let |type| be the {{BlobPropertyBag/type}} dictionary member. + If |type| contains any characters outside the range U+0020 to U+007E, + then set |type| to the empty string and return from these substeps. + 1. Convert every character in |type| to [=ASCII lowercase=]. This could be an "Otherwise,..." and then "return from these substeps" wouldn't be needed. > - Note: Use of the {{Blob/type}} attribute informs the [=package data=] algorithm - and determines the `Content-Type` header when [=/fetching=] [=blob URLs=]. -</dl> +</div> + +The <dfn attribute for=Blob id="dfn-size">size</dfn> getter steps are to return [=this=].[=[[data]]=].[=blob data/size=]. + +<div class="note domintro"> +: |blob| . {{Blob/type}} +:: The ASCII-encoded string in lower case representing the media type of the {{Blob}}, + or an empty string if the type cannot be determined. + + The {{Blob/type}} attribute can be set by the web application itself through constructor invocation + and through the {{Blob/slice()}} call; + + Note: The type of a {{Blob}} is considered a <a>parsable MIME type</a>, Not new, but grammar here is awkward. Since this is non-normative, it can also be simplified, e.g.: Note: The type of a Blob is considered a _parsable MIME type_ [no comma] if performing the _parse a MIME type_ algorithm on [not to] the Blob object's type (as an ASCII-encoded byte sequence) does not return failure. > + +1. Let |relativeStart| be 0. +1. If |start| is not `undefined`: + 1. If |start| < 0, set |relativeStart| to <code>max([=this=].[=[[data]]=].[=blob data/size=] + |start|, 0)</code>. + 1. Otherwise, set |relativeStart| to <code>min(|start|, [=this=].[=[[data]]=].[=blob data/size=])</code>. + +1. Let |relativeEnd| be [=this=].[=[[data]]=].[=blob data/size=]. +1. If |end| is not `undefined`: + 1. If |end| < 0, set |relativeEnd| to <code>max([=this=].[=[[data]]=].[=blob data/size=] + |end|, 0)</code>. + 1. Otherwise, set |relativeEnd| to <code>min(|end|, [=this=].[=[[data]]=].[=blob data/size=])</code>. + +1. Let |span| be <code>max((relativeEnd - relativeStart), 0)</code>. + +1. Let |relativeContentType| be an empty string. +1. If |contentType| is not `undefined`: + 1. If |contentType| does not contain any characters outside the range of U+0x0020 to U+0x007E: There are enough occurrences of "if not ASCII, empty string. Otherwise, ASCII-lowercase it" that a separate algorithm seems like a good idea. > + +<div algorithm="create a file"> +To <dfn export>create a file backed {{File}} object</dfn> for a given |native file|, +run these steps: + +1. Let |snapshot state| be an empty [=map=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"file"</a></code>] to |native file|. +1. Let |last modified| be the last time |native file| was modified, + as the number of milliseconds since the [=Unix Epoch=]. + If this can't be determined, set |last modified| to the current date and time + represented as the number of milliseconds since the [=Unix Epoch=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|. + +1. Let |name| be the file name of |native file|, converted to a string in a + user agent defined manner. +1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string), nit: capitalize MIME > +<div algorithm="create a file"> +To <dfn export>create a file backed {{File}} object</dfn> for a given |native file|, +run these steps: + +1. Let |snapshot state| be an empty [=map=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"file"</a></code>] to |native file|. +1. Let |last modified| be the last time |native file| was modified, + as the number of milliseconds since the [=Unix Epoch=]. + If this can't be determined, set |last modified| to the current date and time + represented as the number of milliseconds since the [=Unix Epoch=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|. + +1. Let |name| be the file name of |native file|, converted to a string in a + user agent defined manner. +1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string), + derived from |name| in a user agent defined manner, or an empty string if no type Are UAs allowed to determine MIME type in other ways, e.g. from content sniffing, extended attributes, etc ? i.e. does the "derived from _name_" need to be part of the steps? > + If this can't be determined, set |last modified| to the current date and time + represented as the number of milliseconds since the [=Unix Epoch=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|. + +1. Let |name| be the file name of |native file|, converted to a string in a + user agent defined manner. +1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string), + derived from |name| in a user agent defined manner, or an empty string if no type + could be determined, taking into account the following <dfn export>file type guidelines</dfn>: + + * User agents must return the {{Blob/type}} as an ASCII-encoded string in lower case, + such that when it is converted to a corresponding byte sequence, + it is a <a>parsable MIME type</a>, + or the empty string – 0 bytes – if the type cannot be determined. + * When the file is of type <code>text/plain</code> + user agents must NOT append a charset parameter to the <i>dictionary of parameters</i> portion of the media type [[!MIMESNIFF]]. Inconsistent capitalization of NOT / not (in next bullet) > + represented as the number of milliseconds since the [=Unix Epoch=]. +1. Set |snapshot state|[<code><a for="file blob snapshot state">"last modified"</a></code>] to |last modified|. + +1. Let |name| be the file name of |native file|, converted to a string in a + user agent defined manner. +1. Let |content type| be the mime type of |native file| (as a lowercase ASCII string), + derived from |name| in a user agent defined manner, or an empty string if no type + could be determined, taking into account the following <dfn export>file type guidelines</dfn>: + + * User agents must return the {{Blob/type}} as an ASCII-encoded string in lower case, + such that when it is converted to a corresponding byte sequence, + it is a <a>parsable MIME type</a>, + or the empty string – 0 bytes – if the type cannot be determined. + * When the file is of type <code>text/plain</code> + user agents must NOT append a charset parameter to the <i>dictionary of parameters</i> portion of the media type [[!MIMESNIFF]]. + * User agents must not attempt heuristic determination of encoding, Since encoding is not specified for `text/plain` (per previous), this can't apply in that case. Should this be scoped to other text/* types? > replacing any "/" character (U+002F SOLIDUS) with a ":" (U+003A COLON). Note: Underlying OS filesystems use differing conventions for file name; - with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte</a> sequences. - -3. Process {{FilePropertyBag}} dictionary argument by running the following substeps: - - 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string, - let |t| be set to the {{BlobPropertyBag/type}} dictionary member. - If |t| contains any characters outside the range U+0020 to U+007E, - then set |t| to the empty string and return from these substeps. - 2. Convert every character in |t| to [=ASCII lowercase=]. - 3. If the {{FilePropertyBag/lastModified}} member is provided, - let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. - If it is not provided, - set |d| to the current date and time + with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. spelling: should be "ambiguity" > -2. Let |n| be a new string of the same size as the {{fileName}} argument to the constructor. - Copy every character from {{fileName}} to |n|, +1. Let |n| be a new string of the same size as |fileName|. +1. Copy every character from |fileName| to |n|, Not new: use "code point" not "character" here and elsewhere. > - -3. Process {{FilePropertyBag}} dictionary argument by running the following substeps: - - 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string, - let |t| be set to the {{BlobPropertyBag/type}} dictionary member. - If |t| contains any characters outside the range U+0020 to U+007E, - then set |t| to the empty string and return from these substeps. - 2. Convert every character in |t| to [=ASCII lowercase=]. - 3. If the {{FilePropertyBag/lastModified}} member is provided, - let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. - If it is not provided, - set |d| to the current date and time + with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. + +1. If |options|.{{FilePropertyBag/lastModified}} member is provided: + 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. consistency: "member" or "dictionary member" > - - 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string, - let |t| be set to the {{BlobPropertyBag/type}} dictionary member. - If |t| contains any characters outside the range U+0020 to U+007E, - then set |t| to the empty string and return from these substeps. - 2. Convert every character in |t| to [=ASCII lowercase=]. - 3. If the {{FilePropertyBag/lastModified}} member is provided, - let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. - If it is not provided, - set |d| to the current date and time + with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. + +1. If |options|.{{FilePropertyBag/lastModified}} member is provided: + 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. +1. Otherwise: + 1. Let |d| be the current date and time Current date / time (as millis since epoch) comes up several times in the spec. Factor it out into a definition. > - - 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string, - let |t| be set to the {{BlobPropertyBag/type}} dictionary member. - If |t| contains any characters outside the range U+0020 to U+007E, - then set |t| to the empty string and return from these substeps. - 2. Convert every character in |t| to [=ASCII lowercase=]. - 3. If the {{FilePropertyBag/lastModified}} member is provided, - let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. - If it is not provided, - set |d| to the current date and time + with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. + +1. If |options|.{{FilePropertyBag/lastModified}} member is provided: + 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. +1. Otherwise: + 1. Let |d| be the current date and time represented as the number of milliseconds since the <a>Unix Epoch</a> (which is the equivalent of <code>Date.now()</code> [[ECMA-262]]). Note: Since ECMA-262 {{Date}} objects convert to <code>long long</code> values Maybe move this into the domintro ? > - - 1. If the {{BlobPropertyBag/type}} member is provided and is not the empty string, - let |t| be set to the {{BlobPropertyBag/type}} dictionary member. - If |t| contains any characters outside the range U+0020 to U+007E, - then set |t| to the empty string and return from these substeps. - 2. Convert every character in |t| to [=ASCII lowercase=]. - 3. If the {{FilePropertyBag/lastModified}} member is provided, - let |d| be set to the {{FilePropertyBag/lastModified}} dictionary member. - If it is not provided, - set |d| to the current date and time + with constructed files, mandating UTF-16 lessens ambiquity when file names are converted to <a>byte sequences</a>. + +1. If |options|.{{FilePropertyBag/lastModified}} member is provided: + 1. Let |d| be |options|.{{FilePropertyBag/lastModified}} dictionary member. +1. Otherwise: + 1. Let |d| be the current date and time represented as the number of milliseconds since the <a>Unix Epoch</a> (which is the equivalent of <code>Date.now()</code> [[ECMA-262]]). Note: Since ECMA-262 {{Date}} objects convert to <code>long long</code> values I guess File constructor doesn't have a domintro yet. Add one? Blob too? That'd be a good place to call out (in giant blinking text...) that the first argument is a array. :) -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/w3c/FileAPI/pull/154#pullrequestreview-419541114
Received on Wednesday, 27 May 2020 23:29:22 UTC