Re: File API: Blob and underlying file changes.

Dmitry Titov wrote:
> Hi,
>
> Does the Blob, which is obtained as File (so it refers to an actual file on
> disk) track the changes in the underlying file and 'mutates', or does it
> represent the 'snapshot' of the file, or does it become 'invalid'?
>   

Greetings Dmitry,

(And sorry for the delay in responding -- I just returned from vacation.)

Currently in Firefox (Fx) 3.6, which implements the File API, the Blob
"mutates" (to use your terminology).  A "snapshot" is best obtained by
calling a read method and treating the result string as the snapshot.

> Today, if a user selects a file using <input type=file>, and then the file
> on the disk changes before the 'submit' is clicked, the form will submit the
> latest version of the file.
> This may be a surprisingly popular use case, when user submits a file via
> form and wants to do 'last moment' changes in the file, after partial
> pre-populating the form. It works 'intuitively' today.

It is possible for two reads made at different times to yield different
results, if the underlying file has changed between them.  This holds
true for read methods that take a Blob or a File argument.
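
To make that concrete, here is a rough sketch (a toy model, not Fx code).
It treats a file on disk as a plain object and a read method as a function
that copies out the contents at call time.  `diskFile` and `readAsString`
are names made up for this example; they are not part of the File API.

```javascript
// Toy model of a file on disk; `contents` stands in for the bytes on disk.
// (Hypothetical object, not a real File.)
const diskFile = { contents: "first version" };

// Hypothetical stand-in for a read method: returns the contents as they
// are at the moment of the call.
function readAsString(file) {
  return String(file.contents);
}

const snapshot = readAsString(diskFile);  // result string acts as a snapshot
diskFile.contents = "second version";     // the file changes on disk
const later = readAsString(diskFile);     // a second read sees the change

console.log(snapshot);  // first version
console.log(later);     // second version
```

The string returned by the first read is unaffected by the later change,
which is why it can serve as the snapshot.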

The case of modifications *during* a read may be operating-system dependent.

Your question suggests that the subsequent editor's draft should make 
these cases clearer.

> Now, if the page decides to use XHR to upload the file, I think
>
> var file = myInputElement.files[0];
> var xhr = ...
> xhr.send(file);
>
> should also send the version of the file that exists at the moment of
> xhr.send(file), not when user picked the file (for consistency with form
> action).
>   

Agreed.

> Assuming this is desired behavior, what should the following do:
>
> var file = myInputElement.files[0];
> var blob = file.slice(0, file.size);
> // ... now file on the disk changes ...
> xhr.send(blob);
>
> Will it:
> - send the new version of the whole file (and update blob.size?)
> - send captured number of bytes from the new version of the file (perhaps
> truncated since file may be shorter now)
> - send original bytes from the previous version of the file that existed
> when Blob was created (sort of 'copy on write')
> - throw exception
> ?
>   

Currently, no exceptions are raised in the slice operation; I wonder if 
that was the right choice.  It seems there are two ways of determining 
behavior for slice:

1. Treat file.slice(0, file.size) as semantically identical to what we 
would do for File or Blob arguments that have been modified on disk 
without calling slice.  This would mean that for size = 50000, if we have:

var blob1 = file.slice(0, 25000);
var blob2 = file.slice(25000, file.size);

If the file is then changed after the slice calls, so that its size is
now 20000, then

blob1 can still be used with any read method that takes a Blob argument,
but it is now clamped to 20000, and blob2 will raise an error upon read.
The editor's draft will have to spell out more such scenarios clearly,
but basically we allow for modifications after the slice call, and clamp
to meaningful results.  The goal would be to ensure slice calls behave
as one would expect with <input type="file"/> (as in the use case where
Blobs "mutate", to reuse Dmitry's initial term).
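
A minimal sketch of the clamping rule in option 1, under my own
assumptions.  `resolveSlice` is a hypothetical helper, not anything in
the File API or in Fx.  It also assumes that a slice whose start is at or
past the current end of file is an error; the exact boundary is something
the draft would still have to decide.

```javascript
// Hypothetical helper (not part of the File API): given the range recorded
// at slice time and the file's size at read time, work out what a read
// would cover under option 1.
function resolveSlice(start, end, currentSize) {
  if (start >= currentSize) {
    // Assumption: a slice that now starts at or past the end of file errors.
    return { error: "slice start is beyond the current end of file" };
  }
  // Otherwise clamp the end of the range to the current size.
  return { start: start, end: Math.min(end, currentSize) };
}

// The example above: size was 50000 at slice time, 20000 at read time.
const r1 = resolveSlice(0, 25000, 20000);      // blob1: clamped to 0..20000
const r2 = resolveSlice(25000, 50000, 20000);  // blob2: error on read

console.log(r1);
console.log(r2);
```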


2. Treat file.slice(0, file.size) as semantically *different* from File or
Blob arguments modified on disk.  If we determine that since the slice
operation, the original file has been modified, we could return 0 slices
or throw errors on all reads of slices.  In this case, slice is a
special case, and does not retain the behavior of file selection via
forms.  This is simpler, but may not be desirable.
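
For comparison, a toy sketch of the "throw errors on all reads" variant
of option 2.  Everything here is hypothetical: `makeStrictSlice`,
`readStrictSlice`, and the `modified` field are invented for the example
and only stand in for whatever change detection an implementation would
actually use.

```javascript
// Toy model: remember a change marker when the slice is taken.
// `modified` is a hypothetical field, not a real File attribute here.
function makeStrictSlice(file, start, end) {
  return { file: file, start: start, end: end, seen: file.modified };
}

// Any read fails once the file's marker differs from the one recorded.
function readStrictSlice(slice) {
  if (slice.file.modified !== slice.seen) {
    throw new Error("file changed since slice()");
  }
  return slice.file.contents.substring(slice.start, slice.end);
}

const modelFile = { contents: "0123456789", modified: 1 };
const strictBlob = makeStrictSlice(modelFile, 0, 5);
const before = readStrictSlice(strictBlob);  // read succeeds before the change

modelFile.contents = "0123";  // file shrinks on disk
modelFile.modified = 2;

let failed = false;
try {
  readStrictSlice(strictBlob);
} catch (e) {
  failed = true;  // every read of the slice now errors
}
console.log(before, failed);
```

Unlike option 1, no attempt is made to clamp: once the file has changed,
the slice is simply unusable.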

I'd really like to hear from others about this topic, since I'm 
personally not sure which is best.   Strong opinions encouraged :-)

-- A*

Received on Tuesday, 12 January 2010 23:37:28 UTC