Re: File API: Blob and underlying file changes.

It seems that we feel that when a File object is sent via either a form or
XHR, the latest underlying version should be used. When we get a slice via
Blob.slice, we assume that the underlying file data stays stable from then on.

So for the uploader scenario, we need to cut a big file into multiple pieces.
With the current File API spec, we will have to do something like the
following to make sure that all pieces are cut from a stable file:

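    var file = myInputElement.files[0];
    // Slice the whole file once so that later slices derive from a stable Blob.
    var blob = file.slice(0, file.size);
    // slice(start, length): two contiguous 1000-byte pieces.
    var piece1 = blob.slice(0, 1000);
    var piece2 = blob.slice(1000, 1000);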
    ...

The above seems a bit ugly. If we want to make it clean, what Dmitry
proposed seems reasonable. But it would require a non-trivial spec
change.
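
For what it's worth, the piece offsets in the snippet above do not have to be
written out by hand. Here is a minimal sketch, assuming the draft's
slice(start, length) signature; pieceRanges is a hypothetical helper name and
is not part of the File API:

```javascript
// Hypothetical helper, not part of the File API: returns contiguous
// { start, length } pairs that together cover `size` bytes.
function pieceRanges(size, pieceSize) {
  if (!(pieceSize > 0)) {
    throw new RangeError("pieceSize must be a positive number");
  }
  var ranges = [];
  for (var start = 0; start < size; start += pieceSize) {
    // The last piece may be shorter than pieceSize.
    ranges.push({ start: start, length: Math.min(pieceSize, size - start) });
  }
  return ranges;
}

// Usage against a stable blob (assumes slice(start, length) as in the draft):
// var pieces = pieceRanges(blob.size, 1000).map(function (r) {
//   return blob.slice(r.start, r.length);
// });
```

This only removes the hand-written offsets. It still relies on taking the
first whole-file slice to get a stable blob, so the ugliness remains.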


On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov <dimich@chromium.org> wrote:

> Atomic read is obviously a nice thing - it would be hard to program against
> API that behaves as unpredictably as a single read operation that reads half
> of old content and half of new content.
>
> On the same note, it would likely be very hard to program against Blob
> objects if they could change underneath unpredictably. Imagine that we need
> to build an uploader that cuts a big file in multiple pieces and sends those
> pieces to the servers so they will be stitched together later. If during
> this operation the underlying file changes and this changes all the pieces
> that Blobs refer to (due to clamping and just silent change of content), all
> the slicing/stitching assumptions are invalid and it's hard to even notice
> since blobs are simply 'clamped' silently. Some degree of mess is possible
> then.
>
> Another use case could be a JPEG image processor that uses slice() to cut
> the headers from the image file and then uses info from the headers to cut
> further JFIF fields from the file (reading EXIF and populating local
> database of images for example). Changing the file in the middle of that is
> bad.
>
> It seems the typical use cases that will need Blob.slice() functionality
> form 'units of work' where Blob.slice() is used with likely assumption that
> underlying data is stable and does not change silently. Such a 'unit of
> work' should fail as a whole if underlying file changes. One way to achieve
> that is to reliably fail operations with 'derived' Blobs and even perhaps
> have a 'isValid' property on it. 'Derived' Blobs are those obtained via
> slice(), as opposed to 'original' Blobs that are also File.
>
> One disadvantage of this approach is that it implies that the same Blob has
> 2 possible behaviors - when it is obtained via Blob.slice() (or other
> methods) vs is a File.
>
> It all could be a bit cleaner if File did not derive from Blob, but instead
> had getAsBlob() method - then it would be possible to say that Blobs are
> always immutable but may become 'invalid' over time if underlying data
> changes. The FileReader can then be just a BlobReader and have cleaner
> semantics.
>
> If that was the case, then xhr.send(file) would capture the state of file
> at the moment of sending, while xhr.send(blob) would fail with exception if
> the blob is 'invalid' at the moment of send() operation. This would keep
> compatibility with current behavior and avoid duplication of Blob behavior.
> Quite a change to the spec though...
>
> Dmitry
>
> On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>
>> On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince <cprince@google.com> wrote:
>> >> For the record, I'd like to make the read "atomic", such that you can
>> >> never get half a file before a change, and half after. But it likely
>> >> depends on what OSs can enforce here.
>> >
>> > I think *enforcing* atomicity is difficult across all OSes.
>> >
>> > But implementations can get nearly the same effect by checking the
>> > file's last modification time at the start + end of the API call.  If
>> > it has changed, the read operation can throw an exception.
>>
>> I'm talking about during the actual read. I.e. not related to the
>> lifetime of the File object, just related to the time between the
>> first 'progress' event, and the 'loadend' event. If the file changes
>> during this time there is no way to fake atomicity since the partial
>> file has already been returned.
>>
>> / Jonas
>>
>
>

Received on Thursday, 14 January 2010 22:42:24 UTC