Re: Request for feedback: Filesystem API

On Sat, Aug 10, 2013 at 5:57 PM, Domenic Denicola
<domenic@domenicdenicola.com> wrote:
> From: Jonas Sicking [mailto:jonas@sicking.cc]
>
>> Like Brendan points out, what is considered the "low-level capabilities" isn't always obvious.
>
> I think a good guide is whether it's "atomic" or not. As such "move" is definitely atomic, whereas the others are not as much, so my apologies for including it there. A new concern, which this time I'll phrase as a question---is moving, or removing, a directory atomic?

remove() and removeDeep() are definitely atomic.

move() is atomic within the same filesystem. If we grow multiple
filesystem API backends in addition to the per-origin sandboxed one,
such as sd-card, system "pictures folder" and server-backed remote
folders, we probably can't guarantee atomicity when moving between
different backends. All of these backends exist in existing
browser-like implementations, FWIW.

For copy() we probably wouldn't write guarantees into the spec. But I
would push for making it atomic in the Firefox implementation as often
as possible, since that reduces the risk of website bugs.

I don't see how we could make either enumerateDeep() or enumerate() atomic?

> The atomicity is more important than you might think, because of how it impacts error-handling, parallel-versus-serial operations, and incremental progress. François gets at this in his response, when he says:

I'm well aware that atomicity is critical. That's why we designed the
FileHandle API the way we did: its design encourages people to make
their modifications to files atomic.
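To illustrate the pattern this encourages, here's a toy in-memory model (my own mock, not the proposed API surface): modifications are staged against a draft and committed as one step, so a half-written file is never observable.

```javascript
// Toy model of atomic file modification: fn mutates a draft copy,
// and the real contents only update if fn completes without error.
// This is illustrative only, not the FileHandle API itself.
class MockFileHandle {
  constructor(contents = "") {
    this.contents = contents;
  }
  modify(fn) {
    const draft = { contents: this.contents };
    fn(draft);                      // any throw here leaves the file untouched
    this.contents = draft.contents; // commit as a single step
  }
}
```

A failed modification leaves the previous contents fully intact, which is the property that makes error handling tractable for callers.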

>> On the other hand, the other functions suffer from huge design challenges (what to do in case of conflict? are hidden files copied too? what happens if only one file is corrupted?) and I would probably leave them out, too. Libraries can fill the gap and we can learn from experiments before standardising.
>
> For example, when copying, what happens in the case of a transient filesystem error or corrupted sections of a file? What is your retry strategy? Do you copy all that you can, and leave the rest of the file filled with "XXX"? (Might make sense for text files!) When moving or copying or removing a directory, which I *think* are non-atomic operations, what happens if only one file can't be moved/copied/removed? Do you retry? Do you fail the whole process? Do you do a rollback? Do you continue on as best you can? How important is deterministic order in such batch operations---e.g., do you try to remove all files in a directory in sequence, or in parallel? You can imagine other such issues.

These are all definitely important questions and need to be answered.

My initial naive answer would be: Stop at first error and report the
name of the file where the error happened. No rollbacks!
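A sketch of what "stop at first error, no rollback" could look like for a recursive removal, over a plain in-memory tree (the tree shape and function signature are mine, not the proposed API):

```javascript
// Hypothetical deep removal with stop-at-first-error semantics:
// children are removed in sequence; on the first failure we throw
// with the offending name and do NOT restore siblings already removed.
function removeDeep(dir, name) {
  const entry = dir.children[name];
  if (entry && entry.children) {
    for (const child of Object.keys(entry.children)) {
      removeDeep(entry, child); // first failure aborts the whole walk
    }
  }
  if (entry && entry.corrupt) {
    // Report the file where the error happened; no rollback.
    throw new Error("failed to remove: " + name);
  }
  delete dir.children[name];
}
```

The observable outcome is a partially removed directory plus an error naming the file that failed, which the caller can retry or surface as-is.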

> On specific issues:
>
> - If you're going to keep copy, I'd keep file-copying only, not directory copying. (Notably, Node.js doesn't have any copy APIs, and I actually have never felt a need for them or seen other people bemoaning their lack. I could easily be missing those discussions though.)

You can do file-copying using createFile and passing in an existing
File object. So I think you are simply arguing for copy() to be removed.
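The equivalence could be sketched like this, over a toy in-memory Directory; the `createFile(name, { data })` shape follows the proposal's direction but is illustrative, not the spec:

```javascript
// Toy Directory: createFile stores the given data under a name and
// refuses to overwrite. Illustrative only.
class Directory {
  constructor() {
    this.files = new Map();
  }
  createFile(name, { data }) {
    if (this.files.has(name)) throw new Error("exists: " + name);
    this.files.set(name, data);
    return data;
  }
  get(name) {
    return this.files.get(name);
  }
}

// "Copying" a file is then just createFile with the source's contents,
// so a dedicated copy() primitive adds nothing for the single-file case.
function copyFile(srcDir, srcName, dstDir, dstName) {
  return dstDir.createFile(dstName, { data: srcDir.get(srcName) });
}
```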

> - On BlockStorage vs. FS: no, we don't need to reinvent a FS on top of Block Storage. Filesystems are appropriately low-level enough that we should definitely be exposing them to web developers! Just, y'know, the low-level parts. One of those might be an AsyncLock so that they can build higher-level parts themselves!

Thank you. So it sounds like we are in agreement that the lowest-level
primitive isn't always the right one to expose. The question is how
low to go, and generally that is a judgement call.

I definitely think that we should expose AsyncLock, but in the spirit
of keeping APIs small and simple, the plan was always to do that
separately from the filesystem API. Help appreciated.
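For concreteness, a minimal promise-based async lock could look like this; the AsyncLock name matches the discussion above, but this particular shape is my own sketch, not a spec proposal:

```javascript
// Minimal sketch of an async lock: each withLock() call runs only
// after every previously queued section has finished, serializing
// access to a shared resource (e.g. a file).
class AsyncLock {
  constructor() {
    this._last = Promise.resolve();
  }
  withLock(fn) {
    const run = this._last.then(() => fn());
    // Keep the queue alive even if fn rejects.
    this._last = run.catch(() => {});
    return run;
  }
}
```

With a primitive like this, higher-level constructs (batched writes, read-modify-write cycles) can be built in libraries rather than baked into the filesystem API.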

/ Jonas

Received on Tuesday, 13 August 2013 17:31:09 UTC