Polished FileSystem API proposal

Hi All,

Yesterday a few of us at Mozilla went through the FileSystem API
proposal we previously sent [1] and tightened it up.

Executive Summary (aka TL;DR):
Below is the Mozilla proposal for a simplified filesystem API. It
contains two new abstractions, a Directory object which allows
manipulating files and directories within it, and a FileHandle object
which allows holding an exclusive lock on a file while performing
multiple read/write operations on it.

It's largely modeled after POSIX, but because we've tried to keep it
author friendly despite its asynchronous nature, it differs in a few
cases.

There are opportunities for further simplification by straying
further from POSIX. It's unclear whether this is desired.

Detailed proposal:

partial interface Navigator {
  Promise<Directory> getFilesystem(optional FilesystemParameters parameters);
};

interface Directory {
  readonly attribute DOMString name;

  Promise<File> createFile(DOMString path, MakeFileOptions options);
  Promise<Directory> createDirectory(DOMString path);

  Promise<(File or Directory)> get(DOMString path);

  Promise<void> move((DOMString or File or Directory) entry,
                     (DOMString or Directory or DestinationDict) dest);
  Promise<void> copy((DOMString or File or Directory) entry,
                     (DOMString or Directory or DestinationDict) dest);
  Promise<boolean> remove((DOMString or File or Directory) path,
                       optional DeleteMode recursive = "nonrecursive");

  Promise<FileHandle> openRead((DOMString or File) file);
  Promise<FileHandleWritable> openWrite((DOMString or File) file,
        optional CreateMode createMode = "createifneeded");
  Promise<FileHandleWritable> openAppend((DOMString or File) file,
        optional CreateMode createMode = "createifneeded");

  EventStream<(File or Directory)> enumerate();
  EventStream<File> enumerateDeep();
};

interface FileHandle
{
  readonly attribute FileOpenMode mode;
  readonly attribute boolean active;

  attribute long long? location;

  Promise<File> getFile();
  AbortableProgressPromise<ArrayBuffer> read(unsigned long long size);
  AbortableProgressPromise<DOMString> readText(unsigned long long size,
        optional DOMString encoding = "utf-8");

  void abort();
};

interface FileHandleWritable : FileHandle
{
  AbortableProgressPromise<void> write(
        (DOMString or ArrayBuffer or ArrayBufferView or Blob) value);

  Promise<void> setSize(optional unsigned long long size);

  Promise<void> flush();
};

partial interface URL {
  static DOMString? getPersistentURL(File file);
};

// WebIDL cruft that's largely transparent
enum PersistenceType { "temporary", "persistent" };
dictionary FilesystemParameters {
  PersistenceType storage = "temporary";
};

dictionary MakeFileOptions {
  boolean overwriteIfExists = false;
  (DOMString or Blob or ArrayBuffer or ArrayBufferView) data;
};

enum CreateMode { "createifneeded", "dontcreate" };
enum DeleteMode { "recursive", "nonrecursive" };

dictionary DestinationDict {
  Directory dir;
  DOMString name;
};

enum FileOpenMode { "read", "write", "append" };

So this API introduces 2 classes: Directory and FileHandle. Directory
allows manipulation of the files and directories stored inside that
directory. FileHandle represents an exclusively opened file and allows
manipulation of the file contents.

The behavior is hopefully mostly obvious. A few general comments:

The functions on Directory that accept DOMString arguments for
filenames allow names like "path/to/file". If the function creates a
file, then it creates the intermediate directories. Such paths are
always interpreted as relative to the directory itself, never relative
to the root.
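
To make the intermediate-directory rule concrete, here is a minimal sketch on a toy in-memory tree. Everything in it is an assumption for illustration: createFileIn is a hypothetical helper, directories are modeled as Maps, and the overwriteIfExists option is ignored. It is not how an implementation of Directory.createFile would be written.

```javascript
// Toy in-memory model: directories are Maps, files are plain objects.
// createFileIn is a hypothetical helper, not part of the proposed API.
function createFileIn(dir, path, data) {
  const parts = path.split("/");
  const fileName = parts.pop();
  let current = dir;
  for (const part of parts) {
    // Create each missing intermediate directory on the way down.
    if (!current.has(part)) current.set(part, new Map());
    current = current.get(part);
  }
  const file = { name: fileName, data: data };
  current.set(fileName, file);
  return file;
}
```

Calling createFileIn(root, "path/to/file", ...) creates "path" and "path/to" as needed. Calling it on the "path" directory with "other/f" creates the file under "path/other", since the path is resolved against the directory it is called on rather than the root.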

We were thinking of *not* allowing paths that walk up the directory
tree. So paths like "../foo", "..", "/foo/bar" or "foo/../bar" are not
allowed. This is to keep things simple and avoid security issues for
the page.
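
A minimal sketch of such a check, assuming "/" is the only separator. The name isAllowedPath is hypothetical and not part of the proposal, and the handling of other edge cases (empty strings, "." segments) is left open here.

```javascript
// Hypothetical helper, not part of the proposed API: returns true when a
// path string cannot walk up or out of the directory it is resolved against.
function isAllowedPath(path) {
  // Absolute paths such as "/foo/bar" are rejected outright.
  if (path.startsWith("/")) return false;
  // Any ".." segment is rejected, even when the net result would stay
  // inside the directory (as in "foo/../bar").
  return !path.split("/").includes("..");
}
```

With this check, "path/to/file" is accepted, while all four example paths above are rejected.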

Likewise, passing a File object to an operation of Directory where the
File object isn't contained in that directory or its descendants also
results in an error.

One thing that is probably not obvious is how the FileHandle.location
attribute works. This attribute is used by the read/readText/write
functions to select where the read or write operation starts. When
.read is called, it uses the current value of .location to determine
where the reading starts. It then fires off an asynchronous read
operation. It finally synchronously increases .location by the amount
of the 'size' argument before returning. Same thing for .write() and
.readText().

This means that the caller can simply set .location and then fire off
multiple read or write operations, which will automatically be
staggered in the file. It also means that the caller can set the
location for the next operation by simply setting .location, or can
check the current location by simply getting .location.
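
The bookkeeping can be sketched with a toy model. This is only an illustration under stated assumptions: LocationModel is a hypothetical class, the "file" is an in-memory Uint8Array, and .write() is assumed to advance .location by the length of the data written.

```javascript
// Toy model of the .location bookkeeping; the I/O is simulated on an
// in-memory Uint8Array and is not part of the proposal.
class LocationModel {
  constructor(bytes) {
    this.bytes = bytes;   // simulated file contents
    this.location = 0;    // plays the role of FileHandle.location
  }
  read(size) {
    const start = this.location;  // captured when the call is made
    this.location += size;        // advanced synchronously, before returning
    // The "I/O" runs later, using the captured start offset.
    return Promise.resolve().then(() => this.bytes.slice(start, start + size));
  }
  write(data) {
    const start = this.location;
    this.location += data.length; // assumption: advance by bytes written
    return Promise.resolve().then(() => { this.bytes.set(data, start); });
  }
}
```

Calling read(2) and then read(3) back to back, without waiting for either promise, leaves .location at 5. The first read covers offsets 0 and 1, the second covers offsets 2 through 4.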

Setting .location to null means "go to the end".

Note that getting or setting .location does not need to synchronously
call seek, or do any IO operations, in the implementation. Instead the
implementation simply tracks .location in the API implementation.
Whenever a read or write operation is scheduled, the current .location
is sent along with the operation information to the IO thread and the
seek can happen there. Many times the implementation can optimize out
the seek entirely.
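
As a rough illustration of when the seek can be skipped, the sketch below counts the seeks an IO thread would need for a queue of operations. The function name countSeeks and the { offset, length } shape are hypothetical, and it assumes every operation transfers exactly the requested number of bytes.

```javascript
// Hypothetical IO-side helper: each queued operation carries the offset
// captured from .location and the number of bytes it covers.
function countSeeks(operations, initialCursor = 0) {
  let cursor = initialCursor;  // where the OS-level file position currently is
  let seeks = 0;
  for (const op of operations) {
    if (op.offset !== cursor) {
      seeks++;                 // a real seek is needed only on a mismatch
    }
    // The read or write itself leaves the cursor just past its last byte.
    cursor = op.offset + op.length;
  }
  return seeks;
}
```

Three staggered 4-byte operations at offsets 0, 4 and 8 need no seek at all, while reading one byte at offset 100 and then writing it back at offset 100 needs two.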

The FileHandle class automatically closes itself as soon as the page
stops posting further calls to .read/.readText/.write to it. This
happens once the last Promise returned from one of those operations
has been resolved, without further calls to .read/.readText/.write
having happened. This is similar to IDB transactions, though obviously
there are no transactional semantics here. I.e. there is no way to
roll back any changes.
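
The closing rule can be sketched with a small synchronous model. AutoClosingHandle, request and settle are hypothetical names for illustration only: request stands in for any of the read or write calls, and settle stands in for an operation's Promise being resolved and the page's callback running.

```javascript
// Toy model of the auto-close rule; not the proposed implementation.
class AutoClosingHandle {
  constructor() {
    this.active = true;  // plays the role of FileHandle.active
    this.pending = 0;    // operations whose promise has not resolved yet
  }
  // Stands in for read/readText/write: registers one outstanding operation.
  request() {
    if (!this.active) throw new Error("handle is closed");
    this.pending++;
  }
  // Called when one operation finishes. onDone plays the role of the page's
  // promise callback and may issue further requests.
  settle(onDone) {
    this.pending--;
    if (onDone) onDone(this);
    // No outstanding operations and none issued by the callback: close.
    if (this.pending === 0) this.active = false;
  }
}
```

A handle stays active as long as each callback issues another request. Once an operation settles with nothing new queued, active becomes false and later requests fail.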

There are a few things that we did have disagreements on and which
would be worth debating:

Is the setup around the FileHandle.location attribute a good idea?
Some people found it confusingly different from POSIX.

There are a few more "mode" flags in various functions than I like. In
particular the "recursive" flag for Directory.remove was debated. Do
we really need the ability to call .remove on a directory and have it
fail if the directory isn't empty? And should it really be the default
behavior?

Likewise, can we get rid of the "createifneeded" vs. "dontcreate"
switch for .openWrite()/.openAppend()?

What about the overwriteIfExists flag for createFile?

Do we really need the .openAppend() function? Or is it ok to ask
people to use .openWrite() and then go to the end before writing?

Finally, there was debate about whether we need a Directory abstraction
at all, or if we could create something simpler which relied on string
munging instead.

Some examples of what code would look like:

// Save some downloaded data into a new file:
navigator.getFilesystem().then(function(root) {
  root.createFile("myfile.txt", { data: xhr.response });
});

// Append 5 bytes to the end of a large existing file:
navigator.getFilesystem().then(function(root) {
  return root.openAppend("largefile.dat");
}).then(function(handle) {
  return handle.write(new Uint8Array([1, 1, 2, 3, 5]));
});

// Increment the byte at offset 100 in a large existing file:
var fileHandle;
navigator.getFilesystem().then(function(root) {
  return root.openWrite("dir/highscores");
}).then(function(handle) {
  fileHandle = handle;
  fileHandle.location = 100;
  return fileHandle.read(1);
}).then(function(buffer) {
  assert(buffer.byteLength === 1);
  var view = new Uint8Array(buffer);
  view[0]++;
  // read(1) advanced .location to 101; step back to overwrite the same byte.
  fileHandle.location--;
  return fileHandle.write(buffer);
});

I hope to send this proposal to public-script-coord soon after some
debate on this list.

[1] http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0382.html

/ Jonas

Received on Saturday, 13 July 2013 00:32:41 UTC