- From: Rick Waldron <waldron.rick@gmail.com>
- Date: Fri, 9 Aug 2013 19:02:37 -0400
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: "public-script-coord@w3.org" <public-script-coord@w3.org>
- Message-ID: <CAHfnhfpZvmgSWJKMq7RtmLmLVf4F4OX__+Kg-HUtZhciHJqfYg@mail.gmail.com>
below...

On Fri, Aug 9, 2013 at 6:15 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Fri, Aug 9, 2013 at 2:02 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> > Over the past few months a few of us at Mozilla, with input from a lot
> > of other people, have been iterating on a filesystem API. The goal of
> > this filesystem API is first and foremost to expose a sandboxed
> > filesystem to webpages. This filesystem would be origin-specific and
> > would not allow accessing the user's OS filesystem. This avoids a lot
> > of the security concerns around filesystem APIs.
> >
> > However it is expected that this API will eventually also be used for
> > accessing real filesystems, but there are a lot of security
> > concerns that need to be solved before we can create a real standard
> > for that. Hence that is not the topic of this email.
> >
> > API summary:
> >
> > The proposed API introduces two new abstractions: A Directory object
> > which allows manipulating files and directories within it, and a
> > FileHandle object which allows holding an exclusive lock on a file
> > while performing multiple read/write operations on it.
> >
> > The API intentionally reuses the already existing File abstraction as
> > defined by [1] as we didn't want to have two different primitives for
> > "a file". The File object has already been shipping in browsers for a
> > while, so it's not an API that we expect to be able to make backwards
> > incompatible changes to, which somewhat limits the design of the
> > proposed filesystem API.
> >
> > Only adding two new abstractions was very intentional. We wanted to
> > keep the API as small and simple as possible. So for example there is
> > no abstraction for "a filesystem". Instead we simply let the root
> > directory represent the filesystem.
> >
> > The API is entirely asynchronous since we don't expect implementations
> > to be able to keep the whole filesystem in memory, and we don't want
> > to force synchronous IO.
> > But we've still tried to keep the API as
> > friendly as possible.
> >
> > Detailed API:
> >
> > Apologies for using WebIDL here. I know it's not very popular with a
> > lot of people on this list. And it's especially unfortunate in this
> > API since the use of WebIDL to describe the API results in a lot of
> > extra syntax in the description which doesn't actually affect the
> > javascript that developers would write.
> >
> > Unfortunately I don't know of any other formal way of describing the
> > API without spending tons of time typing up long descriptions of each
> > function.
> >
> > partial interface Navigator {
> >   // This is what provides access to the sandboxed filesystem root.
> >   Promise<Directory> getFilesystem(optional FilesystemParameters parameters);
> > };
> >
> > interface Directory {
> >   readonly attribute DOMString name;
> >
> >   Promise<File> createFile(DOMString path,
> >                            CreateFileOptions options);
> >   Promise<Directory> createDirectory(DOMString path);
> >
> >   Promise<(File or Directory)> get(DOMString path);
> >
> >   AbortableProgressPromise<void>
> >     move((DOMString or File or Directory) path,
> >          (DOMString or Directory or DestinationDict) dest);
> >   AbortableProgressPromise<void>
> >     copy((DOMString or File or Directory) path,
> >          (DOMString or Directory or DestinationDict) dest);
> >   Promise<boolean> remove((DOMString or File or Directory) path);
> >   Promise<boolean> removeDeep((DOMString or File or Directory) path);
> >
> >   Promise<FileHandle> openRead((DOMString or File) path);
> >   Promise<FileHandleWritable> openWrite((DOMString or File) path,
> >                                         OpenWriteOptions options);
> >
> >   EventStream<(File or Directory)> enumerate(optional DOMString path);
> >   EventStream<File> enumerateDeep(optional DOMString path);
> > };
> >
> > interface FileHandle
> > {
> >   readonly attribute FileOpenMode mode;
> >   readonly attribute boolean active;
> >
> >   attribute long long? offset;
> >
> >   Promise<File> getFile();
> >   AbortableProgressPromise<ArrayBuffer> read(unsigned long long size);
> >   AbortableProgressPromise<DOMString> readText(unsigned long long size,
> >                                                optional DOMString encoding = "utf-8");
> >
> >   void abort();
> > };
> >
> > interface FileHandleWritable : FileHandle
> > {
> >   AbortableProgressPromise<void> write((DOMString or ArrayBuffer or
> >                                         ArrayBufferView or Blob) value);
> >
> >   Promise<void> setSize(optional unsigned long long size);
> >
> >   Promise<void> flush();
> > };
> >
> > partial interface URL {
> >   static DOMString? getPersistentURL(File file);
> > };
> >
> >
> > // WebIDL cruft that's largely transparent
> > enum StorageType { "temporary", "persistent" };
> > dictionary FilesystemParameters {
> >   StorageType storage = "temporary";
> > };
> >
> > dictionary CreateFileOptions {
> >   CreateIfExistsMode ifExists = "fail";
> >   (DOMString or Blob or ArrayBuffer or ArrayBufferView) data;
> > };
> >
> > dictionary OpenWriteOptions {
> >   OpenIfNotExistsMode ifNotExists = "create";
> >   OpenIfExistsMode ifExists = "open";
> > };
> >
> > enum CreateIfExistsMode { "replace", "fail" };
> > enum OpenIfExistsMode { "open", "fail" };
> > enum OpenIfNotExistsMode { "create", "fail" };
> >
> > dictionary DestinationDict {
> >   Directory dir;
> >   DOMString name;
> > };
> >
> > enum FileOpenMode { "readonly", "readwrite" };
> >
> > API Description:
> >
> > I won't go into the details about each function as it's hopefully
> > mostly obvious. A few general comments:
> >
> > The functions on Directory that accept DOMString arguments for
> > filenames allow names like "path/to/file.txt". If the function creates a
> > file, then it creates the intermediate directories. Such paths are
> > always interpreted as relative to the directory itself, never relative
> > to the root.
> >
> > We were thinking of *not* allowing paths that walk up the directory
> > tree.
> > So paths like "../foo", "..", "/foo/bar" or "foo/../bar" are not
> > allowed. This is to keep things simple and avoid security issues for the
> > page. Attempting to use a path that contains a segment that is equal
> > to ".." or ".", or any path which starts with "/" will cause an error.
> > This way we can add support for this later if desired.
> >
> > Likewise, passing a File object to an operation of Directory where the
> > File object isn't contained in that directory or its descendants also
> > results in an error.
> >
> > One thing that is probably not obvious is how the FileHandle.offset
> > attribute works. This attribute is used by the read/readText/write
> > functions to select where the read or write operation starts. When
> > .read is called, it uses the current value of .offset to determine
> > where the reading starts. It then fires off an asynchronous read
> > operation. It finally synchronously increases .offset by the amount
> > of the 'size' argument before returning. Same thing for .write() and
> > .readText().
> >
> > This means that the caller can simply set .offset and then fire off
> > multiple read or write operations which automatically will happen
> > staggered in the file. It also means that the caller can set the
> > location for the next operation by simply setting .offset, or can check
> > the current location by simply getting .offset.
> >
> > Setting .offset to null means "go to the end". This is why there is no
> > openAppend function. Calling openWrite and then setting .offset to
> > null before writing results in an append.
> >
> > Note that getting or setting .offset does not need to synchronously
> > call seek, or do any IO operations, in the implementation. Instead the
> > implementation simply tracks .offset in the API implementation.
> > Whenever a read or write operation is scheduled, the current .offset
> > is sent along with the operation information to the IO thread and the
> > seek can happen there.
> > Many times the implementation can optimize out
> > the seek entirely.
> >
> > The FileHandle class automatically closes itself as soon as the page
> > stops posting further calls to .read/.readText/.write to it. This
> > happens once the last Promise returned from one of those operations
> > has been resolved, without further calls to .read/.readText/.write
> > having happened. This is similar to IDB transactions, though obviously
> > there are no transactional semantics here. I.e. there is no way to
> > roll back any changes.
> >
> > Open Questions:
> >
> > There are a few things that we did have disagreements on and which
> > would be worth debating.
> >
> > Is the setup around the FileHandle.offset attribute a good idea? Some
> > people found it confusingly different from posix.
> >
> > Can we get rid of the non-recursive remove() function? The
> > removeDeep() function has the same capabilities, except that
> > removeDeep doesn't produce an error if you attempt to delete a
> > non-empty directory.
> >
> > Can we get rid of the copy() function? Copy operations are certainly
> > common to expose in UIs, but they can be easily implemented
> > programmatically, so having it in the API isn't strictly needed.
> >
> > Should we add an openAppend function which always appends for all
> > writes? Note that since FileHandle always holds an exclusive lock on
> > the file, there is no risk that other actors will append to the file
> > as long as a FileHandle is being used.
> >
> > Finally, should we remove the Directory abstraction? It's not needed
> > given that you can directly interact with files in subdirectories. But
> > it does provide the ability to do some capability management. I.e.
> > holding a Directory object enables you to interact with the files in
> > that directory and its subdirectories, but there is no way to reach
> > out to a parent directory.
> > Directory objects are also a familiar
> > concept in filesystem APIs, so it seems natural to have it even though
> > it's not strictly needed.
> >
> > [1] http://dev.w3.org/2006/webapi/FileAPI/
>
> After all that, of course I forgot to include examples of what the API
> looks like when used.
>
> // Save some downloaded data into a new file:
> navigator.getFilesystem().then(function(root) {
>   root.createFile("myfile.txt", { data: xhr.response });
> });
>
> // Append 5 bytes to the end of a large existing file:
> navigator.getFilesystem().then(function(root) {
>   return root.openWrite("largefile.dat");
> }).then(function(handle) {
>   handle.offset = null;
>   return handle.write(new Uint8Array([1, 1, 2, 3, 5]));
> });
>
> // Increase the 100th byte in large existing file:
> var fileHandle;
> navigator.getFilesystem().then(function(root) {
>   return root.openWrite("dir/highscores");
> }).then(function(handle) {
>   fileHandle = handle;
>   fileHandle.offset = 100;
>   return fileHandle.read(1);
> }).then(function(buffer) {
>   assert(buffer.byteLength === 1);
>   var view = new Uint8Array(buffer);
>   view[0]++;
>   fileHandle.offset--;
>   return fileHandle.write(buffer);
> });
>
> / Jonas
>

I didn't see any rationale that explains the decision to hang this off the
navigator object. Unless there is a definite reason, perhaps this should
be its own [[Global]] object? I apologize for sounding like a broken record,
but the "navigator" has nothing to do with the File System.

partial interface Window {
  static FileSystem;
}

interface FileSystem {
  Promise<Directory> get(optional FilesystemParameters parameters);
}

Everything else could stay as-is.
The examples then look like:

(I used "get" in the IDL, but will use "request" in the examples, because
I'm using my imagination and I think it looks nice)

// Save some downloaded data into a new file:
FileSystem.request().then(function(root) {
  root.createFile("myfile.txt", { data: xhr.response });
});

// Append 5 bytes to the end of a large existing file:
FileSystem.request().then(function(root) {
  return root.openWrite("largefile.dat");
}).then(function(handle) {
  handle.offset = null;
  return handle.write(new Uint8Array([1, 1, 2, 3, 5]));
});

// Increase the 100th byte in large existing file:
var fileHandle;
FileSystem.request().then(function(root) {
  return root.openWrite("dir/highscores");
}).then(function(handle) {
  fileHandle = handle;
  fileHandle.offset = 100;
  return fileHandle.read(1);
}).then(function(buffer) {
  assert(buffer.byteLength === 1);
  var view = new Uint8Array(buffer);
  view[0]++;
  fileHandle.offset--;
  return fileHandle.write(buffer);
});

Or better yet, FileSystem is a constructor that produces FileSystem objects
that have a "get" or "request" method, so now it's a reusable object...
var fs = new FileSystem();

// Save some downloaded data into a new file:
fs.request().then(function(root) {
  root.createFile("myfile.txt", { data: xhr.response });
});

// Append 5 bytes to the end of a large existing file:
fs.request().then(function(root) {
  return root.openWrite("largefile.dat");
}).then(function(handle) {
  handle.offset = null;
  return handle.write(new Uint8Array([1, 1, 2, 3, 5]));
});

// Increase the 100th byte in large existing file:
var fileHandle;
fs.request().then(function(root) {
  return root.openWrite("dir/highscores");
}).then(function(handle) {
  fileHandle = handle;
  fileHandle.offset = 100;
  return fileHandle.read(1);
}).then(function(buffer) {
  assert(buffer.byteLength === 1);
  var view = new Uint8Array(buffer);
  view[0]++;
  fileHandle.offset--;
  return fileHandle.write(buffer);
});

I don't know how set in stone the naming is, but you might also consider
reviewing some prior art (http://nodejs.org/api/fs.html) for method names
and call signatures.

Rick
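
The constructor-style shape suggested in this message can be tried out against a small in-memory mock. The sketch below is illustrative only: FileSystem, Directory and validatePath here are hypothetical stand-ins written for this example, not a shipping browser API, and the mock keeps file data in a Map instead of touching any real storage. It also applies the path rule from the quoted proposal (no "." or ".." segments, no leading "/") and assumes the default ifExists = "fail" behaviour for createFile.

```javascript
// Hypothetical in-memory mock; none of these classes exist in any browser.

// Reject paths that walk the tree, following the rule in the quoted proposal:
// no "." or ".." segments and no leading "/".
function validatePath(path) {
  if (path.startsWith("/")) {
    throw new Error("absolute paths are not allowed: " + path);
  }
  for (const segment of path.split("/")) {
    if (segment === "." || segment === "..") {
      throw new Error("path may not contain '" + segment + "': " + path);
    }
  }
  return path;
}

class Directory {
  constructor(name) {
    this.name = name;
    this.files = new Map(); // relative path -> stored data
  }

  // Loosely mirrors createFile(path, { data }); assumes ifExists = "fail".
  createFile(path, options = {}) {
    try {
      validatePath(path);
    } catch (err) {
      return Promise.reject(err);
    }
    if (this.files.has(path)) {
      return Promise.reject(new Error("file exists: " + path));
    }
    this.files.set(path, options.data);
    // A real implementation would resolve with a File; this is a placeholder.
    return Promise.resolve({ name: path.split("/").pop() });
  }
}

class FileSystem {
  constructor() {
    this.root = new Directory(""); // root name is an assumption
  }

  // "request" follows the naming used in the examples in this message.
  request() {
    return Promise.resolve(this.root);
  }
}

// Usage: one reusable object, many requests.
const sandbox = new FileSystem();
sandbox.request().then(function (root) {
  return root.createFile("myfile.txt", { data: "downloaded data" });
}).then(function (file) {
  console.log("created", file.name);
}).catch(function (err) {
  console.error(err.message);
});
```

Because request() always resolves with the same root Directory in this mock, a second createFile call for the same path rejects, which is one way to observe the "fail" mode without any real IO.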
Received on Friday, 9 August 2013 23:03:24 UTC