- From: Rick Waldron <waldron.rick@gmail.com>
- Date: Fri, 9 Aug 2013 19:34:57 -0400
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: "public-script-coord@w3.org" <public-script-coord@w3.org>
- Message-ID: <CAHfnhfqNNKwaQJ7yg9b34Y1Tn_gG+48VSLG8-B9JZN-zR3D-pA@mail.gmail.com>
On Fri, Aug 9, 2013 at 7:22 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Fri, Aug 9, 2013 at 4:02 PM, Rick Waldron <waldron.rick@gmail.com> wrote:
> > below...
> >
> >
> > On Fri, Aug 9, 2013 at 6:15 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Fri, Aug 9, 2013 at 2:02 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >> > Over the past few months a few of us at Mozilla, with input from a lot
> >> > of other people, have been iterating on a filesystem API. The goal of
> >> > this filesystem API is first and foremost to expose a sandboxed
> >> > filesystem to webpages. This filesystem would be origin-specific and
> >> > would not allow accessing the user's OS filesystem. This avoids a lot
> >> > of the security concerns around filesystem APIs.
> >> >
> >> > However it is expected that this API will eventually also be used for
> >> > accessing real filesystems, but there are a lot of security
> >> > concerns that need to be solved before we can create a real standard
> >> > for that. Hence that is not the topic of this email.
> >> >
> >> > API summary:
> >> >
> >> > The proposed API introduces two new abstractions: A Directory object
> >> > which allows manipulating files and directories within it, and a
> >> > FileHandle object which allows holding an exclusive lock on a file
> >> > while performing multiple read/write operations on it.
> >> >
> >> > The API intentionally reuses the already existing File abstraction as
> >> > defined by [1] as we didn't want to have two different primitives for
> >> > "a file". The File object has already been shipping in browsers for a
> >> > while, so it's not an API that we expect to be able to make backwards
> >> > incompatible changes to, which somewhat limits the design of the
> >> > proposed filesystem API.
> >> >
> >> > Only adding two new abstractions was very intentional. We wanted to
> >> > keep the API as small and simple as possible.
> >> > So for example there is no abstraction for "a filesystem". Instead
> >> > we simply let the root directory represent the filesystem.
> >> >
> >> > The API is entirely asynchronous since we don't expect implementations
> >> > to be able to keep the whole filesystem in memory, and we don't want
> >> > to force synchronous IO. But we've still tried to keep the API as
> >> > friendly as possible.
> >> >
> >> > Detailed API:
> >> >
> >> > Apologies for using WebIDL here. I know it's not very popular with a
> >> > lot of people on this list. And it's especially unfortunate in this
> >> > API since the use of WebIDL to describe the API results in a lot of
> >> > extra syntax in the description which doesn't actually affect the
> >> > javascript that developers would write.
> >> >
> >> > Unfortunately I don't know of any other formal way of describing the
> >> > API without spending tons of time typing up long descriptions of each
> >> > function.
> >> >
> >> > partial interface Navigator {
> >> >   // This is what provides access to the sandboxed filesystem root.
> >> >   Promise<Directory> getFilesystem(optional FilesystemParameters
> >> >       parameters);
> >> > };
> >> >
> >> > interface Directory {
> >> >   readonly attribute DOMString name;
> >> >
> >> >   Promise<File> createFile(DOMString path,
> >> >       CreateFileOptions options);
> >> >   Promise<Directory> createDirectory(DOMString path);
> >> >
> >> >   Promise<(File or Directory)> get(DOMString path);
> >> >
> >> >   AbortableProgressPromise<void>
> >> >     move((DOMString or File or Directory) path,
> >> >          (DOMString or Directory or DestinationDict) dest);
> >> >   AbortableProgressPromise<void>
> >> >     copy((DOMString or File or Directory) path,
> >> >          (DOMString or Directory or DestinationDict) dest);
> >> >   Promise<boolean> remove((DOMString or File or Directory) path);
> >> >   Promise<boolean> removeDeep((DOMString or File or Directory) path);
> >> >
> >> >   Promise<FileHandle> openRead((DOMString or File) path);
> >> >   Promise<FileHandleWritable> openWrite((DOMString or File) path,
> >> >       OpenWriteOptions options);
> >> >
> >> >   EventStream<(File or Directory)> enumerate(optional DOMString path);
> >> >   EventStream<File> enumerateDeep(optional DOMString path);
> >> > };
> >> >
> >> > interface FileHandle
> >> > {
> >> >   readonly attribute FileOpenMode mode;
> >> >   readonly attribute boolean active;
> >> >
> >> >   attribute long long? offset;
> >> >
> >> >   Promise<File> getFile();
> >> >   AbortableProgressPromise<ArrayBuffer> read(unsigned long long size);
> >> >   AbortableProgressPromise<DOMString> readText(unsigned long long size,
> >> >       optional DOMString encoding = "utf-8");
> >> >
> >> >   void abort();
> >> > };
> >> >
> >> > interface FileHandleWritable : FileHandle
> >> > {
> >> >   AbortableProgressPromise<void> write((DOMString or ArrayBuffer or
> >> >       ArrayBufferView or Blob) value);
> >> >
> >> >   Promise<void> setSize(optional unsigned long long size);
> >> >
> >> >   Promise<void> flush();
> >> > };
> >> >
> >> > partial interface URL {
> >> >   static DOMString?
> >> >       getPersistentURL(File file);
> >> > };
> >> >
> >> >
> >> > // WebIDL cruft that's largely transparent
> >> > enum StorageType { "temporary", "persistent" };
> >> > dictionary FilesystemParameters {
> >> >   StorageType storage = "temporary";
> >> > };
> >> >
> >> > dictionary CreateFileOptions {
> >> >   CreateIfExistsMode ifExists = "fail";
> >> >   (DOMString or Blob or ArrayBuffer or ArrayBufferView) data;
> >> > };
> >> >
> >> > dictionary OpenWriteOptions {
> >> >   OpenIfNotExistsMode ifNotExists = "create";
> >> >   OpenIfExistsMode ifExists = "open";
> >> > };
> >> >
> >> > enum CreateIfExistsMode { "replace", "fail" };
> >> > enum OpenIfExistsMode { "open", "fail" };
> >> > enum OpenIfNotExistsMode { "create", "fail" };
> >> >
> >> > dictionary DestinationDict {
> >> >   Directory dir;
> >> >   DOMString name;
> >> > };
> >> >
> >> > enum FileOpenMode { "readonly", "readwrite" };
> >> >
> >> > API Description:
> >> >
> >> > I won't go into the details about each function as it's hopefully
> >> > mostly obvious. A few general comments:
> >> >
> >> > The functions on Directory that accept DOMString arguments for
> >> > filenames allow names like "path/to/file.txt". If the function creates
> >> > a file, then it creates the intermediate directories. Such paths are
> >> > always interpreted as relative to the directory itself, never relative
> >> > to the root.
> >> >
> >> > We were thinking of *not* allowing paths that walk up the directory
> >> > tree. So paths like "../foo", "..", "/foo/bar" or "foo/../bar" are not
> >> > allowed. This is to keep things simple and avoid security issues for
> >> > the page. Attempting to use a path that contains a segment that is equal
> >> > to ".." or ".", or any path which starts with "/" will cause an error.
> >> > This way we can add support for this later if desired.
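
A minimal sketch of the path rule quoted just above, not part of the proposal. The helper name isAllowedPath is hypothetical, and returning a boolean rather than raising an error is an assumption made to keep the example short. The proposal does not say how empty segments such as "a//b" should be treated, so the sketch leaves them alone.

```javascript
// Hypothetical helper (not part of the proposed API): returns false for
// paths that the quoted proposal says should cause an error.
function isAllowedPath(path) {
  // Any path that starts with "/" is rejected.
  if (path.startsWith("/")) {
    return false;
  }
  // Any segment equal to ".." or "." is rejected.
  return !path.split("/").some(function (segment) {
    return segment === ".." || segment === ".";
  });
}

console.log(isAllowedPath("path/to/file.txt")); // true
console.log(isAllowedPath("foo/../bar"));       // false
```

Checking whole segments instead of searching for the substring ".." keeps names such as "notes..txt" usable, which seems to match the wording "a segment that is equal to".
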
> >> >
> >> > Likewise, passing a File object to an operation of Directory where the
> >> > File object isn't contained in that directory or its descendants also
> >> > results in an error.
> >> >
> >> > One thing that is probably not obvious is how the FileHandle.offset
> >> > attribute works. This attribute is used by the read/readText/write
> >> > functions to select where the read or write operation starts. When
> >> > .read is called, it uses the current value of .offset to determine
> >> > where the reading starts. It then fires off an asynchronous read
> >> > operation. It finally synchronously increases .offset by the amount
> >> > of the 'size' argument before returning. Same thing for .write() and
> >> > .readText().
> >> >
> >> > This means that the caller can simply set .offset and then fire off
> >> > multiple read or write operations which automatically will happen
> >> > staggered in the file. It also means that the caller can set the
> >> > location for the next operation by simply setting .offset, or can check
> >> > the current location by simply getting .offset.
> >> >
> >> > Setting .offset to null means "go to the end". This is why there is no
> >> > openAppend function. Calling openWrite and then setting .offset to
> >> > null before writing results in an append.
> >> >
> >> > Note that getting or setting .offset does not need to synchronously
> >> > call seek, or do any IO operations, in the implementation. Instead the
> >> > implementation simply tracks .offset in the API implementation.
> >> > Whenever a read or write operation is scheduled, the current .offset
> >> > is sent along with the operation information to the IO thread and the
> >> > seek can happen there. Many times the implementation can optimize out
> >> > the seek entirely.
> >> >
> >> > The FileHandle class automatically closes itself as soon as the page
> >> > stops posting further calls to .read/.readText/.write to it.
> >> > This happens once the last Promise returned from one of those
> >> > operations has been resolved, without further calls to
> >> > .read/.readText/.write having happened. This is similar to IDB
> >> > transactions, though obviously there are no transactional semantics
> >> > here. I.e. there is no way to roll back any changes.
> >> >
> >> > Open Questions:
> >> >
> >> > There are a few things that we did have disagreements on and which
> >> > would be worth debating.
> >> >
> >> > Is the setup around the FileHandle.offset attribute a good idea? Some
> >> > people found it confusingly different from POSIX.
> >> >
> >> > Can we get rid of the non-recursive remove() function? The
> >> > removeDeep() function has the same capabilities, except that
> >> > removeDeep doesn't produce an error if you attempt to delete a
> >> > non-empty directory.
> >> >
> >> > Can we get rid of the copy() function? Copy operations are certainly
> >> > common to expose in UIs, but they can be easily implemented
> >> > programmatically, so having it in the API isn't strictly needed.
> >> >
> >> > Should we add an openAppend function which always appends for all
> >> > writes? Note that since FileHandle always holds an exclusive lock on
> >> > the file, there is no risk that other actors will append to the file
> >> > as long as a FileHandle is being used.
> >> >
> >> > Finally, should we remove the Directory abstraction? It's not needed
> >> > given that you can directly interact with files in subdirectories. But
> >> > it does provide the ability to do some capability management. I.e.
> >> > holding a Directory object enables you to interact with the files in
> >> > that directory and its subdirectories, but there is no way to reach
> >> > out to a parent directory. A Directory object is also a familiar
> >> > concept in filesystem APIs, so it seems natural to have it even though
> >> > it's not strictly needed.
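
A rough model of the .offset bookkeeping that the first open question is about, assuming a hypothetical in-memory FakeHandle class that stands in for the proposed FileHandle. Only the "capture the position, then advance synchronously" behavior is modeled. Locking, auto-closing and the null ("go to the end") case are left out, and none of this is the proposed implementation.

```javascript
// Hypothetical in-memory stand-in for a FileHandle; only the .offset
// bookkeeping from the quoted proposal is modeled here.
class FakeHandle {
  constructor(bytes) {
    this.bytes = bytes; // Uint8Array standing in for the file contents
    this.offset = 0;    // the null ("go to the end") case is not modeled
  }

  read(size) {
    // Capture the start position at call time.
    const start = this.offset;
    // Advance synchronously, before the asynchronous read has completed.
    this.offset = start + size;
    // The read itself completes later and uses the captured position.
    return Promise.resolve().then(
      () => this.bytes.slice(start, start + size).buffer
    );
  }
}

const handle = new FakeHandle(Uint8Array.from([10, 20, 30, 40, 50, 60]));
handle.offset = 1;
const first = handle.read(2);  // will cover the bytes at positions 1 and 2
const second = handle.read(2); // queued immediately; covers positions 3 and 4
console.log(handle.offset);    // 5

Promise.all([first, second]).then(function (buffers) {
  console.log(Array.from(new Uint8Array(buffers[0]))); // [ 20, 30 ]
  console.log(Array.from(new Uint8Array(buffers[1]))); // [ 40, 50 ]
});
```

Because the position is captured when read() is called, the second read does not have to wait for the first one to resolve. That is the "staggered" behavior the proposal describes, and also the point where it differs from a POSIX-style seek followed by a read.
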
> >> >
> >> > [1] http://dev.w3.org/2006/webapi/FileAPI/
> >>
> >> After all that, of course I forgot to include examples of what the API
> >> looks like when used.
> >>
> >> // Save some downloaded data into a new file:
> >> navigator.getFilesystem().then(function(root) {
> >>   root.createFile("myfile.txt", { data: xhr.response });
> >> });
> >>
> >> // Append 5 bytes to the end of a large existing file:
> >> navigator.getFilesystem().then(function(root) {
> >>   return root.openWrite("largefile.dat");
> >> }).then(function(handle) {
> >>   handle.offset = null;
> >>   return handle.write(new Uint8Array([1, 1, 2, 3, 5]));
> >> });
> >>
> >> // Increase the 100th byte in large existing file:
> >> var fileHandle;
> >> navigator.getFilesystem().then(function(root) {
> >>   return root.openWrite("dir/highscores");
> >> }).then(function(handle) {
> >>   fileHandle = handle;
> >>   fileHandle.offset = 100;
> >>   return fileHandle.read(1);
> >> }).then(function(buffer) {
> >>   assert(buffer.byteLength === 1);
> >>   var view = new Uint8Array(buffer);
> >>   view[0]++;
> >>   // Step back over the byte that was just read.
> >>   fileHandle.offset--;
> >>   return fileHandle.write(buffer);
> >> });
> >>
> >> / Jonas
> >>
> >
> >
> >
> > I didn't see any rationale that explains the decision to hang this off the
> > navigator object. Unless there is a definite reason, perhaps this
> > should be its own [[Global]] object? I apologize for sounding like a broken
> > record, but the "navigator" has nothing to do with the File System.
> >
> > partial interface Window {
> >   static FileSystem;
> > }
> >
> > interface FileSystem {
> >   Promise<Directory> get(optional FilesystemParameters parameters);
> > }
> >
> > Everything else could stay as-is.
> > The examples then look like:
> >
> > (I used "get" in the IDL, but will use "request" in the examples, because
> > I'm using my imagination and I think it looks nice)
> >

Jonas, thanks for the quick response

> o_O
> :)
>
> Your colorful imagination makes me not understand if you are proposing
> that the function should be called "get", or "request". Or if you are
> proposing that either should work. Or if you are proposing that there
> is some other magic going on.
>

Nope, no magic. I was only trying to share "out loud" my desire to feel out
which name expresses the program's intent more clearly. Apologies that I
made it confusing :)

>
> Put another way, what does the last "it" refer to above?
>

I meant that, subjectively, I think the word "request" nicely expresses the
intention: I want to request the filesystem and then do something with it.
The problem I had, while writing that response, is that I kept thinking of
"get" as a synchronous operation, for example on a Map object, e.g.

    var value = map.get(key);
    value; // 42

Of course this isn't a synchronous operation that's being designed, so I
wonder if "get" is inappropriately being co-opted from a synchronous world
into an asynchronous world. I'll also gladly accept that I'm overthinking it.

>
> Other than that I don't feel strongly. I'm not terribly excited to
> introduce a FileSystem interface, even if it is one that only has a
> single static function. But if people generally feel that that is
> better then I can live with that.
>

Well, it was really meant as a stepping stone to the last example ;)

>
> The reason we tend to hang things off of Navigator these days is that
> adding things to the global scope always runs the risk of name
> collisions with existing content.
>
> > Or better yet, FileSystem is a constructor that produces FileSystem objects
> > that have a "get" or "request" method, now it's a reusable object...
> >
> >
> > var fs = new FileSystem();
> >
> > // Save some downloaded data into a new file:
> > fs.request().then(function(root) {
> >   root.createFile("myfile.txt", { data: xhr.response });
> > });
>
> (new Filesystem()).request() looks less nice to me than
> Filesystem.request(). But again, I can live with either.
>

I completely agree.

>
> > I don't know how set in stone the naming is, but you might also consider
> > reviewing some prior art (http://nodejs.org/api/fs.html) for method names
> > and call signatures.
>
> Nothing is set in stone at this point.
>

Good to know!

Rick
Received on Friday, 9 August 2013 23:35:45 UTC