Re: Request for feedback: Filesystem API

Hi,

I recently got a little experience with a Tizen Web App where I could 
play with WebKit's file-system API (or so I thought? [1]) and was 
waiting for my thoughts to mature before sending feedback on the API... 
Let's say I'm happy there is a new proposal on the table :-)

On 09/08/2013 23:02, Jonas Sicking wrote:
> Over the past few months a few of us at mozilla, with input from a lot
> of other people, has been iterating on a filesystem API. The goal of
> this filesystem API is first and foremost to expose a sandboxed
> filesystem to webpages. This filesystem would be origin-specific and
> would not allow accessing the user's OS filesystem. This avoids a lot
> of the security concerns around filesystem APIs.
One interesting aspect of the WebKit API is that your webpage (or 
webapp) specifies upfront the amount of storage it requests. This is 
absent from your proposal. Is it on purpose? This would mean that 
storage limits are dealt with when they are reached instead of upfront.
I have no strong opinion either way, but feel it's an important question 
to address.
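
For reference, a minimal sketch of the upfront style, assuming the 
prefixed webkitRequestFileSystem(type, size, success, error) entry point 
as I understand it from WebKit/Chrome. The requestSandbox helper and its 
"win" parameter are hypothetical names of mine, for illustration only.

```javascript
// Hypothetical helper: wraps the callback-based WebKit entry point in a Promise.
// `win` is the global object (window in a browser); passing it in keeps the
// sketch usable outside a browser.
function requestSandbox(win, bytes) {
  return new Promise(function (resolve, reject) {
    // The requested size is declared here, before any file is written.
    win.webkitRequestFileSystem(win.TEMPORARY, bytes, resolve, reject);
  });
}

// Usage in a WebKit-based browser (asking for 5 MB of temporary storage):
// requestSandbox(window, 5 * 1024 * 1024).then(function (fs) { /* use fs.root */ });
```

With the proposal as written, there would be no "bytes" argument 
anywhere, so a write would presumably just fail once a limit is reached.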

> API summary:
>
> The proposed API introduces two new abstractions: A Directory object
> which allows manipulating files and directories within it, and a
> FileHandle object which allows holding an exclusive lock on a file
> while performing multiple read/write operations on it.
One downside of the WebKit API is that a file might be in use by a 
different process, so accessing it would fail, and there was no useful 
way to know when the file could be accessed again. 
The fact that the whole locking part is taken care of under the hood is 
an excellent feature.


> Apologies for using WebIDL here. I know it's not very popular with a
> lot of people on this list. And it's especially unfortunate in this
> API since the use of WebIDL to describe the API results in a lot of
> extra syntax in the description which doesn't actually affect the
> javascript that developers would write.
>
> Unfortunately I don't know of any other formal way of describing the
> API without spending tons of time typing up long descriptions of each
> function.
I've played with TypeScript and they have a very interesting interface 
language. You should check it out. Spec: [3]. Lots of examples: [4][5].
Maybe a mix between WebIDL and TypeScript interfaces (some things can be 
expressed in WebIDL, but not in TypeScript interfaces AFAIK) might work 
well.

As a side note, I don't think people hate WebIDL. I think people hate 
how some spec authors feel that writing in WebIDL is enough to define a 
good API. The fact that you came to ask for feedback on the API shows 
that you're not in that category ;-)

> partial interface URL {
>    static DOMString? getPersistentURL(File file);
> }
File inherits from Blob [6] and there is already URL.createObjectURL 
that works for Blob (and File as a consequence).
Is this method doing anything more?
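
For comparison, this is what the existing method already gives you. A 
small sketch using only Blob and URL.createObjectURL/revokeObjectURL; 
whether getPersistentURL is meant to outlive the document is only my 
guess from its name.

```javascript
// Blob is the base type; a File obtained from an <input> or from a
// filesystem API can be passed to createObjectURL the same way.
const blob = new Blob(["hello"], { type: "text/plain" });

// Existing API: returns a "blob:" URL that stays valid until it is
// revoked or the document goes away.
const objectURL = URL.createObjectURL(blob);

// Release the URL when it is no longer needed.
URL.revokeObjectURL(objectURL);
```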

> // WebIDL cruft that's largely transparent
> enum StorageType { "temporary", "persistent" };
> dictionary FilesystemParameters {
>    StorageType storage = "temporary";
> };
Thanks for not polluting the global object with constants as the 
WebKit/Blink implementation does :-)

> API Description:
>
> I won't go into the details about each function as it's hopefully
> mostly obvious. A few general comments:
>
> The functions on Directory that accept DOMString arguments for
> filenames allow names like "path/to/file.txt". If the function creates a
> file, then it creates the intermediate directories. Such paths are
> always interpreted as relative to the directory itself, never relative
> to the root.
>
> We were thinking of *not* allowing paths that walk up the directory
> tree. So paths like "../foo", "..", "/foo/bar" or "foo/../bar" are not
> allowed. This to keep things simple and avoid security issues for the
> page. Attempting to use a path that contains a segment that is equal
> to ".." or ".", or any path which starts with "/" will cause an error.
> This way we can add support for this later if desired.
>
> Likewise, passing a File object to an operation of Directory where the
> File object isn't contained in that directory or its descendents also
> results in an error.
>
>
> Open Questions:
>
> Finally, should we remove the Directory abstraction? It's not needed
> given that you can directly interact with files in subdirectories. But
> it does provide the ability to do some capability management. I.e.
> holding a Directory object enables you to interact with the files in
> that directory and its subdirectories, but there is no way to reach
> out to a parent directory. Directory objects also is a familiar
> concept in filesystem APIs, so it seems natural to have it even though
> it's not strictly needed.
I'd like to share an experience from working with Tizen. In their doc, 
they list the different ways to store data [2]. Aside from the obsolete 
WebSQL, we have:
- localStorage
- IndexedDB
- FileSystem (the WebKit one)
(they put app cache in this category, but I don't really see it as a 
storage mechanism)

It left me thinking that this is a lot of different ways to store 
information, but they serve different purposes: localStorage is 
key(string)/value(string), IndexedDB is useful to store more structured 
data. But what is FileSystem?
To a first approximation, a FileSystem is a key(string)/value(binary 
data) storage system (but the fine-grained, async access to the value 
makes it better than localStorage when that matters). Keys are strings 
(where '/' has particular semantics). In OS FileSystems, this first 
approximation is wrong because specific directories can have different 
rights (rwx) assigned. But this isn't a feature web apps need (at least 
I haven't seen this need expressed when it comes to data storage).

In the Tizen application, I wrote an abstraction on top of the 
FileSystem to make it an async key(string)/value(Blob) storage, because 
that's what we really needed (it had the same interface as the async 
key/value storage abstraction I wrote on top of localStorage, which 
was awesome).
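
To give an idea of the shape of such an interface, here is a minimal 
sketch. It is not the code from the Tizen app: the class and method 
names are hypothetical, Promises are my choice for the async part, and 
an in-memory Map stands in for the real backend (localStorage or files).

```javascript
// Hypothetical async key/value store. The Map stands in for whatever does
// the real work (localStorage, or files in a sandboxed filesystem).
class AsyncKeyValueStore {
  constructor() {
    this.entries = new Map();
  }

  // Resolves with the stored value, or undefined if the key is absent.
  get(key) {
    return Promise.resolve(this.entries.get(key));
  }

  // Stores the value under the key; resolves once it is stored.
  set(key, value) {
    this.entries.set(key, value);
    return Promise.resolve();
  }

  // Resolves with true if an entry was removed, false otherwise.
  delete(key) {
    return Promise.resolve(this.entries.delete(key));
  }
}

// Usage: keys may contain '/', but the store gives it no special meaning.
// const store = new AsyncKeyValueStore();
// store.set("avatars/david.png", someBlob)
//   .then(function () { return store.get("avatars/david.png"); });
```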

There are lots of points above (dealing with relative '..' paths, 
intermediate directories, making sure a file is within a directory 
subtree, etc.) that relate to Directory and that would simply disappear 
if the Directory abstraction were removed. Since it's not really needed 
from the data storage perspective, I'd be in favor of removing it.
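
As an illustration of the kind of check involved, here is a sketch of 
just the path rule quoted earlier (no leading "/", no "." or ".." 
segment). The function name is hypothetical, and cases the proposal 
does not mention (empty segments, for instance) are left alone.

```javascript
// Returns true if `path` would be accepted under the quoted rule:
// it must not start with "/", and no segment may equal "." or "..".
function isAllowedPath(path) {
  if (path.startsWith("/")) {
    return false;
  }
  return path.split("/").every(function (segment) {
    return segment !== "." && segment !== "..";
  });
}

// isAllowedPath("path/to/file.txt") is accepted;
// "../foo", "..", "/foo/bar" and "foo/../bar" are all rejected.
```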

One argument in favor of Directory I have read is about handing off only 
a directory (instead of the whole filesystem) to partially trusted code, 
but that could be solved if the FileSystem interface provides something 
like:
     var prefixedFileSystem = fs.createPrefixedSubFileSystem(prefix);
Worst case, this is something that can be easily implemented as a 
library. I don't think we need a Directory abstraction and all the 
complications that come along with it to solve that particular use case. 
The people who need this sort of compartmentalization will figure it out.
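
A rough sketch of what such a library could look like, assuming a flat 
store whose keys are opaque strings (so ".." has no meaning) and which 
exposes async get/set. All names here are hypothetical, including the 
in-memory store used as a stand-in.

```javascript
// Wraps a flat async key/value store so that every key is forced under
// `prefix`. Since keys are opaque strings, prefix + key can never name
// an entry outside the prefix.
function createPrefixedStore(store, prefix) {
  return {
    get: function (key) {
      return store.get(prefix + key);
    },
    set: function (key, value) {
      return store.set(prefix + key, value);
    }
  };
}

// Minimal in-memory backing store, for illustration only.
function createMemoryStore() {
  const entries = new Map();
  return {
    get: function (key) {
      return Promise.resolve(entries.get(key));
    },
    set: function (key, value) {
      entries.set(key, value);
      return Promise.resolve();
    }
  };
}

// Usage: the partially trusted code only ever receives `pluginStore`.
// const whole = createMemoryStore();
// const pluginStore = createPrefixedStore(whole, "plugins/untrusted/");
// pluginStore.set("config.json", "{}"); // stored as "plugins/untrusted/config.json"
```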

> However it is expected that this API will eventually also be used for
> accessing real filesystems eventually
In my reasoning above, I considered the file system purely from the 
data storage perspective and concluded that the Directory abstraction 
isn't necessary.
But "accessing real filesystems" is a very different use case than data 
storage. Interacting with a real filesystem means (or can mean, 
depending on the level of granularity you want to go to) taking care of 
things like per-directory rights, etc.

If the eventual goal is to get something as precise as the Unix API for 
a file system API, I'd agree with Rick that the Node.js fs API is 
something to look at.
If we just want a better key/value storage, a simpler API (no Directory 
abstraction) might just be good enough.

It isn't entirely clear to me yet which one we want. Do we want to solve 
both use cases with the same API? Maybe the DataStore API [7] can be the 
better "flat" key/value API and the FileSystem be a low-level FileSystem 
API (and needs Directory)?

David

[1] http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0251.html
[2] https://developer.tizen.org/help/index.jsp?topic=%2Forg.tizen.web.w3c.apireference%2Fw3c_api.html
[3] http://www.typescriptlang.org/Content/TypeScript%20Language%20Specification.pdf
[4] https://github.com/borisyankov/DefinitelyTyped/
[5] https://typescript.codeplex.com/sourcecontrol/latest#typings/lib.d.ts
[6] http://www.w3.org/TR/FileAPI/#file
[7] https://wiki.mozilla.org/WebAPI/DataStore

Received on Saturday, 10 August 2013 21:54:59 UTC