From: Glenn Maynard <glenn@zewt.org>
Date: Wed, 16 Nov 2011 13:21:08 -0500
On Wed, Nov 16, 2011 at 3:42 AM, Jonas Sicking <jonas at sicking.cc> wrote:
> > That requires a full directory traversal in advance to find all of the
> > files, though; the tree could be very large.
>
> You need to do that anyway to implement the .files attribute, no?

.files shouldn't recursively include all files inside directories. (If you actually select tens of thousands of files and drag them, then yes, but in most cases when you have that many files, they're split into directories and you don't normally drag them individually.)

On Wed, Nov 16, 2011 at 9:59 AM, Kinuko Yasuda <kinuko at chromium.org> wrote:
> The unsandboxed storage and actual data doesn't belong to origin, but
> the 'origin-specific' concept can be applied to the filesystem
> namespace.
>
> I haven't thought about workers cases deeply yet, but am thinking that
> we should prohibit access to the dropped folders from the other pages
> than the one that received the drop event.

Access to a file should just be limited by whoever has an Entry object pointing at it. The Entry object is essentially a token granting access to its associated file(s).

> As for the entry URLs I'm planning to make the URLs to the dropped
> entries and the filesystem namespace (that only contains the dropped
> files) expire when the page goes away, hoping this would largely
> simplify the lifetime and security issues.

I don't think it's possible to do this correctly, because URLs created with toURL have no equivalent to revokeObjectURL. A long-running page has no way to avoid "leaking" these references until the page exits. Adding a revoke method for toURL would essentially turn it into URL.createObjectURL.

Needing to revoke URLs when dealing with worker communication also makes it very hard for users to get it right. For example, suppose a Window sends a toURL-generated URL to a Worker. How do you ensure that the URL is revoked after the worker has received it and finished converting it back to an Entry? The Worker might be killed (e.g. due to CPU quotas) at any time, making resource leaks very hard to avoid. These are just the usual problems with manual resource management, which should be avoided if at all possible. We already have a mechanism that cleanly avoids all of this, with structured clone and File.
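Concretely, the structured-clone path looks something like this (untested sketch; fileEntry is a FileEntry the page already holds, and "worker.js" is a stand-in script name); note there's no cleanup step to get wrong:

    // Main page: the File itself is posted; no URL is minted,
    // so there's nothing to revoke if the Worker is killed mid-task.
    var worker = new Worker("worker.js");
    fileEntry.file(function(file) {
        worker.postMessage({file: file});  // File is structured-cloneable
    });

    // worker.js:
    onmessage = function(e) {
        var reader = new FileReaderSync();
        var contents = reader.readAsArrayBuffer(e.data.file);
        // ... process contents; nothing to release when done ...
    };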
> > Off-hand, the main issue that directly affects reading is that most
> > non-Windows filesystems can store filenames which can't be
> > represented by a DOMString, such as invalid codepoints (most
> > commonly mismatched encodings).
>
> How do they appear in File.name in existing .files approach?

I don't have a Linux browser to check. I'm guessing it won't inform us much here, since the .files approach didn't have to worry about general filesystem access.

> A naive solution in filesystem approach would be silently ignoring
> such files (probably bad) or having in-memory path mapping (would be
> slightly better). For limited read-only drag-and-drop cases we
> wouldn't need to think about remapping and the mapping could just go
> away when the page goes away, so hopefully implementing such mapping
> wouldn't be that hard.

There are probably some cases that we'll just have to accept will never work perfectly, and design with that in mind.

To take a common case, suppose a script does the following, a commonplace method for safe file overwriting (relatively safe, anyway; the needed flush operations don't exist here):

1. Create a file with the name filename + ".new".
2. Write the new file contents to the file.
3. Rename filename + ".new" to filename, overwriting the original file.
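Against the FileSystem API, those steps would be roughly (untested sketch; dir is a DirectoryEntry for the target directory, newContentsBlob a Blob holding the new data, and error callbacks omitted for brevity):

    // Step 1: create "filename.new" alongside the original.
    dir.getFile(filename + ".new", {create: true}, function(tempEntry) {
        // Step 2: write the new contents into the temporary file.
        tempEntry.createWriter(function(writer) {
            writer.onwriteend = function() {
                // Step 3: rename over the original; moveTo onto an
                // existing file replaces it.
                tempEntry.moveTo(dir, filename);
            };
            writer.write(newContentsBlob);
        });
    });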
This is a useful case: it's real-world (I've done this countless times), and it's a case where unrepresentable filenames affect both reading and writing, plus the auxiliary operation of renaming.

I suppose the mapping approach could work here. Associate the mapping with the DirectoryEntry containing it, from invalid filenames to generated filenames. Then, if the invalid filename is "X" and the DOMString mapping is "MAPPING1", this would first create the literal filename "MAPPING1.new", followed by renaming it to the original "invalid" filename "X". (In particular, though, I think it should not be possible to create *new* garbage filenames on people's systems that didn't exist to begin with. That is, it should map to the filenames that really exist, not just do string escaping.) This is complex, though, and leads to new questions, like how long the mappings last if the underlying file is deleted.

As a data point, note that most Windows applications are unable to access files whose filenames can't be represented in the current ANSI codepage. That is, if you're on a US English system, you can't access filenames with Japanese in them. (Unicode applications can, but tons of applications on Windows aren't Unicode; Windows has never made it simple to support Unicode.) If users find that reasonable, it might not be worth all this for the even rarer case of illegal codepoints on Linux.

> Yup, writing side would have tougher issues, and that's why I started
> this proposal only with read-only scenarios. (I agree that it'd be
> good to give another thought about unsandboxed writing cases though)

For what it's worth, I think the only sane approach here is an isolated break from attempting to make everything interoperable: let the platform's limitations be visible. (That is, fail file creation if the path depth or filename length is too long for the platform; succeed with file creation even if it would fail on a different platform, and so on.) I think this is just inherent to allowing this sort of access to real filesystems, and trying to avoid it just causes other, stranger problems. (For example, if you prevent creating filenames on Linux which are illegal on Windows, then things get strange if an "illegal" filename already exists on a filesystem where it's not actually disallowed.)

On Wed, Nov 16, 2011 at 12:01 PM, Eric U <ericu at google.com> wrote:
> While the URL format for non-sandboxed files has yet to be worked out,
> I think we need toURL to work no matter where the file comes from.
> It's already the case that an Entry can expire if the underlying file
> is deleted or moved;

But there's no revocation mechanism for toURL URLs.

Also, if toURL URLs to non-sandboxed storage expire with the context they were created in (which they would have to, I think), it loses a whole category of use cases covered by structured clone: the ability to persist an access token. For example, the spec allows storing a File within a History state. That allows history navigation to restore its state properly: if the user opened a local picture in an image viewer app, navigating through history can correctly show the files in older history states, and even restore correctly through browser restarts and session restores. The same should apply to Entry and DirectoryEntry. (Nobody implements this yet, as far as I know, but I hope it'll happen eventually. It's a limitation today, and it'll become a more annoying one as local file access mechanisms like this one are fleshed out.)

Also, if non-sandboxed toURL URLs are same-origin only, then that also loses functionality that structured cloning allows: using Web Messaging to pass an access token to a page with a different origin. (This is much safer than allowing cross-origin use of the URLs, since it's far easier to accidentally expose a URL string than to accidentally transfer an object.)
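That would look roughly like this (untested sketch; viewerWindow is a handle to a page on another origin, and "https://viewer.example" is a stand-in for that origin):

    // Sender: hand the File itself to the other page, and only to it.
    viewerWindow.postMessage({file: file}, "https://viewer.example");

    // Receiver, running on https://viewer.example:
    window.onmessage = function(e) {
        var file = e.data.file;  // arrives as a structured clone
        // ... read it with FileReader as usual ...
    };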
File API has already solved all of this by using structured clone. I think it makes a lot of sense to follow its lead.

--
Glenn Maynard

Received on Wednesday, 16 November 2011 10:21:08 UTC