[whatwg] Uploading directories of files

This thread (of which some especially salient points are included below) 
requested the addition of a feature or the codifying of a convention for 
uploading directory tree structures in <input type=file>.

I think that including relative directory paths with uploads is a quite 
reasonable feature. However, I don't think we really yet have good 
experience with what the right conventions should be. I would recommend 
that browser vendors experiment with this feature and report back their 
findings, so that we can converge on a proven convention.


On Thu, 10 Dec 2009, Ian Fette (イアンフェッティ) wrote:
>
> Many sites allow you to upload multiple files, often images. HTML5 
> allows this via <input type="file" multiple>. This works well when your 
> files are all in one folder, but it may often be the case that files are 
> spread across sub-folders, and in this case you have to do multiple 
> transactions (or multiple <input type=file multiple> tags, which is just 
> awkward) to upload your files.
> 
> PROPOSAL: Allow a UA to recursively select and upload a directory of 
> files. How the UA chooses to modify the file picker dialog is outside 
> the scope of this spec, but for the sake of argument, assume that the UA 
> lets you pick a folder and say "upload all". Allow the UA to upload the 
> files in the folder, with the subdirectories included in the filename 
> with a directory separator.
> 
> E.g. assume I have:
> 
> C:\users\ian\a\b\1.jpg
> C:\users\ian\a\b\2.jpg
> C:\users\ian\a\c\3.jpg
> 
> If the user chooses "a", the UA should be allowed to send all three 
> files with filenames:
> 
> "a/b/1.jpg"
> "a/b/2.jpg"
> "a/c/3.jpg"
> 
> as it would for the existing <input type=file multiple> implementation, 
> with the addition of the directories and path separators (not full path, 
> just the directory the user chose and sub-paths.)
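
A minimal sketch of the naming rule in this proposal, assuming the UA knows 
the absolute path of the directory the user picked. The function name 
upload_names and its parameters are hypothetical, for illustration only; 
nothing here is taken from the spec or from any browser.

```python
from pathlib import PureWindowsPath


def upload_names(chosen_dir, files):
    """Return submitted names relative to the parent of the chosen directory.

    The chosen directory itself is kept as the first component and the
    separator is normalised to "/", as in the example above.
    """
    base = PureWindowsPath(chosen_dir).parent
    return [PureWindowsPath(f).relative_to(base).as_posix() for f in files]


names = upload_names(
    r"C:\users\ian\a",
    [
        r"C:\users\ian\a\b\1.jpg",
        r"C:\users\ian\a\b\2.jpg",
        r"C:\users\ian\a\c\3.jpg",
    ],
)
print(names)
```

Nothing above the chosen directory (here C:\users\ian) appears in the 
resulting names.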

On Thu, 10 Dec 2009, Jonas Sicking wrote:
> 
> I don't think there is anything in the spec preventing you from doing 
> this right now. The fact that only files in the same folder can be 
> selected is a limitation in the implementation, not a limitation in the 
> spec.
> 
> The spec does require that only the leaf name, without any paths, is 
> submitted. Is that a problem?
> 
> I guess I'd be ok with changing the spec to allow more of the path to be 
> exposed. However that would mean that there is a mismatch between what 
> name is submitted and what name you'd get from input.files[n].name.

On Thu, 10 Dec 2009, Ian Fette (イアンフェッティ) wrote:
> 
> I think that the notion of allowing more of the path to be exposed and 
> reconciling that with .name is where the problem lies, and would like to 
> figure out if we could resolve that. I think that there is a case to be 
> made for including the paths -- e.g. if I'm uploading photos to flickr, 
> picasa, or facebook, I may have already organized them locally, there's 
> no reason that I shouldn't be able to maintain that structure when I 
> upload to the web application. The question is then how that gets 
> reconciled with input.files[n].name -- I would think it preferable if 
> .name also were allowed to contain that extra information -- currently 
> we say "The name of the file. There are numerous file name variations on 
> different systems; this is merely the name of the file, without path 
> information.". I guess I would propose that be changed to "The name of 
> the file. There are numerous file name variations on different systems; 
> this is merely the name of the file. If the user agent allows for files 
> from multiple directories to be selected and included in a single 
> FileList, path information may be included to distinguish between the 
> files, provided that such path information SHOULD NOT include 
> information about any path components that are common to all of the 
> Files in the FileList."
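
The proposed wording can be sketched as follows, assuming names already 
use "/" as the separator. The helper strip_common_dirs is hypothetical. 
Read literally, the rule would also drop the leading "a" from the earlier 
example, since that component is common to all three files.

```python
def strip_common_dirs(paths):
    """Drop the leading directory components shared by every path.

    Only directory components are compared; the leaf name is always kept.
    """
    split = [p.split("/") for p in paths]
    dirs = [parts[:-1] for parts in split]
    common = 0
    # zip stops at the shallowest directory depth among the paths.
    for components in zip(*dirs):
        if len(set(components)) != 1:
            break
        common += 1
    return ["/".join(parts[common:]) for parts in split]


print(strip_common_dirs(["a/b/1.jpg", "a/b/2.jpg", "a/c/3.jpg"]))
print(strip_common_dirs(["a/b/1.jpg", "a/b/2.jpg"]))
```

When every file comes from the same directory, this reduces to the plain 
leaf names that .name carries today.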

On Thu, 10 Dec 2009, Jonas Sicking wrote:
> 
> If we're going to expose a full or partial path, then I think we should 
> do that separately from the .name property. I'd rather keep the .name 
> strictly be the leaf name.
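
One way to picture this separation, as a sketch only: a receiver splits 
whatever name was submitted into a path hint and a leaf name. The function 
split_submitted_name is hypothetical, and accepting both "/" and "\" as 
separators is an assumption made here, not something the spec states.

```python
import re


def split_submitted_name(filename):
    """Split a submitted name into (path_hint, leaf_name).

    Both "/" and "\\" are treated as separators (an assumption); empty
    components are ignored.
    """
    parts = [p for p in re.split(r"[/\\]", filename) if p]
    if not parts:
        return ("", "")
    return ("/".join(parts[:-1]), parts[-1])


print(split_submitted_name("a/b/1.jpg"))
print(split_submitted_name("1.jpg"))
```

The leaf name is available whether or not any path information was sent.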

On Fri, 11 Dec 2009, Markus Ernst wrote:
> 
> If I understand you correctly, this would lead to differences in file 
> names based on the UA, and even based on the folder that the user 
> actually chose to upload. See your example:
> 
> C:\users\ian\a\b\1.jpg
> C:\users\ian\a\b\2.jpg
> C:\users\ian\a\c\3.jpg
> 
> 1. The user manually selects files 1.jpg and 2.jpg in directory b. The 
> resulting filename of the first file is "1.jpg".
> 
> 2. The other day the user does an update, but this time selects 
> directory b and does "upload all". Resulting filename: "b/1.jpg".
> 
> 3. For the next update the user wants to easily upload all 3 files, 
> which results in: "a/b/1.jpg".
> 
> 4. Then the same action is done from another computer with a different 
> UA, the file might again be named "1.jpg".

On Fri, 11 Dec 2009, Jeremy Orlow wrote:
> 
> Personally, I don't think the case Markus pointed out is at all a show 
> stopper.  In the case of images, the server could easily recognize and 
> reconcile duplicates (by hashing them and looking for duplicate hashes 
> or something).  If the image has been tweaked some in the mean time, the 
> EXIF data can help.  And so on....this seems like the type of thing 
> clever developers can work around.
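
The hashing idea can be sketched on the server side as below. This is an 
illustration under assumptions, not a description of any real service: the 
function group_by_content and the (name, bytes) input shape are 
hypothetical, and SHA-256 is just one reasonable choice of digest.

```python
import hashlib


def group_by_content(uploads):
    """Group submitted names by the SHA-256 digest of the file content.

    `uploads` is an iterable of (submitted_name, content_bytes) pairs.
    Byte-identical files land in the same group regardless of how much
    path information the UA included in the name.
    """
    groups = {}
    for name, data in uploads:
        digest = hashlib.sha256(data).hexdigest()
        groups.setdefault(digest, []).append(name)
    return groups


uploads = [
    ("1.jpg", b"same image bytes"),
    ("b/1.jpg", b"same image bytes"),
    ("a/b/1.jpg", b"same image bytes"),
    ("a/c/3.jpg", b"other image bytes"),
]
for digest, names in group_by_content(uploads).items():
    print(digest[:8], names)
```

This only catches byte-identical copies; a file that was edited between 
uploads gets a different digest and needs some other signal.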
> 
> But regardless.....I don't think you could argue that having _some_ path 
> information is worse than _none_, right?
> 
> I also agree with Jonas that if some path information is added, it might 
> be better to create a new property (other than .name) for it.
> 
> And, with or without that extra property, I think what Ian's suggesting 
> would be useful to users.

On Fri, 11 Dec 2009, Markus Ernst wrote:
> 
> Yes I see Anne's and your points. Anyway I don't see yet how to get 
> _useful_ path information, as the same file can be posted as /a/b/1.jpg, 
> and at the next occasion as 1.jpg or /b/1.jpg, just based on where in 
> the upload dialog you did make the start point.
> 
> Relying on information contained in the uploaded file does not seem to 
> make sense to me, as you might want to upload a new file with the same 
> name in order to replace the old one.

On Fri, 11 Dec 2009, Jeremy Orlow wrote:
> 
> The information in the path could be seen as a hint that may or may not 
> be provided.  I feel like it'd be difficult security wise to guarantee 
> that the hint will be there and/or consistent from upload to upload.  
> But, once again, some hint is better than none, right?  If you as a web 
> developer don't think it's useful, you can ignore it, right?

On Sun, 13 Dec 2009, Jonas Sicking wrote:
> 
> The only change needed as far as I can tell is to say that *if* the File 
> objects contain any path information, that that path information is 
> included as part of the filename when the data is submitted.

The spec doesn't actually disallow that currently; it just refers to "the 
name" without saying whether a name can include / or \ characters.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 12 February 2010 00:06:18 UTC