- From: Gordon P. Hemsley <me@gphemsley.org>
- Date: Wed, 28 Aug 2013 10:36:14 -0400
- To: whatwg@lists.whatwg.org
On 8/28/13 9:32 AM, Anne van Kesteren wrote:
> We have thought of three approaches for zip URL design thus far:
>
> * Using a sub-scheme (zip) with a zip-path (after !):
> zip:http://www.example.org/zip!image.gif
> * Introducing a zip-path (after %!): http://www.example.org/zip%!image.gif
> * Using media fragments: http://www.example.org/zip#path=image.gif
>
> High-level drawbacks:
>
> * Sub-scheme: requires changing the URL syntax with both sub-scheme
> and zip-path.
> * Zip-path: requires changing the URL syntax.
> * Fragments: fail to work well for URLs relative to a zip archive.
>
> Fragments are conceptually the cleanest as the only part of a URL
> that's supposed to depend on the Content-Type is the fragment.
> However, if you want to link to an ID inside an HTML resource you'd
> have to do #path=test.html&id=test which would require adding
> knowledge to the HTML resource that it is contained in a zip archive
> and have special processing based on that. And not just HTML, same
> goes for CSS or JavaScript.
>
> I'm not sure we need to consider sub-scheme if zip-path can work as
> it's more complex and not very well thought out. E.g. imagine
> view-source:zip:http://www.example.org/zip!test.html. (I hope we never
> need to standardize view-source and that it can be restricted to the
> address bar in browsers.)
>
> zip-path makes zip archive packaging by far the easiest. If we use %!
> as separator that would cause a network error in some existing
> browsers (due to an illegal %), which means it's extensible there,
> though not backwards compatible.
>
> We'd adjust the URL parser to build a zip-path once %! is encountered.
> And relative URLs would first look if there's a zip-path and work
> against that, and use path otherwise.
>
> Fetching would always use the path. If there's a zip-path and the
> returned resource is not a zip archive it would cause a network error.
>
> As for nested zip archives. Andrea suggested we should support this,
> but that would require zip-path to be a sequence of paths. I think we
> never went to allow relative URLs to escape the top-most zip archive.
> But I suppose we could support in a way that
>
> %!test.zip!test.html
>
> goes one level deeper. And "../image.gif" in test.html looks in the
> enclosing zip. And "../../image.gif" in test.html looks in the
> enclosing zip as well because it cannot ever be relative to the path,
> only the zip-path.
>

As the following URLs suggest, the %! (or %-anything) will likely not
work for ZIP files generated by a script using the query portion of the
URL, as the path information will be subsumed into the last value
without causing a network error:

http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%!example.png
http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%/example.png
http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1?example.png

(And feel free to use that script to try out any other combos.)

However, since fragments (i.e. anything beginning with '#') are already
not sent to the server, what if you modified the URL parser to use a
special hash-prefix combo that indicates the path? Then you could avoid
the problem of having to make documents aware of the fact that they're
in a ZIP because the hash-prefix combo would come before the plain hash
which holds the ID.

So, for example:

http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1#/example.html#middle

Then you could also take the opportunity to spec the #! prefix (and
other hash-combo prefixes) that is used by a lot of sites nowadays.

-- 
Gordon P. Hemsley
me@gphemsley.org
http://gphemsley.org/
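For illustration only, here is a rough Python sketch of how the
hash-prefix split proposed above might behave. Nothing here is specced:
the names split_hash_prefix and ZipUrlParts are made up for this
example, and treating "#/" as the prefix and the next "#" as the start
of the ordinary fragment are assumptions taken from the example URL. The
last line also shows, using Python's own URL splitter, that a %! placed
after a query string simply stays inside the query.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ZipUrlParts(NamedTuple):
    """Hypothetical result type; not part of any URL specification."""
    fetch_url: str            # everything before the first '#'; what gets fetched
    zip_path: Optional[str]   # path inside the archive, or None if no "#/" prefix
    fragment: Optional[str]   # ordinary fragment (e.g. an element ID), or None


def split_hash_prefix(url: str) -> ZipUrlParts:
    """Split a URL on an assumed "#/" prefix followed by an optional plain '#'."""
    fetch_url, hash_sign, rest = url.partition("#")
    if not hash_sign:
        # No fragment at all.
        return ZipUrlParts(fetch_url, None, None)
    if not rest.startswith("/"):
        # A plain fragment; there is no zip path.
        return ZipUrlParts(fetch_url, None, rest)
    # Drop the leading '/', then split off the ordinary fragment, if any.
    zip_path, inner_hash, fragment = rest[1:].partition("#")
    return ZipUrlParts(fetch_url, zip_path, fragment if inner_hash else None)


base = "http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1"

# The hash-prefix form: the zip path and the ID are both kept out of the request.
parts = split_hash_prefix(base + "#/example.html#middle")

# The %! form: the would-be zip path ends up inside the query string.
percent_query = urlsplit(base + "%!example.png").query
```

In this sketch parts.fetch_url is the script URL with its query intact,
parts.zip_path is "example.html", and parts.fragment is "middle", while
percent_query still ends in "%!example.png".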
Received on Wednesday, 28 August 2013 14:36:49 UTC