Re: jar protocol

I've been looking at some of these alternatives for my RO Bundle
specification, which is basically a ZIP file.

http://purl.org/wf4ever/ro-bundle/2013-05-10/#absolute-uris

I have not yet decided which of these schemes to use in my
approach, which is why the section linked above lists them as
considerations.




I considered the "zip as a folder" approach - which is nice if
you have access to the server and can do magic to actually serve the
contents of the ZIP file, but in the common case gives misleading 404
errors.

If the ZIP is on a domain out of your control, you also run the (quite
small) risk that when you mint http://example.com/bundle.zip/fred.jpeg
that URL might actually identify a different resource on that server.

If the URL to the original ZIP has a query parameter, you are in trouble.
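
A quick sketch of the query-parameter problem, using Python's
urllib.parse. The URL with "?version=2" is made up for illustration;
the point is only what happens when the entry path is appended naively:

```python
from urllib.parse import urlsplit

# Hypothetical ZIP URL that carries a query parameter.
zip_url = "http://example.com/bundle.zip?version=2"

# "ZIP as a folder": naively append the path of the entry inside the ZIP.
entry_url = zip_url + "/fred.jpeg"

parts = urlsplit(entry_url)
print(parts.path)   # the path still ends at the ZIP itself
print(parts.query)  # the entry path has been swallowed by the query string
```

So the entry name ends up in the query rather than in the path, and
you would need some other convention to say where the ZIP URL stops.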



The JAR scheme is also not particularly URI-compliant as a URI
scheme: it is not hierarchical, as it does not have //, so you
can't formally resolve relative URIs like "../" within it (and in
fact, if you do, you easily climb outside the magic !/ marker) - even
java.net.URI gets this wrong because it's an "opaque" URI without //.
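
To illustrate both halves of that, here is a rough sketch with
Python's urllib.parse.urljoin rather than java.net.URI, so treat it as
an analogy only. The jar: URI and the file names are invented for the
example:

```python
from urllib.parse import urljoin

# Made-up JAR-style URI pointing at an entry inside a ZIP.
jar_uri = "jar:http://example.com/bundle.zip!/dir/file.txt"

# "jar" is not a scheme urllib treats as hierarchical, so the base is
# ignored and the relative reference comes back unresolved.
unresolved = urljoin(jar_uri, "../other.txt")

# Workaround of stripping "jar:" and resolving against the inner URL:
# two levels of ".." step over the "!/" marker and out of the ZIP.
inner = jar_uri[len("jar:"):]
escaped = urljoin(inner, "../../other.txt")

print(unresolved)
print(escaped)
```

The first call leaves the relative reference untouched; the second
lands on a plain http URL on the server, outside the bundle.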


I actually like the widget URI scheme (which is why I am on this list)
- but of course it's not resolvable unless you happen to know which
ZIP file contains widget://8191dee8-0b8e-452d-8d64-7706a140185e/.

Internally in my application I actually 'cheat' and use an MD5 of the
URI to the ZIP file so that I get consistent widget URIs for the same
file - but this is a bit dangerous, as the MD5 of
"file:///tmp/file.zip" would give the same widget URI at different
times and on different machines, even if the file contents differ.
(Using the SHA1 checksum of the ZIP file itself is considerably more
reliable, but not an option if the file is to be modified.)
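
A minimal sketch of the two options in Python. How the digest is
turned into the UUID part of the widget URI is my assumption here
(MD5 bytes used directly, SHA1 truncated to 16 bytes), and the helper
names and sample bytes are hypothetical:

```python
import hashlib
import uuid

def widget_uri_from_location(zip_uri: str) -> str:
    # Assumption: fold the 16-byte MD5 digest of the ZIP's URI into a UUID.
    digest = hashlib.md5(zip_uri.encode("utf-8")).digest()
    return f"widget://{uuid.UUID(bytes=digest)}/"

def widget_uri_from_content(zip_bytes: bytes) -> str:
    # Assumption: use the first 16 bytes of the SHA-1 of the ZIP's content.
    digest = hashlib.sha1(zip_bytes).digest()[:16]
    return f"widget://{uuid.UUID(bytes=digest)}/"

location = "file:///tmp/file.zip"
old_bytes = b"first version of the ZIP"    # stand-in for real ZIP bytes
new_bytes = b"second version of the ZIP"   # same path, modified file

# Same location, so the same widget URI whatever the file contains.
print(widget_uri_from_location(location))
# Content-based URIs differ as soon as the bytes differ.
print(widget_uri_from_content(old_bytes))
print(widget_uri_from_content(new_bytes))
```

The location-based URI is stable but says nothing about the content;
the content-based one changes on every modification, which is exactly
why it does not work for a bundle that is still being edited.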



On 10 May 2013 14:54, Robin Berjon <robin@w3.org> wrote:
> Hi Brian,
>
>
> On 10/05/2013 15:32 , Brian Kardell wrote:
>>
>> Would it be possible (not suggesting this would be the  common story) to
>> reference a zipped asset directly via the full url, sans a link tag?
>
>
> Can you hash out a little bit more how this would work? I'm assuming you
> mean something like:
>
>   <img src='/bundle.zip/img/dahut.jpg'>
>
> Without any prior set up on the client to indicate that /bundle.zip is a
> bundle. This causes the browser to issue GET /bundle.zip/img/dahut.jpg
>
> At that point, the server can:
>
>   a) return a 404;
>   b) extract the image and return that;
>   c) return bundle.zip with some header information telling the browser that
> it's not an image but that the "/bundle.zip" part of the URL matched
> something else and it should look inside it for the rest of the path.
>
> Neither (a) nor (b) are very useful to us. (c) could be made to work, but
> it's not exactly elegant. The server would also have to know if the UA
> supports (c), and fall back to (b) if not, which means that some signalling
> needs to be made in the request. That's also not entirely nice (and it would
> have to happen on every request since the browser can't guess).
>
> It gets particularly nasty when you have this:
>
>   <img src='/bundle.zip/img/dahut.jpg'>
>   <img src='/bundle.zip/img/unicorn.jpg'>
>   <img src='/bundle.zip/img/chupacabra.jpg'>
>   <img src='/bundle.zip/img/robin-at-the-beach.jpg'>
>
> The chances are good that the browser would issue several of those requests
> before the first one returned with the information telling it to look in the
> bundle. That means it would return the bundle several times. Definitely a
> loss.
>
> Or did I misunderstand what you had in mind?
>
>
> --
> Robin Berjon - http://berjon.com/ - @robinberjon
>



--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/

Received on Friday, 10 May 2013 15:19:56 UTC