[whatwg] Web Documents off the Web (was Web Archives)

On 4/17/07, Thomas Broyer <t.broyer at gmail.com> wrote:

> I hope you're talking about GZip or BZip2, not application/zip?

Doesn't matter to me - I just figure some sort of compression would help,
and it would probably help if that compression was supported by browsers, so
gzip sounds right.

> The problem is the current browser support for .mht and support for
> generating/loading .mht files with binary attachments.


Which appears to be halfway there in the major browsers.

> The method for reading Web pages off line is subscription, not downloading.
> Your browser should support subscription.  Enable it for your favorite
> pages and you are done.


Maybe in your browser, but that doesn't let you store the page on disk apart
from your browser, or transfer it to someone else (via email, web download,
p2p) as a self-contained document (e.g. a PowerPoint-style presentation).

.mht looks good because it can retain original URLs of online resources,
it's fairly human readable and debuggable, and it already has a standard and
some support.  An HTML document can reference its external parts (images,
css) via either cid: URIs or the original HTTP URL as long as all the right
Content-Location headers are present.
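
To make the .mht option concrete, here is a rough sketch in Python of a
multipart/related file where every part carries a Content-Location header
with its original URL.  The function name build_mht and the example.com URLs
are made up for illustration, and this says nothing about what any
particular browser accepts or produces.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(html, page_url, images):
    """Build a multipart/related message (hypothetical helper).

    images maps each original image URL to a (subtype, raw bytes) pair.
    """
    root = MIMEMultipart("related", type="text/html")

    # The root HTML part keeps the URL the page was fetched from.
    page = MIMEText(html, "html")
    page.add_header("Content-Location", page_url)
    root.attach(page)

    # Each binary part is base64-encoded and labelled with its original
    # URL, so the HTML can keep referring to it by that URL.
    for url, (subtype, data) in images.items():
        part = MIMEImage(data, _subtype=subtype)
        part.add_header("Content-Location", url)
        root.attach(part)

    return root.as_bytes()
```

Parsing the result back with message_from_bytes() and reading each part's
Content-Location is enough to map the URLs in the HTML to the attached
bodies.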

A single compressed file (.zip?) looks good because of the size and how
easily it can be unpacked and used with a browser that doesn't natively
support the single compressed file.  I don't know what URI scheme an HTML
document would use to reference images and CSS.
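
One way to sidestep the URI scheme question, purely as an assumption on my
part, is to have the HTML use relative URLs and treat the archive as a
directory tree.  The sketch below (Python, with made-up helper names
pack_page and unpack_page, and index.html assumed as the entry point) packs
such a tree into a .zip and unpacks it somewhere a browser can open it.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def pack_page(files):
    """Pack {relative path: bytes} into an in-memory .zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def unpack_page(blob, dest=None):
    """Extract the archive and return the path to index.html.

    If dest is not given, a fresh temporary directory is created.
    """
    dest = Path(dest or tempfile.mkdtemp())
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        zf.extractall(dest)
    return dest / "index.html"
```

A browser with no native support for the archive can then be pointed at the
returned index.html, and relative references such as style.css or
img/a.png resolve against the unpacked directory.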

The only other thing I can think of is an HTML document that uses data: URIs
to reference its external parts (e.g. a CSS file) which also use data URIs
to reference their external parts (e.g. background images).
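
A small sketch of that nesting in Python, assuming base64-encoded data: URIs
throughout.  The helper names to_data_uri and inline_page are made up for
illustration, and the image bytes are a placeholder rather than a real GIF.

```python
import base64


def to_data_uri(data, mime):
    """Encode raw bytes as a base64 data: URI with the given MIME type."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"


def inline_page(image_bytes):
    """Return HTML whose stylesheet and background image are both inlined."""
    # Innermost resource first: the image goes inside the CSS...
    image_uri = to_data_uri(image_bytes, "image/gif")
    css = "body { background: url(" + image_uri + ") }"

    # ...and then the CSS, image included, goes inside the HTML.
    css_uri = to_data_uri(css.encode("utf-8"), "text/css")
    return '<link rel="stylesheet" href="' + css_uri + '">'
```

Each level of nesting base64-encodes the level below it again, so the size
grows by roughly a third per level.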

What place does HTML5 have in specifying one of these options as a standard
archive format?  Any?  A non-normative section on archives?

Received on Tuesday, 17 April 2007 06:39:19 UTC