[whatwg] Web Archives

On 11-Apr-07, at 9:35 PM, Michael A. Puls II wrote:

> On 4/11/07, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:
>> Michael A. Puls II wrote:
>> > It's a really good way to archive, but IE won't handle it and most
>> > plug-ins don't accept data URIs, so there are problems with that
>> > use-case. (unless browsers can help with that in a secure way.)
>> >
>> > I made a suggestion about this on the Opera forums a while ago when
>> > Opera didn't even support .mht.
>> > <http://my.opera.com/community/forums/topic.dml?id=72718>
>> > (The actual working example links are broken, but the idea was..)
>>
>> So because data URIs are unsupported by IE and MHT isn't supported by
>> some others, you propose this new feature which is equally
>> unsupported by all browsers?
>
> If IE supported data URIs, I still don't think HTML + data URIs would
> be the best format for archiving. (Just saying that if IE did, we
> could use HTML + data URIs now for *some* archiving situations, but
> there'd still be problems with plug-ins and data bloat from encoding
> the resources.)
>
> If every browser supported .mht, I still don't think it's the best
> format for archiving.
>
> I suppose if I could (outside of a browser) select index.html and all
> its files, right-click and choose "Generate .mht file from selection"
> and do the opposite of "generate files from .mht file" to get the
> original content back, it might be better and feel more like a zip
> archive. However, even then, I don't think generating a mail message,
> possibly using quoted-printable and a bunch of headers, is the best
> way to archive an HTML page and its content.
>
> What I do think is that the Mozilla Archive Format or something like
> the widget packaging Karl mentioned
> <http://www.w3.org/TR/WAPF-REQ/#requirements_packaging> would be a
> better format for archiving.
>
> So, yes, I'm suggesting something else that may not be supported much
> or at all yet, but not because of the current state of support for
> other formats.

I agree.  The point is to get consistency between vendors, not to
wait until they're already consistent.  If we waited for everyone to
support the same thing, nothing would ever be supported.

Thank you, Karl, for pointing out the Widgets draft.  I can appreciate
the similarities, but I'm not convinced that that implementation  
would work in this case (as it reads today).  While Microsoft  
Sidebar, Google Desktop, Apple Dashboard or any other widget runtime  
environment would recognize the file and try to open it, there is  
nothing in the draft that prompts Internet Explorer, Firefox, Safari
or any other browser to do the same.  For example, Safari can't even  
be used to open a .wdgt file.

For an archived website or an HTML-based document, you would want to
be able to rely on the user having a viewer on their system that
would recognize the file.  In most cases, other than for a widget, that
viewer would be their browser.  Maybe there needs to be a draft for
"web documents" that is virtually identical to the widget draft, but  
aimed at browser runtime environments and with a different file  
extension?  Any thoughts?

- Tyler

Received on Thursday, 12 April 2007 08:28:47 UTC