
Re: Packaging on the Web

From: Alex Russell <slightlyoff@google.com>
Date: Tue, 4 Feb 2014 22:12:18 -0800
Message-ID: <CANr5HFXL_fx+VLbXD_9pT6S_AWcqZvqU=JuuH-cTeaaUeqKHEA@mail.gmail.com>
To: Jeni Tennison <jeni@jenitennison.com>
Cc: "www-tag@w3.org List" <www-tag@w3.org>
On Tue, Feb 4, 2014 at 1:22 AM, Jeni Tennison <jeni@jenitennison.com> wrote:

> Alex,
>
> Yes, it’s not as conceptually elegant a solution as I would like either.
> The more elegant approaches that I looked at had larger drawbacks, as I’ve
> documented. Can you think of another way of approaching this that I’ve
> missed?
>

Yehuda's proposal (changing URL parsing) avoids the issue by conflating
addressing with package location.


> I’m not convinced that having to make a request before making another
> request for a package is a fundamental problem. I know that the point of
> the packages is to minimise the number of requests that are made, but the
> BBC home page, for example, involves 110 requests; doing 2 rather than 110
> seems like a win.
>

Again, please see the discussion of pre-parse scanners. Without some other
signal (say, a globbing pattern on the <link> that indicates which resources
are in the package and which aren't), it's unclear that the proposal will
actually avoid the extra fetches in practice.
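The pre-parse concern can be made concrete with a minimal, illustrative
sketch of such a scanner (not any engine's real implementation):

```javascript
// Illustrative sketch of a pre-parse ("preload") scanner: before the full
// parse, and before any package has been fetched, the engine scans the raw
// stream for src/href attributes and starts those requests immediately.
// Real scanners are far more sophisticated; this only shows why resources
// get requested even if a package would later have supplied them.
function prescan(htmlChunk) {
  const urls = [];
  const re = /\b(?:src|href)\s*=\s*["']([^"']+)["']/g;
  let m;
  while ((m = re.exec(htmlChunk)) !== null) {
    urls.push(m[1]); // in a browser, each URL would be fetched right away
  }
  return urls;
}
```

Because this scan runs before the package response arrives, every discovered
URL is fetched independently unless some extra signal tells the scanner it
will be satisfied by the package.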


> There is also the option for clients that understand packages to stop the
> initial transfer once they see a Link rel=package header or <link
> rel=“package”> tag in the HTML.
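For concreteness, the two signals mentioned here would look roughly like
this (the `/package.pack` name is illustrative):

```
HTTP/1.1 200 OK
Link: </package.pack>; rel=package

<link rel="package" href="/package.pack">
```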
>
> Jeni
> --
> Jeni Tennison
> http://www.jenitennison.com/
>
> ------------------------------------------------------
> From: Alex Russell slightlyoff@google.com
> Reply: Alex Russell slightlyoff@google.com
> Date: 3 February 2014 at 21:46:20
> To: Jeni Tennison jeni@jenitennison.com
> Subject:  Re: Packaging on the Web
>
> >
> > On Mon, Feb 3, 2014 at 1:28 AM, Jeni Tennison
> > wrote:
> >
> > > Alex,
> > >
> > > What stops you from including the HTML, or an entire site, in a
> > > package? I was envisioning that if you went to `http://example.org/`
> > > then the HTML page you downloaded could have a `package` link that
> > > included both that HTML page and all the rest of the HTML on the site
> > > (if you wanted).
> > >
> >
> > That's exactly the chicken/egg scenario. I want the "html I downloaded"
> > to come from the package itself.
> >
> >
> > > With the protocol that I’m suggesting, you do need to get hold of that
> > > initial HTML to work out where the package is in the first place, but I
> > > couldn’t work out an alternative mechanism for that.
> > >
> > > Jeni
> > > --
> > > Jeni Tennison
> > > http://www.jenitennison.com/
> > >
> > > ------------------------------------------------------
> > > From: Alex Russell slightlyoff@google.com
> > > Reply: Alex Russell slightlyoff@google.com
> > > Date: 3 February 2014 at 01:17:17
> > > To: Jeni Tennison jeni@jenitennison.com
> > > Subject: Re: Packaging on the Web
> > >
> > > >
> > > > On Sun, Feb 2, 2014 at 9:36 AM, Jeni Tennison
> > > > wrote:
> > > >
> > > > > Alex,
> > > > >
> > > > > > First, thanks for capturing what seems to be broad consensus
> > > > > > on the packaging format (multi-part mime). Seems great!
> > > > >
> > > > > I tried to capture the rationale for the multipart type for
> > > > > packaging. The one massive disadvantage as far as I’m concerned is
> > > > > the necessity for the boundary parameter in the content type.
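The disadvantage being discussed is that a multipart body cannot be
interpreted without out-of-band data: the boundary must travel in the
Content-Type header. A hypothetical package response (type name and
boundary invented for illustration):

```
Content-Type: multipart/package; boundary="pkg-sep"

--pkg-sep
Content-Location: /index.html
Content-Type: text/html

<h1>Hello</h1>
--pkg-sep--
```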
> > > >
> > > >
> > > > It seems a new content-type is needed for security anyhow, no?
> > > >
> > > >
> > > > > A new type that had the same syntax as a multipart type but had a
> > > > > sniffable boundary (ie started with --boundary) might be better
> > > > > than using a multipart/* content type.
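Under this alternative, the body would begin with its own boundary line, so
a consumer could sniff the boundary from the first bytes and the content
type would need no parameter; roughly (type name and boundary invented):

```
Content-Type: application/package

--pkg-sep
Content-Location: /index.html
Content-Type: text/html

<h1>Hello</h1>
--pkg-sep--
```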
> > > >
> > > >
> > > > ISTM that we have a chance to repair that if we wish. New tools will
> > > > be needed to create packages of this type in any case.
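As a sketch of how simple such tooling could be, a package of the rough
shape under discussion could be assembled like this (the boundary, member
names, and the `buildPackage` helper are all invented for illustration):

```javascript
// Hypothetical sketch: assemble a package in the rough multipart shape
// under discussion. Boundary and member names are invented.
function buildPackage(members, boundary) {
  let out = '';
  for (const m of members) {
    out += `--${boundary}\r\n`;
    out += `Content-Location: ${m.location}\r\n`;
    out += `Content-Type: ${m.type}\r\n\r\n`;
    out += m.body + '\r\n';
  }
  return out + `--${boundary}--\r\n`; // closing delimiter
}
```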
> > > >
> > > >
> > > > > > I'm intrigued by the way you're handling base URL resolution for
> > > > > > relative URLs. Do you imagine that base URL metadata will be
> > > > > > required inside packages? And if you move a package off-origin,
> > > > > > but it is CORS-fetched, does that enable a third-party to "front"
> > > > > > for a second-party origin? How does the serving URL get
> > > > > > matched/merged with the embedded base URL? And if the base URL
> > > > > > metadata isn't required, what happens?
> > > > >
> > > > > Good questions. I wasn’t imagining the base URL would be required
> > > > > inside packages, but would be taken as the location from which the
> > > > > package was fetched.
> > > > >
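Under that reading, resolution would behave as in this sketch (URLs
invented for illustration):

```javascript
// Sketch of the base-URL behaviour described above: a relative reference
// inside a package member resolves against the location the package was
// fetched from. The URLs here are invented for illustration.
const packageUrl = 'https://example.org/site/package.pack';
const resolved = new URL('images/icon.png', packageUrl).href;
// resolved is 'https://example.org/site/images/icon.png'
```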
> > > >
> > > > I see. I think I got confused by the phrase:
> > > >
> > > > Content from the cache will run with a base URL supplied within
> > > > the package.
> > > >
> > > > This, then, would be the location from which the package was fetched?
> > > >
> > > >
> > > > > Since the Content-Location URLs have to be absolute-path-relative
> > > > > or path-relative (ie can’t contain a domain name), you can’t get
> > > > > content from one origin pretending to be from another origin.
> > > > > Obviously that means if you host a package you have to be careful
> > > > > about what it contains, but that’s true of practically any web
> > > > > content.
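Concretely, that constraint would admit or reject Content-Location values
roughly like these (examples invented):

```
Content-Location: /images/icon.png          absolute-path-relative: allowed
Content-Location: images/icon.png           path-relative: allowed
Content-Location: //other.example/x         carries a host: rejected
Content-Location: https://other.example/x   absolute URL: rejected
```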
> > > >
> > > >
> > > > Makes a lot more sense. Thanks!
> > > >
> > > >
> > > > > > I'm curious about the use of fragments. Yehuda covered this
> > > > > > pretty thoroughly in the constraints he previously outlined when
> > > > > > we went over this in Boston:
> > > > > >
> > > > > > https://gist.github.com/wycats/220039304b053b3eedd0
> > > > > >
> > > > > > Fragments aren't sent to the server and so don't have any
> > > > > > meaningful server-based fallback or service-worker polyfill
> > > > > > story. That seems pretty fundamental. Is there something in the
> > > > > > URL format proposal that I'm missing?
> > > > >
> > > > > I’m not sure; it depends what you’re curious about. My assumption
> > > > > is that, for backwards compatibility with clients that don’t
> > > > > understand packages, the files in a package would all be accessible
> > > > > from the server directly as well as through the package. In other
> > > > > words, if a package at `/package.pack` contains `/index.html` and
> > > > > `/images/icon.png` then `/index.html` and `/images/icon.png` will
> > > > > also be available directly on the server.
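That assumption amounts to every member having two routes to the same
bytes, roughly:

```
GET /package.pack      ->  package containing /index.html, /images/icon.png
GET /index.html        ->  the same HTML, served directly
GET /images/icon.png   ->  the same image, served directly
```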
> > > > >
> > > >
> > > > I take it you're trying to avoid a world where I ever write
> > > > something like:
> > > >
> > > >
> > > > And instead would recommend that webdevs write:
> > > >
> > > >
> > > >
> > > > Is that right?
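The archive stripped the markup from the two examples above. From context
they presumably contrasted addressing a resource through the package versus
by its plain URL; both lines below are guessed reconstructions, not the
original text:

```html
<!-- guessed reconstruction: addressing a resource via the package -->
<img src="/package.pack#/images/icon.png">
<!-- guessed reconstruction: the plain URL, with the package as a cache -->
<img src="/images/icon.png">
```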
> > > >
> > > > If so, I think there are interactions with browser optimizations to
> > > > consider. It's common for browser engines to "pre scan" sections of
> > > > document streams before parsing to start requesting the resources
> > > > they contain. This is a big win when parsing might be held up by
Received on Wednesday, 5 February 2014 06:13:15 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 22:57:01 UTC