RE: Updates to File API from Adrian Bateman on 2010-06-22 (public-webapps@w3.org from April to June 2010)

From: Adrian Bateman <adrianba@microsoft.com>
Date: Tue, 22 Jun 2010 15:44:32 +0000
To: Jonas Sicking <jonas@sicking.cc>
CC: "arun@mozilla.com" <arun@mozilla.com>, Jian Li <jianli@chromium.org>, "Web Applications Working Group WG" <public-webapps@w3.org>, public-device-apis <public-device-apis@w3.org>
Message-ID: <104E6B5B6535E849970CDFBB1C5216EB2F26F448@TK5EX14MBXC140.redmond.corp.microsoft.>

On Friday, June 11, 2010 11:18 AM, Jonas Sicking wrote:
> On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking <jonas@sicking.cc> wrote:
> > On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman <adrianba@microsoft.com> wrote:
> >> It's not clear to me the benefit of encoding the origin into the URL. Do
> >> we expect script to parse out the origin and use it? Even in a multi-process
> >> architecture there's presumably some central store of issued URLs which will
> >> need to store origin information as well as other things?
> >
> > The one advantage I can see is that putting the scheme into the URL
> > allows the *implementation* to deduce the origin by simply looking at
> > the URL-scheme. This avoids having to do a (potentially cross-process)
> > lookup to get the origin.
> >
> > This could be useful for APIs which have to synchronously determine
> > the origin of a given URL in order to throw an exception on an
> > attempted cross-origin access. For example an XMLHttpRequest Level 1
> > implementation needs to synchronously determine if it should make a
> > call to .open(...) throw or not based on the origin of the passed in
> > URL.
> >
> > However I'm not sure if this is a problem in practice or not. It's
> > entirely possible that the web platform is littered with situations
> > where you need to do synchronous communication with whichever thread
> > the networking code runs on.
> >
> > Firefox is still in the process of going multi-process, so I'll defer
> > to other browsers with more experience in this area.
> 
> Oh, and I should add that the implementation will of course still have
> to check once a url is loaded that the origin in the url matches the
> origin in whatever map is used to map urls to resources. I.e. if the
> implementation has handed out a url like:
> 
> filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
> 
> and script changes that to:
> 
> filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
> 
> then attempting to load the latter url should result in a 404 or similar.
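
To make the check described above concrete, here is a minimal sketch in
Python. It assumes the URL layout from the example (filedata:<host>/<id>)
and a single in-memory map of issued URLs. The names `issued` and `load`
are hypothetical and not from any draft, and the origin is reduced to a
bare hostname purely for illustration.

```python
from typing import Optional

# Hypothetical central store of issued URLs: identifier -> (origin, resource).
# The origin is just a hostname here, matching the example in the thread.
issued = {
    "3699b4a0-e43e-4cec-b87b-82b6f83dd752": ("sheep.org", b"file bytes"),
}


def load(url: str) -> Optional[bytes]:
    """Return the resource, or None (standing in for a 404) on any mismatch."""
    scheme, _, rest = url.partition(":")
    origin, _, identifier = rest.partition("/")
    entry = issued.get(identifier)
    if scheme != "filedata" or entry is None:
        return None
    stored_origin, resource = entry
    # The origin embedded in the URL must agree with the origin in the map,
    # so a URL rewritten by script to name another origin fails to load.
    if origin != stored_origin:
        return None
    return resource
```

With this sketch the sheep.org URL loads and the wolf.org variant does
not, but the lookup in the map is still needed either way.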

Since the origin comprises the scheme as well as the hostname and port, it
seems like we'll end up with some encoding or parsing complexity by following
this approach. Robin gave good reasons for not allowing user agents to encode
data into the URL, and I suspect that including the origin for this particular
case is a premature optimisation. At what point will we find other data that's
convenient to have encoded in the URL?

I think it makes more sense for the URL to be opaque and let user agents figure
out the optimal way of implementing origin and other checks.
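
As a rough illustration of the opaque alternative, the following Python
sketch keeps the full origin (scheme, host, port) in the user agent's own
store and puts nothing but a random identifier in the URL. The class
`UrlStore`, its methods, and the use of a UUID are assumptions made for
the example, not anything specified in the draft.

```python
import uuid
from typing import Dict, Optional, Tuple

Origin = Tuple[str, str, int]  # (scheme, host, port)


class UrlStore:
    """Hypothetical store of issued URLs; the URL itself carries no data."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Origin, bytes]] = {}

    def issue(self, origin: Origin, resource: bytes) -> str:
        # The URL is an opaque token; the origin lives only in the store.
        url = "filedata:" + str(uuid.uuid4())
        self._entries[url] = (origin, resource)
        return url

    def origin_of(self, url: str) -> Optional[Origin]:
        entry = self._entries.get(url)
        return entry[0] if entry else None

    def load(self, url: str, requester: Origin) -> Optional[bytes]:
        entry = self._entries.get(url)
        # Unknown URL or cross-origin request: fail, as a 404 would.
        if entry is None or entry[0] != requester:
            return None
        return entry[1]
```

Here the origin check is a lookup in the store rather than a parse of the
URL, so there is no encoding to agree on and nothing for script to edit.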

Cheers,

Adrian.

Received on Tuesday, 22 June 2010 15:46:13 UTC