Re: Discussion of Blob URI Scheme for Binary Data Access | IETF

On 5/14/11 4:06 AM, Julian Reschke wrote:
> On 13.05.2011 20:05, Arun Ranganathan wrote:
>> Greetings URI listserv!
>
> Hi there!
>
>> ...
>> Additionally, and most significantly for this listserv and this
>> discussion, the File API introduces a URI scheme for Blob access [4].
>> The URI scheme uses a subset of the HTTP status codes, and is designed
>> to be used wherever http URIs can be used within HTML markup and within
>> APIs in JavaScript (e.g. for "img src =", alongside XMLHttpRequest,
>> etc.). The nascent URL API [5] which coins and revokes blob: URIs is
>> also used with the Stream API [6] for video-conferencing use cases, and
>> thus this scheme is becoming integral to emerging technologies under the
>> broad aegis of HTML.
>> ...
>
> A few comments on the definition in "6.7. A URI for Blob and File 
> reference":
>
> "Whereas the file URI scheme (defined in [RFC1630] and [RFC1738]) 
> allows user agents to surface local file and directory structures, it 
> cannot be used within web applications owing to origin considerations 
> and lack of HTTP [HTTP] semantics."
>
> - Citing RFC 1738 should be sufficient, unless 1630 contains any 
> additional information. Does it?

Hi Julian!  Useful feedback as usual.  A few things:

I find that both RFCs 1738 and 1630 jointly provide the broadest 
"official" RFC-level overview of the file URI scheme, so I think it's 
important enough to cite both.  RFC1738 gives us an overview of the 
scheme; RFC1630 gives us an explanation of the intended use, which is 
important when explaining why blob: addresses different use cases than 
the file: scheme does.
>
> - You *could* define HTTP semantics for file:, so that doesn't seem a 
> compelling reason not to use it. Actually, browsers do so in XHR 
> already, no?
>

Not really :)  Defining HTTP semantics for the file URI scheme would be 
a late-breaking change to a scheme that's already well understood (and 
dusty with age).  Also, an important distinction here is that the 
file:// URI is best used for the underlying file system, and can more or 
less identify directories, files, and resources that have *at least* a 
semi-permanent structure in that they are stored on disk.  But Blob 
resources are typically short-lived, and identify a binary resource *in 
memory* for at most the lifetime of the Document (or less!).  Moreover, a Blob 
resource may at times *only* live in memory, and may not stem from a 
file resource (which could be identified with the "file://" URI 
scheme).  Consider the use of the BlobBuilder API [1], or even the 
Blob.slice API [2].
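
To make the lifetime point concrete, here is a minimal Python sketch of a per-document table that coins and revokes blob: URIs for bytes held only in memory.  It is a toy model of the behavior described above, not the File API or any browser implementation: `BlobRegistry`, `coin`, `revoke`, `resolve` and `unload` are hypothetical names, and using a random UUID as the opaque string is an assumption.

```python
import uuid
from typing import Dict, Optional


class BlobRegistry:
    """Toy model of one document's blob: URI table (hypothetical names)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def coin(self, data: bytes) -> str:
        # The bytes live only in memory; no file on disk is implied.
        uri = "blob:" + str(uuid.uuid4())  # assumption: UUID as opaque string
        self._blobs[uri] = data
        return uri

    def revoke(self, uri: str) -> None:
        # Revoking ends the URI's life before the document goes away.
        self._blobs.pop(uri, None)

    def resolve(self, uri: str) -> Optional[bytes]:
        # Returns None if the URI was revoked or never coined.
        return self._blobs.get(uri)

    def unload(self) -> None:
        # When the document is unloaded, every remaining URI dies with it.
        self._blobs.clear()


registry = BlobRegistry()
payload = bytes(range(8))   # data built in memory, never read from a file
sliced = payload[2:5]       # loosely analogous to taking a slice of a Blob
uri = registry.coin(sliced)
```

Nothing in this model maps the URI to a path on disk, which is the distinction from file: drawn above.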

Browsers behave inconsistently with respect to the file URI scheme and 
XHR, and so it's not really a reliable indicator of what to do next.  
I'd almost say that what browsers do with respect to file URIs and XHR 
is developer-facing programmatic sugar :) There are security 
implications here as well.

> "In general, this scheme is designed to be used wherever URLs can be 
> used on the web."
>
> s/URL/URI/
>
Done.

> "This scheme can allow for fragment identifiers for well-defined 
> format types. For example, if the file selected is an SVG file, this 
> scheme should allow for SVG fragment identifiers. If the file selected 
> is an HTML file, this scheme should allow for fragment identifiers 
> within an HTML document."
>
> This is very misleading. Fragment identifier semantics are a property 
> of the media type, not the URI scheme.
>

OK, I'll work on clarifying language and notify you when I've put it in 
place.


> "Whereas file URIs are subject to strict same origin restrictions for 
> security reasons and allow directory browsing, this scheme is 
> applicable only to user-selected files (and to Blob objects that web 
> applications generate)."
>
> An alternative would have been to relax the same-origin constraints 
> for some file: URIs. Just sayin'.

No, that would be unsafe, and would be a late-breaking change that we 
probably shouldn't undertake.

>
> "A blob: URI consists of the blob: scheme and an opaque string, along 
> with zero or one fragment identifiers."
>
> s/zero or one/an optional/
>
Done.

> "blob = scheme ":" opaqueString [fragIdentifier]"
>
> The fragment identifier should not be part of the scheme definition.
>

OK -- I think your suggestion is to maybe have a separate section that 
discusses fragments?  Is that really necessary?  Is that a convention or 
a stylistic preference?  Since fragments are optional, I've only 
included them for completeness.  What do I gain by a separate section 
that discusses fragments?
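
For what it's worth, the practical difference is small: a consumer can peel the fragment off before looking at the scheme-specific part at all.  A rough Python sketch, assuming the fragment is simply whatever follows the first "#" (`split_blob_uri` and the sample URI are hypothetical, made up for illustration):

```python
from typing import Optional, Tuple


def split_blob_uri(uri: str) -> Tuple[str, Optional[str]]:
    """Return (opaque_string, fragment); fragment is None when absent."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "blob":
        raise ValueError("not a blob: URI")
    # The fragment is separated first; what remains is the opaque string.
    opaque, hash_sep, fragment = rest.partition("#")
    if not opaque:
        raise ValueError("empty opaque string")
    return opaque, (fragment if hash_sep else None)


# Hypothetical example URI with an optional fragment.
example = "blob:550e8400-e29b-41d4-a716-446655440000#section1"
opaque_part, fragment_part = split_blob_uri(example)
```

How the fragment is then interpreted depends on the media type of the resource, not on anything in this function.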

> "; opaqueString could be a UUID in its canonical form
> ; opaqueString tokens MUST be unique"
>
> Unique or globally unique? If the latter, how can you be sure without 
> mandating a specific format?

Globally unique -- and I've already made a change suggesting this in the 
specification.  Implementers pushed back on UUID as an explicit choice, 
strongly urging merely a note that suggests UUID usage.

Left to me, I'd like to mandate UUID :)  But I want this specification 
to be widely adopted, and if implementers have nits about mandating 
UUID, I'll challenge them to come up with something UUID-like in 
characteristics.

> Furthermore, you need to state the allowed character repertoire for 
> opaqueString.
>

Yes -- working on this, thanks.  At a minimum I'll prohibit "#" in 
Unicode and ASCII.  I wonder if others should be prohibited.
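
As a sketch of what "UUID-like in characteristics" plus the "#" restriction could look like, here is a small Python illustration.  The function names are hypothetical, the non-empty check is my own assumption, and a random UUID is unique only with overwhelming probability rather than by guarantee:

```python
import uuid


def new_opaque_string() -> str:
    # Canonical 8-4-4-4-12 hexadecimal form of a random (version 4) UUID.
    return str(uuid.uuid4())


def is_valid_opaque_string(value: str) -> bool:
    # Only the "#" prohibition comes from this thread; rejecting the
    # empty string is an added assumption.
    return bool(value) and "#" not in value


token = new_opaque_string()
blob_uri = "blob:" + token
```

Any further prohibited characters would be extra checks in `is_valid_opaque_string`.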

> ...finally, as this probably will become an FAQ, it would be good to 
> include the rationale for not simply using "urn:uuid:" as syntax.
>

OK, I'll add this.

-- A*
[1] http://www.w3.org/TR/2011/WD-file-writer-api-20110419/#idl-def-BlobBuilder
[2] http://dev.w3.org/2006/webapi/FileAPI/#dfn-slice

Received on Monday, 16 May 2011 17:05:32 UTC