[whatwg] Proposal for improved handling of '#' inside of data URIs

From: Daniel Holbert <dholbert@mozilla.com>
Date: Sat, 10 Sep 2011 18:30:20 -0700
Message-ID: <4E6C0F2C.8090702@mozilla.com>
On 09/10/2011 04:53 PM, Nils Dagsson Moskopp wrote:
 >> Browsers handle the "#" character in data URIs very differently, and
 >> the arguably "correct" behavior is probably not what authors actually
 >> want in many cases.
 > Do you have any evidence for that assertion, e.g. author surveys,
 > occurrence in sites, number of duplicates in mozilla bugzilla (relative
 > to other common bugs)?

No large-scale data like that, just a few anecdotal reports on IRC that 
Firefox was "broken" on particular content containing a "#", whereas 
Chromium was "working". One such report, about a week ago, is what 
prompted this proposal.

There's also the concern that authors can *almost* paste raw HTML/SVG 
into a data URI (see examples below); only the "#" characters break things.

 > This change would probably have to be communicated to other software
 > working with data URIs (Python's urlparse module comes to mind).

Sure, ultimately. One step at a time.
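
To illustrate what generic URL tooling does today, here is a minimal 
sketch using Python 3's urllib.parse (the successor to the urlparse 
module mentioned above). It follows the generic URI syntax, so the first 
"#" starts a fragment and the data URI's payload is cut short. The 
variable names are only for illustration.

```python
from urllib.parse import urlsplit

# Unencoded "#": generic URI parsing treats it as the fragment delimiter.
raw = urlsplit("data:text/html,<html><body style='background: #f00'>")
print(repr(raw.path))      # payload is cut off just before the "#"
print(repr(raw.fragment))  # the remainder ends up in the fragment

# Percent-encoded "#" (%23): the whole payload stays in the path.
encoded = urlsplit('data:text/html,<html><body style="background:%23f00">')
print(repr(encoded.path))
print(repr(encoded.fragment))
```

Any change to how "#" is handled in data URIs would presumably need a 
matching change in libraries like this one.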

 > Do you
 > intend to update the RFC on the point or leave that usage
 > non-conforming?

I'm not sure. Right now this is just a proposal for better 
interoperability, but ultimately, yeah, it'd be great to have this 
specified.

 >> Note that in cases where an author *accidentally* includes "#" inside
 >> their data URI (e.g. <body background="#f00">),
 >
 > What's with the unencoded bracket (should be %3C) and space (should be
 > %20) beforehand? Why wouldn't parsing stop at those points?

Those are fine, actually -- but I should have included an actual URI 
that loads in browsers, like the following (this is what I meant):
   data:text/html,<html><body style='background: #f00'>

So to answer your question -- that does render just fine (as a giant red 
page) in Chromium, without any need to encode the space or the brackets. 
It also renders fine in Opera if you point an <iframe> at it, but not if 
you type it directly into the URLbar. That's the inconsistency on 
Opera's part that I mentioned in my post.

And in Firefox, it renders fine if you just encode the # character:
   data:text/html,<html><body style="background:%23f00">
(that makes it load fine from Opera's URLbar, too.)

So no -- practically at least, there's no need to encode the >/< or the 
space character.
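
For authors who want a URI that loads in all of the browsers discussed 
here, a small helper along these lines would probably do. This is a 
sketch only: html_to_data_uri is a hypothetical name, and escaping "%" 
as well as "#" is an added assumption (so that a literal "%" in the 
markup survives percent-decoding), not something tested above.

```python
from urllib.parse import unquote

def html_to_data_uri(html: str) -> str:
    """Build a text/html data URI, escaping only "%" and "#".

    Hypothetical helper: spaces and angle brackets are left as-is,
    matching the observation that browsers load them unencoded.
    """
    # Escape "%" first so the "%23" inserted next is not double-escaped.
    payload = html.replace("%", "%25").replace("#", "%23")
    return "data:text/html," + payload

uri = html_to_data_uri("<html><body style='background: #f00'>")
print(uri)

# Percent-decoding the payload gives back the original markup.
print(unquote(uri.split(",", 1)[1]))
```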

~Daniel
Received on Saturday, 10 September 2011 18:30:20 UTC
