[whatwg] Specs for window.atob() and window.btoa()

I've written a provisional spec for window.atob() and window.btoa():

http://aryeh.name/spec/base64.html

These functions do base64 encoding and decoding, and are supported by
all browsers except IE.  I also wrote a fairly complete test
suite, at:

http://dvcs.w3.org/hg/html/raw-file/tip/tests/submission/AryehGregor/base64.html
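
For anyone who hasn't used the two functions, here is a minimal
illustration of the round trip.  It assumes an environment that exposes
atob() and btoa() as globals; the variable names are only for
illustration.

```javascript
// btoa() takes a string of bytes and returns its base64 encoding;
// atob() reverses it.
const encoded = btoa("a");      // "YQ==", the value quoted later in this message
const decoded = atob(encoded);  // back to the original string
console.log(encoded, decoded);
```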

Suggestions for improvements to either the spec or tests are
appreciated.  Some notes:

Browsers disagree about how to handle input to atob() that can't be
produced by btoa().  Firefox mostly throws exceptions, WebKit is
slightly more lenient, and Opera doesn't throw exceptions at all.
gsnedders told me Opera's behavior caused site compat problems, so I
went with Firefox's behavior, or about as close to Firefox's behavior
as I could determine without reading the source code.  I recorded this
line of reasoning in a source code comment for posterity.

As far as I can tell by writing tests, there are only two cases where
Firefox's atob() doesn't throw an exception on input that can't have
come from btoa().  First, if there are trailing bits after decoding
that aren't all 0, Firefox discards them.  So for instance, atob("YQ")
and atob("YR") both return "a".  Second, it doesn't require trailing =
signs, so atob("YQ") works although btoa("a") is actually "YQ==".  The
latter seems reasonable to keep, but if we dropped the former, I could
get rid of the explicit algorithm and defer to the RFC for decoding
too, making the whole spec much simpler.  In that case, the invariant
btoa(atob(s)) == s would always hold after padding s with an = or two
if necessary.  Determining whether a particular input should throw an
exception would then be easy, but checking correctness by source-code
inspection would be harder.  How do implementers feel about this?
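
To make the trailing-bits case concrete, here is a rough sketch of a
decoder that reports whatever bits are left over after the last full
byte.  It is not the algorithm from the spec and not any browser's
implementation; decodeWithLeftover and ALPHABET are hypothetical names,
and the sketch ignores details such as whitespace handling.

```javascript
// Hypothetical helper, not taken from the spec: decodes base64 with or
// without trailing = signs and reports the bits that remain after the
// last complete byte.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function decodeWithLeftover(input) {
  const s = input.replace(/={1,2}$/, ""); // trailing = signs are optional
  if (s.length % 4 === 1) {
    // A single leftover character carries only 6 bits, never a full byte.
    throw new Error("invalid length");
  }
  let buffer = 0;   // bits read but not yet emitted as a byte
  let bitCount = 0; // number of bits currently held in buffer
  let text = "";
  for (const ch of s) {
    const value = ALPHABET.indexOf(ch);
    if (value === -1) throw new Error("invalid character: " + ch);
    buffer = (buffer << 6) | value;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      text += String.fromCharCode((buffer >> bitCount) & 0xff);
      buffer &= (1 << bitCount) - 1; // keep only the unused low bits
    }
  }
  return { text, leftover: buffer, leftoverBits: bitCount };
}

console.log(decodeWithLeftover("YQ")); // text "a", leftover 0
console.log(decodeWithLeftover("YR")); // text "a", leftover 1
```

Both inputs decode to "a"; they differ only in the four leftover bits.
A stricter rule would throw whenever leftover is nonzero, which is the
condition under which btoa(atob(s)) would not give back the padded s.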

I used a regex to decide whether browsers should throw an exception
for atob() because it's simpler to spec and test, but it's probably
more annoying to implement.  If implementers would prefer I instead
say at which points in the algorithm to throw an exception, let me
know.  (This is moot if the algorithm can be thrown out entirely.)
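
For illustration, a regex-based check along these lines might look like
the sketch below.  The pattern is my own approximation of the behavior
described in this message (optional trailing = signs, trailing bits not
checked); it is not the regex from the provisional spec, and ACCEPTABLE
and wouldThrow are hypothetical names.

```javascript
// Hypothetical pattern, NOT the regex from the provisional spec.
const B64 = "[A-Za-z0-9+/]";
const ACCEPTABLE = new RegExp(
  "^(?:" + B64 + "{4})*" +      // any number of full 4-character groups
  "(?:" + B64 + "{2}(?:==)?" +  // final group of 2, padding optional
  "|" + B64 + "{3}=?)?$"        // or final group of 3, padding optional
);

// Returns true when the input would be rejected before any decoding.
function wouldThrow(input) {
  return !ACCEPTABLE.test(input);
}

console.log(wouldThrow("YQ"));   // false: padding may be omitted
console.log(wouldThrow("YQ==")); // false
console.log(wouldThrow("YQ="));  // true: wrong amount of padding
```

The appeal of this shape is that the accept/reject decision is a single
test up front, so each test case only needs to match one pattern.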

Received on Thursday, 6 January 2011 12:25:11 UTC