Re: data URIs - filename and content-disposition

On 26.02.2010 15:33, Michael A. Puls II wrote:
> ...
>>>> So I have a slight preference to keep things simple, and to focus on
>>>> the specific use case.
>>>
>>> Well, I'm personally happy with just:
>>>
>>> data:text/plain;charset=utf-8;content-disposition=attachment;filename=name,
>>>
>>>
>>> (that could even be shortened to just disposition=attachment)
>>>
>>> I just suggested the more flexible way as I figured that's what most
>>> people would want.
>>>
>>> Now, if we do it just the simple way, how should the filename value be
>>> encoded? Just percent-encoded UTF-8? That'd be fine by me because I
>>> could just use encodeURIComponent() to produce the value.
>>
>> We'll need to define which characters need to be percent-escaped,
>> though. Obviously all non-URI characters, but also those needed to
>> parse the parameters, so minimally ";".
>
> Well, encodeURIComponent basically percent-encodes anything not in
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.!~*'()"
>
> , which is great to me. That covers encoding ; too.
> ...

Let's see. The data URI scheme RFC (RFC 2397) uses token/attribute/value
from RFC 2045, which defines
(<http://greenbytes.de/tech/webdav/rfc2045.html#rfc.section.5.1>):

      value := token / quoted-string

      token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
                  or tspecials>

      tspecials :=  "(" / ")" / "<" / ">" / "@" /
                    "," / ";" / ":" / "\" / <">
                    "/" / "[" / "]" / "?" / "="
                    ; Must be in quoted-string,
                    ; to use within parameter values

so any new data URI parameter should accept both tokens and quoted strings.
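
As a rough illustration of the token production quoted above, here is a
minimal sketch of a check for whether a candidate value can be sent as a
bare token. The names TSPECIALS and isToken are made up for this example
and come from neither RFC; it only covers the token side and does not
attempt quoted-string handling.

```javascript
// tspecials as listed in RFC 2045, section 5.1.
const TSPECIALS = '()<>@,;:\\"/[]?=';

// Hypothetical helper: true if s matches
//   token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials>
function isToken(s) {
  if (s.length === 0) return false; // 1* means at least one character
  for (const ch of s) {
    const code = ch.codePointAt(0);
    // Reject CTLs (0-31), SPACE (32), DEL (127) and anything non-ASCII.
    if (code <= 0x20 || code >= 0x7f) return false;
    if (TSPECIALS.includes(ch)) return false;
  }
  return true;
}

console.log(isToken('name'));          // true
console.log(isToken('a;b'));           // false, ";" is a tspecial
console.log(isToken('report(1).txt')); // false, "(" and ")" are tspecials
```

Anything for which this returns false would need some other representation
(quoted-string or escaping), which is exactly the part that still has to be
specified.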

Also, tspecials includes "(" and ")", which encodeURIComponent leaves unescaped.
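
A quick way to check which tspecials encodeURIComponent leaves alone is to
run each one through it and keep those that come back unchanged. This is
only a sketch; TSPECIALS and unescapedTspecials are names invented for the
example.

```javascript
// tspecials as listed in RFC 2045, section 5.1.
const TSPECIALS = ['(', ')', '<', '>', '@', ',', ';', ':', '\\', '"',
                   '/', '[', ']', '?', '='];

// Keep the characters that encodeURIComponent returns unchanged.
const unescapedTspecials = TSPECIALS.filter(
  (c) => encodeURIComponent(c) === c
);

console.log(unescapedTspecials); // [ '(', ')' ]
```

So a filename produced with encodeURIComponent alone could still contain
parentheses, which are not allowed in a bare token.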

The devil is in the details. This will need examples and test cases.

Best regards, Julian

Received on Friday, 26 February 2010 14:52:53 UTC