draft-duerst-mailto-bis-05: Detailed review of '@' and '+'

In an http URI, if I want to put '@' and '+' in an hfvalue, I need to percent-encode them like this:
http://example.com/test.php?hfname=%40%2B

Now, in a mailto URI, which is of the form:

mailto:to_hfvalue?hfname=hfvalue&hfname=hfvalue

I would expect to do the same, for example:

mailto:%40%2B?hfname=%40%2B

However, <http://tools.ietf.org/html/draft-duerst-mailto-bis-05> greatly confuses me as to whether '@' and '+' need to be percent-encoded in hfvalues or not. It says that '@' (and, indirectly, '+') does not need to be percent-encoded in an addr-spec. However, it doesn't say for sure what happens to '@' and '+' in an addr-spec once you put the addr-spec in an hfvalue. It defers to the URI spec, which, as I read it, says that '@' and '+' must be percent-encoded to %40 and %2B. But that seems to contradict the mailto URI examples in the mailto spec, which leave '@' and '+' unencoded.

However, my interpretation is that, before you put an addr-spec in an hfvalue, you have to percent-encode (like ECMAScript's encodeURIComponent) the whole addr-spec first. So, for example, if I wanted the string "1+2@example.com" to end up in the To, Cc, Bcc, Body and Subject fields, I'd produce:

mailto:1%2B2%40example.com?subject=1%2B2%40example.com&body=1%2B2%40example.com&cc=1%2B2%40example.com&bcc=1%2B2%40example.com

or

mailto:?to=1%2B2%40example.com&subject=1%2B2%40example.com&body=1%2B2%40example.com&cc=1%2B2%40example.com&bcc=1%2B2%40example.com

(Given that the spec says mailto:to_hfvalue and mailto:?to=hfvalue are equivalent)

Then, the client would percent-decode (like ECMAScript's decodeURIComponent) each of the hfvalues to "1+2@example.com" and put it in each of the compose/header fields.
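To make that interpretation concrete, here is a minimal sketch of the round trip. It only illustrates my reading, not what the spec mandates, and buildMailto and parseMailto are hypothetical helper names I made up for this example:

```javascript
// Hypothetical helper: builds a mailto URI in which [to], every hfname and
// every hfvalue is fully percent-encoded with encodeURIComponent.
function buildMailto(to, fields) {
    var query = Object.keys(fields).map(function (name) {
        return encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]);
    }).join("&");
    return "mailto:" + encodeURIComponent(to) + (query ? "?" + query : "");
}

// Hypothetical helper: what a client would do under this interpretation,
// i.e. split on '?', '&' and '=' first, then percent-decode each piece.
function parseMailto(uri) {
    var rest = uri.slice("mailto:".length);
    var q = rest.indexOf("?");
    var result = {
        to: decodeURIComponent(q === -1 ? rest : rest.slice(0, q)),
        fields: {}
    };
    if (q !== -1) {
        rest.slice(q + 1).split("&").forEach(function (pair) {
            if (!pair) return;
            var eq = pair.indexOf("=");
            var name = eq === -1 ? pair : pair.slice(0, eq);
            var value = eq === -1 ? "" : pair.slice(eq + 1);
            result.fields[decodeURIComponent(name)] = decodeURIComponent(value);
        });
    }
    return result;
}

var uri = buildMailto("1+2@example.com", {
    cc: "1+2@example.com",
    subject: "1+2@example.com"
});
var parsed = parseMailto(uri);
console.log(uri);
console.log(parsed.to, parsed.fields.cc, parsed.fields.subject);
```

Under this reading, the original string survives the round trip unchanged in every field.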

However, the examples in the spec seem to keep the '@' in the addr-specs in raw form even after putting them in a URI like:

mailto:1+2@example.com?cc=1+2@example.com

That doesn't seem correct. It does kind of seem correct for the part between 'mailto:' and '?', given that the ABNF says [ addr-spec *("%2C" addr-spec ) ] instead of to_hfvalue or something (and that it's before '?' which would make it like the raw '@' in the username:password part in an http URI). But, it doesn't seem correct at all for the cc hfvalue.

But, if the part between 'mailto:' and '?' is not an hfvalue at all and is just a raw, comma-separated addr-spec list where certain characters like ',' and ' ' need to be percent-encoded, then I guess that would explain things partially. However, if that's true, then it would mean that mailto:value is NOT equivalent to 'mailto:?to=value' in the sense of how they are percent-encoded, which would contradict the spec saying that they are equivalent.

This then would have me believe that
mailto:1+2@example.com?cc=1%2B2%40example.com

would be correct as only the second value is really an hfvalue.

However, it has been suggested to me that '@' and '+' do not need to be percent-encoded in [to], or in any hfvalue that is known to hold addr-specs, but that they do need to be percent-encoded in all other hfvalues. This doesn't seem correct either: for an arbitrary hfname, as in "mailto:?zipzambam=", how would you know whether the hfvalue holds addr-specs, and therefore whether it was properly encoded?
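For what it's worth, here is a small sketch of why the choice matters on the decoding side. The helper names are hypothetical, and the "form-style" decoder is only my assumption about how some consumers might behave (treating '+' as a space, as HTML form decoding does), not something the mailto spec describes:

```javascript
// Hypothetical helper: extract the cc hfvalue and percent-decode it.
// decodeURIComponent leaves raw '+' and '@' alone.
function decodeCc(uri) {
    var match = /[?&]cc=([^&]*)/.exec(uri);
    return match ? decodeURIComponent(match[1]) : null;
}

// Hypothetical helper: same, but first turns '+' into a space, the way an
// HTML-form-style (application/x-www-form-urlencoded) decoder would.
function decodeCcFormStyle(uri) {
    var match = /[?&]cc=([^&]*)/.exec(uri);
    return match ? decodeURIComponent(match[1].replace(/\+/g, " ")) : null;
}

var rawUri = "mailto:1+2@example.com?cc=1+2@example.com";
var encodedUri = "mailto:1+2@example.com?cc=1%2B2%40example.com";

var fromRaw = decodeCc(rawUri);                        // "1+2@example.com"
var fromEncoded = decodeCc(encodedUri);                // "1+2@example.com"
var formFromRaw = decodeCcFormStyle(rawUri);           // "1 2@example.com"
var formFromEncoded = decodeCcFormStyle(encodedUri);   // "1+2@example.com"

console.log(fromRaw, fromEncoded, formFromRaw, formFromEncoded);
```

A plain percent-decoder gets the same result from both forms, but only the percent-encoded form survives a decoder that treats '+' as a space.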

It seems to me that:

mailto:to_hfvalue?subject=hfvalue&body=hfvalue&cc=hfvalue&bcc=hfvalue
and
mailto:?to=hfvalue&subject=hfvalue&body=hfvalue&cc=hfvalue&bcc=hfvalue

should be equivalent to:

http://example.com/compose.php?to=hfvalue&subject=hfvalue&body=hfvalue&cc=hfvalue&bcc=hfvalue

for how the values are percent-encoded.

Or, is the spec just saying that in a mailto URI, '@' and '+' are not reserved for anything, so they can|should|must appear in raw form throughout the mailto URI whether they're in [to] or an hfname or hfvalue?

Yet, I'm just not sure what the spec is saying. Could *many* people clear this up? And, before you answer, please ask yourself if you're sure (given the text in the spec).

In addition, I'll ask the question in a different way. Given the following 2 functions, what does the spec say?

Does it say:

Use the first function for producing the [to] value and the second function for producing hfnames and hfvalues?

Or, does it say to use the second function to produce all values?

Or, does it say to use the first function to produce all values?

Or, does it say to use the first function for all values that are known to contain addr-specs and the second function for the rest?

<script>
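// First function: percent-encodes like encodeURIComponent, but leaves
// '@' and '+' in raw form.
function encodeToComponent(to) {
    try {
        // Normalize line breaks to CRLF before encoding, then restore '@' and '+'.
        return encodeURIComponent(to.replace(/\r\n|\r|\n/g, "\r\n")).replace(/%40|%2B/gi, function (match) {
            return match === "%40" ? "@" : "+";
        });
    } catch (e) {
        // encodeURIComponent throws on lone surrogates.
        return "percent-encode%20error";
    }
}

// Second function: percent-encodes everything encodeURIComponent does,
// including '@' and '+'.
function encodeHfnameOrHfvalueComponent(value) {
    try {
        // Normalize line breaks to CRLF before encoding.
        return encodeURIComponent(value.replace(/\r\n|\r|\n/g, "\r\n"));
    } catch (e) {
        return "percent-encode%20error";
    }
}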
var test = "1+2@example.com";
alert(encodeToComponent(test));
alert(encodeHfnameOrHfvalueComponent(test));
</script>

Thanks

-- 
Michael

Received on Wednesday, 11 March 2009 02:42:44 UTC