Re: I-D Action: draft-ietf-httpbis-cookie-prefixes-00.txt

On 7 July 2016 at 14:11, Constantine A. Murenin <cnst@netbsd.org> wrote:

> I would like to submit an objection to
> "draft-ietf-httpbis-cookie-prefixes-00.txt".
>
>
> http://tools.ietf.org/html/draft-ietf-httpbis-cookie-prefixes-00
>
>>    3.  Prefixes
>>      3.1.  The "__Secure-" prefix
>>      3.2.  The "__Host-" prefix
>
>
> The `-` U+002D "HYPHEN-MINUS" character, commonly referred to as "dash",
> is not a valid identifier in most languages, whereas an underscore is.
> <snip>
>
>
In my experience, at least in general-purpose languages, it's a bad idea
to use variable data (especially user-supplied variable data, e.g. cookie
names) as language identifiers (variable names, etc.).

For example, in my C/Java/Ruby/PHP/Perl code I would either keep all the
cookies in some sort of dictionary structure, and refer to them by a
stringy key; or if I was particularly interested in the value of a
particular cookie (e.g. "__Secure-token") I might create a special variable
to hold said value (e.g. `token_cookie`).
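
As a rough illustration of that dictionary approach, here is a Python
sketch. The helper name `parse_cookie_header` and the sample header are
made up for this example; it is not a complete Cookie parser.

```python
def parse_cookie_header(header):
    """Split a raw Cookie header value into a {name: value} dict.

    Hypothetical helper for illustration only: it does not handle
    quoted values or malformed input.
    """
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Invented sample header; the cookie names are ordinary string keys,
# so the hyphen needs no special treatment.
header = "__Secure-token=abc123; theme=dark; __Host-sid=xyz"
cookies = parse_cookie_header(header)

# A dedicated variable for the one cookie we care about.
token_cookie = cookies.get("__Secure-token")
```

The cookie name only ever appears as a string, never as an identifier,
so its spelling doesn't constrain the variable name.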

A domain-specific language, for example some pseudo-script I use to
configure my web server, is a different matter. If that DSL is supposed
to let you treat cookie names as identifiers, but doesn't let you treat
*all possible* cookie names as identifiers, then that's a limitation of
the DSL, not of the specification that defines those possible cookie names.

I don't think "nginx users can't use hyphens" is a very strong argument
against standardising cookie names with hyphens -- names which are
perfectly legal and may even be in use right now. Maybe nginx.conf could
work out how to quote variable names (${cookie___Host-foo} for example),
if you're desperate to use user-supplied data as variable names.
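
For comparison, pulling such a cookie straight out of the raw Cookie
header with a regex (roughly the kind of thing a `map` on `$http_cookie`
does) isn't bothered by the hyphen either. A Python sketch, where
`cookie_from_header` is a hypothetical helper and the header values are
invented:

```python
import re


def cookie_from_header(header, name):
    """Return the value of cookie `name` from a raw Cookie header, or None.

    Hypothetical helper for illustration only. The name is escaped, so
    any character that is legal in a cookie name is matched literally.
    """
    pattern = r"(?:^|;\s*)" + re.escape(name) + r"=([^;]*)"
    match = re.search(pattern, header)
    return match.group(1) if match else None


host_foo = cookie_from_header("a=1; __Host-foo=bar; b=2", "__Host-foo")
```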



>
> I did see your arguments that Cookie names are allowed to have even more
> "exotic" characters than a mere dash, however, what I haven't seen is any
> good argument of why going against the flow in this very instance is of any
> benefit to anyone.
>
> There are already people trying to figure out how to use this spec with,
> for example, nginx, and they're falling short, because current versions of
> nginx do not appear to support such syntax for cookies (unless you go
> looking into `$http_cookie` directly with `map` and regex, which is
> entirely doable, but will definitely slow you down).  I'm pretty sure
> nginx.conf is not the only language where this decision will cause these
> little issues and slowdowns for absolutely no good reason nor benefit.
>
>
I don't have a problem interacting with these cookies in my Apache
instances, or my Ruby servers, or my Perl sites, or any of our Tomcat
sites. Particularly not in Tomcat, since that's all configured in XML, and
hyphens are waaaay more common than underscores in XML.



> I opened up Cookie Manager of my favourite browser.  I briefly looked at
> some of the Cookie Names used.  I found that, subjectively, underscores are
> used by about 90% of users, whereas dashes appear in at most 5%, if not
> much less.
>
> There appears to be little reason to not name `__Secure-` and `__Host-` as
> `__Secure_` and `__Host_`, respectively, which would avoid this issue
> entirely.
>
>
Because, in your subjective sample, 95% or more of users will never ever
get a collision, and it's quite likely none of the remaining 5% or fewer
will either.


> Who else is likely to use the names starting with double underscores?
> Users that are trying to define their own unique namespace to not clash
> with the namespace of the rest of the site.
>
>
Sure, and how many of them are also using hyphens?



> Then, what we have to ask ourselves is this -- should 99,9999% of people
> be inconvenienced with non-identifier characters because some people may
> have thought that "__Host_" or "__Secure_" was a good non-RFC namespace for
> their non-RFC use, or should they be the ones that face breakage for the
> benefit of 99,9999% of the internet not having to figure out how to access
> these cookies with non-identifier characters through their language of
> choice?
>
>
I suggest you may be overestimating just how many people are inconvenienced
here. I read it as "nginx users who want to use $cookie_", which is a much
smaller number. I saw the Server Fault question [1]; it still seems like a
pretty isolated issue to me.

Cheers

[1]: http://serverfault.com/questions/788207/how-to-use-a-hypen-in-cookie-names-for-nginx
-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

On 7 July 2016 at 14:11, Constantine A. Murenin <cnst@netbsd.org> wrote:

> I would like to submit an objection to
> "draft-ietf-httpbis-cookie-prefixes-00.txt".
>
>
> http://tools.ietf.org/html/draft-ietf-httpbis-cookie-prefixes-00
>
>>    3.  Prefixes
>>      3.1.  The "__Secure-" prefix
>>      3.2.  The "__Host-" prefix
>
>
> The `-` U+002D "HYPHEN-MINUS" character, commonly referred to as "dash",
> is not a valid identifier in most languages, whereas an underscore is.
>
>
> See "word", [[:<:]] and [[:>:]] in re_format(7):
>
>                         http://mdoc.su/n/re_format.7
>
>>      characters which is neither preceded nor followed by word characters.  A
>>      word character is an alnum character (as defined by ctype(3)) or an
>>      underscore.  This is an extension, compatible with but not specified by
>
>
> See "word", \w, \W, \b, \B etc in pcrepattern(3):
>
>                         http://mdoc.su/n/pcrepattern.3
>
>>        A "word" character is an underscore or any character that is a letter
>>        or digit.  By default, the definition of letters and digits is con-
>
>
> I did see your arguments that Cookie names are allowed to have even more
> "exotic" characters than a mere dash, however, what I haven't seen is any
> good argument of why going against the flow in this very instance is of any
> benefit to anyone.
>
> There are already people trying to figure out how to use this spec with,
> for example, nginx, and they're falling short, because current versions of
> nginx do not appear to support such syntax for cookies (unless you go
> looking into `$http_cookie` directly with `map` and regex, which is
> entirely doable, but will definitely slow you down).  I'm pretty sure
> nginx.conf is not the only language where this decision will cause these
> little issues and slowdowns for absolutely no good reason nor benefit.
>
> I opened up Cookie Manager of my favourite browser.  I briefly looked at
> some of the Cookie Names used.  I found that, subjectively, underscores are
> used by about 90% of users, whereas dashes appear in at most 5%, if not
> much less.
>
> There appears to be little reason to not name `__Secure-` and `__Host-` as
> `__Secure_` and `__Host_`, respectively, which would avoid this issue
> entirely.
>
> Who else is likely to use the names starting with double underscores?
> Users that are trying to define their own unique namespace to not clash
> with the namespace of the rest of the site.
>
> Then, what we have to ask ourselves is this -- should 99,9999% of people
> be inconvenienced with non-identifier characters because some people may
> have thought that "__Host_" or "__Secure_" was a good non-RFC namespace for
> their non-RFC use, or should they be the ones that face breakage for the
> benefit of 99,9999% of the internet not having to figure out how to access
> these cookies with non-identifier characters through their language of
> choice?
>
> Cheers,
> Constantine.
>
> http://Constantine.SU/
> http://www.netbsd.org/people/developers.html#cnst
>
>



Received on Thursday, 7 July 2016 05:14:59 UTC