- From: Matthew Kerwin <matthew@kerwin.net.au>
- Date: Thu, 7 Jul 2016 15:14:10 +1000
- To: "Constantine A. Murenin" <cnst@netbsd.org>
- Cc: internet-drafts@ietf.org, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <CACweHNDPZhrvE+uNBYenVBbF78U46LyXt0aqmYFSs=dn7dJuzA@mail.gmail.com>
On 7 July 2016 at 14:11, Constantine A. Murenin <cnst@netbsd.org> wrote:

> I would like to submit an objection to
> "draft-ietf-httpbis-cookie-prefixes-00.txt".
>
> http://tools.ietf.org/html/draft-ietf-httpbis-cookie-prefixes-00
>
>> 3.  Prefixes
>>   3.1.  The "__Secure-" prefix
>>   3.2.  The "__Host-" prefix
>
> The `-` U+002D "HYPHEN-MINUS" character, commonly referred to as "dash",
> is not a valid identifier in most languages, whereas an underscore is.
>
> <snip>

In my experience, at least in general-purpose languages, it's a bad idea to
use variable data (especially user-supplied variable data, e.g. cookie
names) as language identifiers (variable names, etc.). For example, in my
C/Java/Ruby/PHP/Perl code I would either keep all the cookies in some sort
of dictionary structure and refer to them by a stringy key; or, if I was
particularly interested in the value of a particular cookie (e.g.
"__Secure-token"), I might create a special variable to hold said value
(e.g. `token_cookie`).

A domain-specific language is a different matter, for example some
pseudo-script I use to configure my web server. If that DSL is supposed to
let you treat cookie names as identifiers, but doesn't let you treat *all
possible* cookie names as identifiers, then that's a limitation of the DSL,
not of the specification that defines those possible cookie names.

I don't think "nginx users can't use hyphens" is a very strong argument
against standardising cookie names with hyphens -- names which are perfectly
legal and may even be in use right now.

Maybe nginx.conf could work out how to quote variable names
(${cookie___Host-foo} for example), if you're desperate to use user-supplied
data as variable names.
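To illustrate the dictionary approach described above, here is a minimal sketch. It assumes Python and its standard `http.cookies` module; the helper name `parse_cookie_header` and the cookie names and values are made up for illustration and are not taken from the draft.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header: str) -> dict:
    """Parse a raw Cookie header into a plain dict keyed by cookie name."""
    jar = SimpleCookie()
    jar.load(header)
    # Each entry is a Morsel; keep only its decoded value.
    return {name: morsel.value for name, morsel in jar.items()}


# Hypothetical header: the names are strings, so the hyphen is no obstacle.
cookies = parse_cookie_header(
    "__Secure-token=abc123; __Host-session=xyz; plain_name=1"
)

# A dedicated variable for the one cookie we care about.
token_cookie = cookies.get("__Secure-token")
```

Because the cookie name only ever appears as a string key, it makes no difference whether the prefix ends in a hyphen or an underscore.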
>
> I did see your arguments that Cookie names are allowed to have even more
> "exotic" characters than a mere dash, however, what I haven't seen is any
> good argument of why going against the flow in this very instance is of any
> benefit to anyone.
>
> There are already people trying to figure out how to use this spec with,
> for example, nginx, and they're falling short, because current versions of
> nginx do not appear to support such syntax for cookies (unless you go
> looking into `$http_cookie` directly with `map` and regex, which is
> entirely doable, but will definitely slow you down). I'm pretty sure
> nginx.conf is not the only language where this decision will cause these
> little issues and slowdowns for absolutely no good reason nor benefit.

I don't have a problem interacting with these cookies in my Apache
instances, or my Ruby servers, or my Perl sites, or any of our Tomcat sites.
Particularly not in Tomcat, since that's all configured in XML, and hyphens
are waaaay more common than underscores in XML.

> I opened up Cookie Manager of my favourite browser. I briefly looked at
> some of the Cookie Names used. I found that, subjectively, underscores are
> used by about 90% of users, whereas dashes appear in at most 5%, if not
> much less.
>
> There appears to be little reason to not name `__Secure-` and `__Host-` as
> `__Secure_` and `__Host_`, respectively, which would avoid this issue
> entirely.

Because, in your subjective sample, 95% or more of users will never ever get
a collision, and it's quite likely none of the remaining 5% or fewer will
either.

> Who else is likely to use the names starting with double underscores?
> Users that are trying to define their own unique namespace to not clash
> with the namespace of the rest of the site.

Sure, and how many of them are also using hyphens?
>
> Then, what we have to ask ourselves is this -- should 99,9999% of people
> be inconvenienced with non-identifier characters because some people may
> have thought that "__Host_" or "__Secure_" was a good non-RFC namespace for
> their non-RFC use, or should they be the ones that face breakage for the
> benefit of 99,9999% of the internet not having to figure out how to access
> these cookies with non-identifier characters through their language of
> choice?

I suggest you may be overestimating just how many people are inconvenienced
here. I read it as "nginx users who want to use $cookie_", which is a much
smaller number. I saw the Server Fault question [1]; it still seems like a
pretty isolated issue to me.

Cheers

[1]: http://serverfault.com/questions/788207/how-to-use-a-hypen-in-cookie-names-for-nginx

--
Matthew Kerwin
http://matthew.kerwin.net.au/
Received on Thursday, 7 July 2016 05:14:59 UTC