Has anyone plotted a similar graph for the word "development"? I wonder if there are parallels… and I wonder if we will ever come up with alternative, and hopefully more precise, words to replace "sustainability." In my field of international health and health education, the definition of sustainable is now so wide it seems to mean little more than something we hope won't fail in the near future.

[(myl) Oops; I see. Good point.]

[(myl) No: rather, the prediction is that 100% of our words will be *sustainable*.]

I tried to keep both Stein's Law and Davies's Corollaries in mind during the Housing Bubble. And, sure enough, Corollary #2 proved true.

[(myl) In the real world, yes. In the xkcd cartoon world, the function is shown as a positive-slope straight line on a scale of log proportions, corresponding to an exponential increase. If it were a sigmoid, a log-scale plot (like a linear-scale plot) would approach an upper asymptote without ever reaching it; the only difference between this part of the two plots would be the apparent rate of approach.

The joke in the xkcd plot is to extrapolate the recent *sustainable* trend linearly in terms of lexical probabilities. Whether this is done on a log scale or on a linear scale, it results in a line that eventually passes through 1 (= 100%) and goes beyond into the range labelled with question-marks in the xkcd plot, where (say) 150% of all words are *sustainable*. A straight line on a scale of log proportions accomplishes this absurdity via an exponential increase in proportions; a straight line on a scale of plain proportions does it more gradually; but in both cases, the end result is nonsensical and thus funny.
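To make the comparison concrete, here is a rough sketch of the two extrapolations. The two data points are invented for illustration (they are not read off the xkcd plot), and the function names are hypothetical; each function just draws a straight line through two observations, on a plain scale or a log scale, and reports when that line reaches a proportion of 1.

```python
import math

def year_reaching_one_linear(t0, p0, t1, p1):
    """Year at which a straight line through (t0, p0) and (t1, p1),
    drawn on a plain proportion scale, reaches p = 1."""
    slope = (p1 - p0) / (t1 - t0)
    return t1 + (1.0 - p1) / slope

def year_reaching_one_log(t0, p0, t1, p1):
    """Year at which a straight line through the same two points,
    drawn on a log-proportion scale, reaches log(p) = 0, i.e. p = 1."""
    rate = (math.log(p1) - math.log(p0)) / (t1 - t0)
    return t1 + (0.0 - math.log(p1)) / rate

# Invented figures: the proportion doubles from 0.0001 to 0.0002 in ten years.
linear_year = year_reaching_one_linear(2000, 0.0001, 2010, 0.0002)
log_year = year_reaching_one_log(2000, 0.0001, 2010, 0.0002)

print("linear-scale line reaches 100% around year", round(linear_year))
print("log-scale line reaches 100% around year", round(log_year))
```

With these made-up numbers the log-scale line gets to 100% far sooner than the plain-scale line, and neither one stops there, which is the point of the joke.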

In fact, the history of variationist sociolinguistics involved an early period when Bill Labov and others used linear regression on expected proportions or probabilities of occurrence, as in his paper "Contraction, deletion and inherent variability of the English copula", *Language* 1969. Cedergren and Sankoff ("Variable Rules: Performance as a Statistical Reflection of Competence", *Language* 1974) supported the idea of using multiple regression to model the effects of various factors on variation, but pointed out that an additive model on probabilities can yield nonsensical results, predicting that a given outcome will occur less than 0% or more than 100% of the time. They explained that an appropriate way to deal with this is to use a model in which the terms associated with different factors interact as in the class of models known as *logistic regression*. This involves transforming p (for proportion or probability) into log(p/(1-p)), and then fitting an additive model in the transformed space. For some discussion, see these lecture notes.
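As a minimal sketch of why the transformation helps, the following fits a straight line to some invented proportions twice: once directly on p, and once on log(p/(1-p)), mapping the prediction back afterwards. The data, the single numeric factor, and the helper names are all assumptions for illustration; and ordinary least squares on the transformed proportions is a simplification, not the estimation procedure used in the papers cited above.

```python
import math

def logit(p):
    """Transform a proportion p in (0, 1) to log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a value on the logit scale back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Invented data: observed proportion of some variant at five factor levels.
xs = [0, 1, 2, 3, 4]
ps = [0.10, 0.25, 0.50, 0.75, 0.90]

a_lin, b_lin = fit_line(xs, ps)                      # additive model on p
a_log, b_log = fit_line(xs, [logit(p) for p in ps])  # additive model on logit(p)

def predict_linear(x):
    return a_lin + b_lin * x

def predict_logit(x):
    return inv_logit(a_log + b_log * x)

for x in (-2, 6):
    print(x, predict_linear(x), predict_logit(x))
```

Outside the observed range, the model fitted directly on p predicts proportions below 0 and above 1, while the model fitted in the transformed space always maps back to something strictly between 0 and 1.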

The xkcd cartoon embodies at least two jokes. The first one mocks the faddish overuse of *sustainable*. The second joke mocks the idea of blind quantitative extrapolation, such as linear extrapolation of (log) occurrence proportions or probabilities. In order to make these jokes work together, it's necessary to plot the (roughly) exponential increase in *sustainable* percentages on a log scale, so that it looks approximately linear and supports the humorous extrapolation to the precisely-dated worlds in which *sustainable* occurs once per sentence, once per word, and beyond…]

For "chicken" read "dude".

Me: or "smurf".
