- From: Jonathan Rees <jar@creativecommons.org>
- Date: Sat, 28 Nov 2009 19:37:48 -0500
- To: Larry Masinter <masinter@adobe.com>
- Cc: "noah_mendelsohn@us.ibm.com" <noah_mendelsohn@us.ibm.com>, "www-tag@w3.org" <www-tag@w3.org>
On Fri, Oct 16, 2009 at 3:30 AM, Larry Masinter <masinter@adobe.com> wrote:
> In response to an administrative request to review issues, I thought
> I would send this more publicly:
...
> ISSUE-41: "designing extensible languages and for
> handling versioning?"
> I'd put Noah's overview of DE and the JAR and LMM (and Raman?)
> work on versioning and payoffs as further actions.
>
> Larry
> --
> http://larry.masinter.net

I know this is probably tiring given the previous messages I've sent on the subject, but I am sending it so that it is in the email archive, both in case I want to refer to it at the F2F, and for the amusement of those who have a perverse fascination with this kind of thing. Also to get it out of my system.

-Jonathan

The following is a *model*, a caricature of language change. Any resemblance it bears to reality is just lucky.

I talk about "meanings" without depending on their having any particular properties. The narrative works with almost anything else substituted, such as "number" or "command".

Communication happens when a message correctly conveys a meaning, i.e. when a consumer understands what a producer "means" by a message. That is, the producer P wants to convey a meaning Q, so it composes a message M = P(Q) whose meaning is Q. The consumer C receives M and assigns it a meaning Q' = C(M).

The outcome is a payoff to producer and consumer (whose interests I assume to be aligned) based on how well the intended meaning Q matches the understood meaning Q'. Call this payoff z(Q, Q'). (Note that a language designer or specification writer might influence P and C, but payoff is determined by uncontrolled factors.)

If z(Q, Q) is not positive, then a self-interested producer will never voluntarily send a message M for which C(M) = Q: such messages will be unused (def: "used" = in the image of P).

There might be meanings Q that are not in the image of C, i.e. no message elicits them. These meanings might be called "inexpressible".
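To make the definitions so far concrete, here is a minimal Python sketch of the model. Everything in it is an assumption made for illustration: the meaning and message sets, the dictionaries standing in for P and C, and in particular the payoff function z, which in the model is determined by uncontrolled factors rather than chosen.

```python
# Toy instance of the producer/consumer model. All names and values are
# illustrative assumptions, not part of the model itself.

MEANINGS = {"greet", "part", "thank"}   # hypothetical meaning space
MESSAGES = {"hi", "bye", "zzz"}         # hypothetical message space

# Producer P: meaning -> message. "thank" has no message assigned.
P = {"greet": "hi", "part": "bye"}

# Consumer C: message -> meaning. Note that C(P(Q)) = Q wherever P is defined.
C = {"hi": "greet", "bye": "part", "zzz": "part"}


def z(intended, understood):
    """Assumed payoff: +1 if understood as intended, -1 otherwise."""
    return 1 if intended == understood else -1


def payoff(meaning):
    """Payoff of one exchange: producer sends M = P(Q), consumer reads C(M)."""
    return z(meaning, C[P[meaning]])


used = set(P.values())                      # "used" = in the image of P
unused = MESSAGES - used                    # messages no producer sends
inexpressible = MEANINGS - set(C.values())  # meanings no message elicits
```

In this toy setup "zzz" is an unused message and "thank" is an inexpressible meaning, since no message is ever read as "thank".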
If the payoff z(Q, Q) of an inexpressible meaning is high, there will be pressure to modify the language so that the meaning becomes expressible.

One definition of "language" (or "language version") might be: a factoring of the identity function through a message space, i.e. a pair of functions (P, C) such that C(P(Q)) = Q. (P is necessarily injective.)

Consider a language-change scenario in which producers and consumers change one at a time. When a producer changes it assigns a new message to some meaning: P(Q) = M becomes P'(Q) = M'. When a consumer changes it assigns a new meaning to some message: C(M) = Q becomes C'(M) = Q'.

When both change in concert we might end up with z(Q', Q') > z(Q, Q), in which case everyone will be happier, especially if Q' had formerly been inexpressible.

Ideally a language change does not entail intermediate states in which some producers or consumers are "disenfranchised" in the sense that a formerly available nonnegative payoff decreases, even ephemerally.

If the consumer changes first, i.e. Q' = C'(M) != C(M) = Q for some message M, we are OK as long as either
  (a) M is not used by old producers, or
  (b) z(Q, Q') >= z(Q, Q), i.e. the new meaning is somehow "not any worse" than the one that was expected.
This might be called a "backward compatible" change or "extension". (Language specification bug fixes might be related to this?)

If the producer changes first, i.e. M' = P'(Q) != P(Q) = M for some meaning Q (n.b. if Q was inexpressible there is no M), we are OK as long as z(Q, C(M')) >= 0, i.e. the producer is not punished for sending M' instead of M. Since it's the producer that's changing, it should expect that it might be talking to old consumers, so it has no right to be particularly disappointed with M' being interpreted as having its old meaning C(M'), and it will prepare accordingly.
But if the payoff z(Q, C(M')) when an old consumer C receives M' weren't nonnegative, the new producer might be punished (negative payoff) for sending to old consumers, and that would have an inhibiting effect. A change of this kind has been called "forward compatible".

...

Language change can occur only through use of a communication channel. Somehow producer and consumer must decide, or learn, in concert, new message/meaning assignments. The special channel could be a side channel using a completely different language (e.g. a language spec sent to a software developer), or it could be a feedback channel to be used for reinforcement learning (e.g. rejection of a deprecated message), or it could be a special way to use the language itself (e.g. a message whose meaning is to change the state of the consuming agent so that it replaces its meaning function C with C').

...

Examples:

- A new message/meaning assignment should be expected to spread only when it confers some substantial benefit (the meaning is both new and useful). E.g. <video> is likely to spread; <hrule> as a synonym for <hr> wouldn't.

- Neutral changes can spread as "parasites" of beneficial changes (as when changes are batched in a software release).

- Languages that permit multiplexed communication to different consumer populations ought to best promote the propagation of changes (e.g. #ifdef in C - "if you are an old consumer, treat this message as M, but if you are a new one, then treat it as M'") [but features like #ifdef have other costs!]

- Languages for which P is surjective (no unused messages) and payoff is positive (z(Q, Q) > 0 for all Q in the image of C) are extremely difficult to change, as the use of every possible message will be claimed or entrenched by the language community. An example would be ASCII.

- If C(M) is "an error" (must lead to an error report or something similar, as opposed to being processed in a useful way), how would a population resist a language change that converts C(M) to some useful behavior? Answer: It can't, unless "cheating" (making changes) is somehow more painful (or less beneficial, lower payoff) than the error behavior. If there's no cheating, it's only because the outcome of doing so is perceived to be even worse than being punished by an error report.

...

Suppose the old and new languages support communication of a "language version indicator". That is, there is a third language that is the disjoint sum of two others: a message M'' consists of a version 1 indicator followed by a message M in language version 1, or a version 2 indicator followed by a message M' in language version 2. (This is similar to the #ifdef example above.)

An indicator could be used in a variety of ways:

  . A version 2 indicator could be a signal to version 1 consumers that they won't understand what's coming. This won't enable communication in any way, but when a language change is incompatible it might prevent errors of misinterpretation (<0 payoff) by forcing the message to be discarded entirely (0 payoff).

  . A version 1 indicator could be a signal to version 2 consumers that version 1 meaning assignments should be chosen for incoming messages. This would be useful if (and only if) the language change were incompatible (lower payoffs).

  . An indicator might serve a function at a different niche in the ecology. For example, a transforming consumer/producer (e.g. an editor) could use a version 1 indicator as a signal of limited capabilities of some downstream consumer, and avoid introducing version 2 features into messages. (thanks Larry)

...

The unused messages constitute a frontier that can be colonized through language changes.
The pressure to colonize will be constant in any language population, as there will always be inexpressible meanings striving to find expression. Poorly coordinated changes risk conflict, e.g. two independently arising changes might define conflicting behavior for messages containing "<circle>", thence punishing payoffs. Change coordination is a sociotechnological problem with a variety of solutions. The simplest strategy is to eliminate the frontier by defining beneficial consumer behavior for all messages and encouraging investment in the entire territory. A central coordination point is another strategy; hierarchical delegation is yet another; randomization is yet another.
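
As an illustration only, the two conditions given earlier (the "backward compatible" case where the consumer changes first, and the "forward compatible" case where the producer changes first) can be written as small Python checks. The payoff function z, the convention that a consumer returns None for a message it discards, and the tiny <hr>/<video> language are all assumptions made for this sketch.

```python
def z(intended, understood):
    """Assumed payoff: +1 understood as intended, 0 discarded, -1 misread."""
    if understood is None:  # assumption: None means the message was discarded
        return 0
    return 1 if intended == understood else -1


def backward_compatible(P_old, C_old, C_new):
    """Consumer changes first.

    Every message M whose meaning changes from Q = C_old(M) to Q' = C_new(M)
    must be (a) unused by old producers, or (b) satisfy z(Q, Q') >= z(Q, Q).
    """
    used = set(P_old.values())
    for m, q in C_old.items():
        q_new = C_new.get(m)
        if q_new != q and m in used and z(q, q_new) < z(q, q):
            return False
    return True


def forward_compatible(P_old, P_new, C_old):
    """Producer changes first.

    Every meaning Q whose message changes to M' = P_new(Q) must satisfy
    z(Q, C_old(M')) >= 0, so old consumers never punish the new producer.
    """
    for q, m_new in P_new.items():
        if P_old.get(q) != m_new and z(q, C_old.get(m_new)) < 0:
            return False
    return True


# Hypothetical version 1 of a tiny language, and an extension of it.
P1 = {"rule": "<hr>"}
C1 = {"<hr>": "rule"}
P2 = {"rule": "<hr>", "video": "<video>"}
C2 = {"<hr>": "rule", "<video>": "video"}

# Hypothetical incompatible changes that reuse "<hr>" for a new meaning.
C_bad = {"<hr>": "video"}
P_bad = {"video": "<hr>"}
```

With these toy definitions the extension passes both checks, because old consumers discard the unknown "<video>" for a payoff of 0, while reassigning "<hr>" fails both, because an old party on the other end misreads it.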
Received on Sunday, 29 November 2009 00:38:22 UTC