## Friday, November 30, 2007

I read the other day an article by Douglas Hofstadter on his game Mediocrity: three players choose a number each and the winner is that who has chosen the number in the middle. Hofstadter describes the game and then go on with designing arrangements for several types of tournaments, but he does not devote much time to the optimum strategy of the basic game itself; this is somewhat surprising because thinking about how to best play induces some kind of interesting circular reasoning. The following is an attempt to do that analysis. The game is dependent on the number of choices each player is given.

The N-option Mediocrity game: For a fixed N≥1, player A can only choose among 1,4,...,3N-2, player B among 2,5,...,3N-1 and player C among 3,6,...,3N. The winner is the player who has chosen the number in the middle.

Observe that there cannot be ties. For the purposes of our investigation, let's add some constraints:

• Coalitions are not allowed.
• The players are assumed to be perfectly rational, i.e. they are motivated exclusively by their own payoff and will be able to pursue any logical argument to that end; moreover, they know that each player knows about their opponents' perfect rationality.

"I must choose among 1,4,...,3N-2. Clearly 1 can never be in the middle, so I discard it. Player B, however, knows I won't play 1, so she won't choose 2, and thus player C, who can also deduce this, won't play 3. Hence 4 is not an option for me either, and this implies B won't play 5,..."

So it seems like A cannot rationally choose any number, and similar reasonings can be derived for B and C. But of course in the end they must choose some number and someone has to win. This puzzle is reminiscent of the much debated Surprise Examination Paradox; the difference, however, is that Mediocrity involves no modal statements about belief and such, so maybe it is easier to solve.

When N is small the analysis of the game is in fact straightforward:

The 1-option game

 1 2 3 A B C

As each player has only one number to play to, the automatic winner is B.

The 2-option game

 1 2 3 4 5 6 A B C A B C

A can't play 1, so she chooses 4 and C can't go for 6 so she plays 3. B loses regardless of her choice, so there are two possible outcomes with no particular preference: (A←4,B←2,C←3) and (A←4,B←5,C←3), bolds mark the winner.

When N≥3 things look considerably harder. We will look at this in a later entry.

## Monday, November 26, 2007

### All the King's philosophers: ultrafilters and Arrow's Theorem

Given some set A, an ultrafilter on A is a nonempty family F of subsets of A such that
1. A belongs to F,
2. X belongs to F if and only if AX does not belong to F,
3. if X belongs to F and Y is a superset of X, Y belongs to F,
4. if X and Y belongs to F, the intersection of X and Y does belong to F as well.
An ultrafilter is said principal if it is of the form {X included in A : a belongs to X} for some fixed element a of A.
Ultrafilters are used in Model Theory to construct new models for a logical theory out of a family of preexisting models. Principal ultrafilters are of little use in this context, as they yield models that are isomorphic to one of the elements of the given family. The following result basically tells us that finite ultrafilters are of no interest:
Theorem. If A is finite, every ultrafilter on A is principal.
This is the fundamental difficulty that al-Khwārizmī found when trying to build a Philosopher Rule for the Council. He could only have built a non-trivial rule if the Council were infinite!
Arrow's Theorem is a celebrated result by economist Kenneth Arrow that shows some inherent problems of certain voting systems:
Theorem. Let P be a population of voters each having an associated preference ordering a set of candidates C. A social welfare function is a rule that accepts any combination of individual orderings and generates a global ordering for the candidates. Suppose that the SWF has the following properties:
1. When all voters prefer candidate a over candidate b, the SWF must also have this preference.
2. If for a given voter input I the SWF prefers a over b and then we switch to a different input I' such that the individual voters favoring a over b are a superset of those in I, the new output of the SWF still favors a wih respect to b.
Then, if P is finite and there are more than two candidates, it turns out that the SWF necessarily implements a dictatorship, i.e. there is some individual d in P such that the results of the SWF are exactly the preferences expressed by d.
The similarities between Arrow's Theorem and the basic theorem on finite ultrafilters are not coincidental, and a quick search on the net reveals that some have indeed proved that both results are in fact equivalent.

## Thursday, November 22, 2007

### Within 50 km

Recently the Spanish government announced its goal that the national high-speed railroad network (AVE) be developed so that every town in Spain is within 50 km from an AVE station. On the other hand, the estimations by the government itself are that the AVE network total length will reach 2,230 km by 2010. Are these statements compatible?

Let us consider the problem of laying out a railroad network with minimum length such that every point within the considered area (the national territory) is not farther than some predefined distance away from the closest network node. Solving this problem for an arbitrary shape like that of Spain can only be done via computer simulation, but we can introduce drastic simplifications that allow us to come up with a reasonable estimate using only some elementary calculations.

Ignoring the boundary effects, an hexagonal grid is the most efficient way to distribute points on the plane so as to maximize the area covered by circles with a fixed radius centered at the points.

For our problem, we have

R = 50 km,
r = ((√3)/2) R = 43.30 km,
A = (3(√3)/2) R2 = 6,495 km2.

The continental area of Spain is 492,173 km2, so we need a total of n = 76 hexagons (or nodes) to cover the country. As an approximation to the optimum network, we can simply take a minimum spanning tree connecting all the 76 nodes. Again, finding such a tree is a computationally hard problem, but we need not do it, as we are only interested in its length L: a tree with n nodes has n-1 edges, and using the reasonable assumption that our minimum spanning tree connects nodes exclusively with their neighbors we have

L = 2r(n-1) = 6,495 km,

which is nearly three times the planned length of the AVE network by 2010.

## Sunday, November 18, 2007

### All the King's philosophers: part III

(See part I, part II.)

The cool breeze raising from the Zeravshan river awoke the mathematician on the morning of the third day. Al-Khwārizmī skipped his breakfast and went directly to his worktable in the royal garden, where, according to his instructions, new loads of paper were already in place. As the hours of the day passed by, the initially light mood of Al-Khwārizmī progressively turned somber, and his writing frantic; around him, scattered on the floor, lay dozens of sheets of paper full with garbled diagrams and tables.

This routine went on during the following days, and the King, a little impatient about the outcome of al-Khwārizmī's quest, began to send messengers to secretly observe the mathematician from the distance. This would not talk to anyone and abruptly dismissed the servants who approached him with food and wine. He had some lights set up at his place on the veranda so that he could continue working during the night.

One chilly morning Al-Khwārizmī found that he had fallen asleep at his workplace. He then decided that the quest was over and asked for an audience with the King. After explaining to the sovereign how he devised the principles of his theory for a Philosopher Rule, the mathematician then concluded:

"I have spent several days and nights searching for suitable lists of authoritative groups, beginning with some initial disposition and adding and supressing groups according to the logical principles that I have presented. And anytime I obtained a distribution conforming to the principles, I realized to my dismay that it possesed the following defect: there was some particular philosopher such that a group was authoritative in as much as that philosopher belonged to it. If I built some other suitable different distribution, it was only that particular philosopher that changed, but the troublesome peculiarity remained. What is the usefulness of a Rule if it merely takes into account the opinion of a single philosopher in the Council?"

"My lord, in despair and humiliation I must resign from the task that you gratiously assigned me. As far as my investigations have led me to see, there is no way to design a logical Philosopher Rule other than following the decisions of a single designated individual in your Council."

"Great al-Khwārizmī" replied the King, "it is a true sign of the wisest to acknowledge what one does not know. And you have my admiration for this sincerity and for the fine mathematical work you have presented to me."

"Given that your investigations show that having a Council of philosophers is no better than relying in one of them, I can not think of any other man than you who is more suitable for that position. Would you accept my humble invitation to be the Royal Counselor?"

In the end al-Khwārizmī declined the proposition and promptly left Samarkand to go on his trip to remote countries. History has it that the King dismantled the Council and devoted himself to the study of Science and Philosophy so as to become a better person, a wiser ruler, and his best own counselor.

## Wednesday, November 14, 2007

### λ-fun in metaC++: Church numerals

The classical way to represent natural numbers in λ-calculus was invented by Church:

0 = λfx.x
1 = λfx.fx
2 = λfx.f(fx)
...
N = λfx.fn(x)
This translates to our C++ embedding as follows:
`struct ZERO;template<typename F,typename X>struct apply<apply<ZERO,F>,X>{typedef X type;};struct ONE;template<typename F,typename X>struct apply<apply<ONE,F>,X>:apply<F,X>{};struct TWO;template<typename F,typename X>struct apply<apply<TWO,F>,X>:``apply<F,typename apply<F,X>::type>{};...`
The corresponding basic arithmetic operations are easy to define:
SUCC = λnfx.f(nfx) (successor function)
PLUS = λmnfx.nf(mfx)
MULT = λmn.m(PLUS n) 0
and to translate to C++:
`// syntax sugar to abbreviate notationtemplate<typename F,typename G,typename H>struct apply2{  typedef typename apply<typename apply<F,G>::type,H>::type type;};struct SUCC;template<typename N,typename F,typename X>struct apply<apply<apply<SUCC,N>,F>,X>:  apply<F,typename apply2<N,F,X>::type>{};struct PLUS;template<typename M,typename N,typename F,typename X>struct apply<apply<apply<apply<PLUS,M>,N>,F>,X>:  apply2<N,F,typename apply2<M,F,X>::type>{};struct MULT;template<typename M,typename N>struct apply<apply<MULT,M>,N>:  apply2<M,typename apply<PLUS,N>::type,ZERO>{};`

I've written a little C++ program exercising these ideas (you'll need Boost to compile it). How to test our constructs? That is, how to verify for instance that the type `NINE` defined as:

`typedef apply<apply<MULT,THREE>::type,THREE>::type NINE;`
actually corresponds to λfx.f(f(f(f(f(f(f(f(fx)))))))))? Well, if we define
`template <int N>struct int_{};struct succ;template<int N> struct apply<succ,int_<N> >{  typedef int_<N+1> type;};`
then we have that
`apply<apply<N,succ>::type,int_<0> >::type`
is the same type as
`int_<n>`

as long as `N` stands for the Church numeral N. This verification can be easily done with some metaprogramming machinery. Note that `succ` is not a proper λ-term even though we have used it inside our `apply` machinery: this hints at a general technique called λ-calculus extension.

## Saturday, November 10, 2007

### A hero's job

The most popular 2-person non-zero sum conflictive games of cooperation/defection, like the Prisoner's dilemma, the Chicken game etc. have mutual cooperation as a Pareto optimal point, i.e. moving from mutual cooperation to defection harms the opponent. But what happens when this is not the case and yet mutual defection is a "Pareto pessimal", i.e. worst for both players?

Jan and Olga are travelling back home from Mars inside their tiny space capsule. The oxigen generator breaks down and reserves are just enough for only one person to make it to the Earth. Both space travellers glimpse at each other and then stare through the window into the infinite void sourrounding them.

In this game, a hero is needed for the other to obtain some payoff, but mutual cooperation is as bad as mutual defection:

 A \ B Coop Def Coop 0\0 0\1 Def 1\0 0\0

Even more than in the Prisoner's dilemma, the rational strategy for this game is to defect. Some have shown how cooperation can be justified in the iterated version of the Prisoner's dilemma, but in our rendering of the hero game no iteration is possible.

In my opinion, if the matrix above actually corresponds to the payoffs obtained by A and B, then there's little more to be said above the game and defecting is indeed the expected strategy. But what motivates a hero is the emotional reward he or she obtains from knowing the good that the heroic action is doing to others. A payoff matrix accounting for this moral compensation would look like this:

 A \ B Coop Def Coop aA\aB aA\1 Def 1\aB 0\0

Where aA and aB are the altruism factors of A and B, respectively. Usually each player does not know the altruism factor of the opponent, so the game becomes one of imperfect information. Also, it can be argued that in the case of mutual cooperation the payoff would still be zero, as the altruism factor should be multiplying the actual outcome of one's heroic action, but in the standard exposition of these type of games players take their choices simultaneously without knowing the opponent's move.

It does not take much Game Theory, anyway, to understand that heroic action is based on intangible payoffs like the love for others or for the country, honor, etc. Raising these altruism factors is arguably one of the goals of the highly ritualistic military culture.

If we extend the hero game to n persons in a way that still it only takes a hero to save the rest, we can see that, from a societal point of view, it is desirable that altruism be not evenly distributed but rather concentrated in a few individuals: otherwise we'd have mass immolation.

## Tuesday, November 6, 2007

### All the King's philosophers: part II

(See part I.)

Al-Khwārizmī spent the night at the royal palace. Very early the next morning he looked for a cool and quiet place on a veranda by the palace garden, asked for a few sheets of the famous Samarkand paper and some ink and put himself to work on the quest for the Philosopher Rule. The following is a mildly adapted version of the annotations made by al-Khwārizmī during that day.

"To simplify things, let us assume that questions posed to the Council are of the form that can be asked simply by saying yes or no, as in 'Must we go to war against the Bactrians?' or 'Should taxes be raised?'. To every such question, the Council will be divided into those philosophers answering affirmatively and those whose reply is negative. The Philosopher Rule must then decide whether the final answer is yes or no based on the distribution of individual replies across the Council. The Rule will be complete if it can take care of every possible answer distribution."

"Now, as philosophers pride themselves with having an answer to everything, we know that each of the Council members will either reply yes or no to every conceivable query they might be confronted with, that is, they never remain silent at a question. So, to a given question the group of negative philosophers is exactly the complementary of the group of affirmative philosophers, and we can in a sense concentrate only on the affirmative party. The Philosopher Rule can be then regarded as playing the following role: for each group of affirmative philosophers the Rule must decree whether the group is authoritative, that is, whether the final judgment on a question must be taken to be affirmative if it is this precise group of philosophers that reply in the positive."

"We cannot choose at random which groups are authoritative, because such disposition would likely yield inconsistent results when the Council was consulted over time. Clearly, some constraints must apply to the way in which we choose the list of authoritative groups. I think the art of Logic will help us here, fortunately I brought with me some books on the subject."

That was how far the mathematician arrived the first day. The second day, al-Khwārizmī ordered a fresh load of paper and sat on the veranda from dusk till dawn furiously scribbling symbols and consulting a heap of books by the school of the Stoics. When the last light of day vanished beyond the palace walls and the first crickets began chirping in the garden, the mathematician arose from his seat with a triumphal smile, as he thought he had the foundations of the problem basically laid down, and only some routinely calculations were necessary to reach a solution. This is a summary of his conclusions that day:

"We explore the constraints imposed on eligible authoritative groups by studying their connection with the basic laws of Logic by which every wise man must abide. For it is our goal that the decisions taken by the Rule be free of contradiction much as those of any philosopher are bound to be."

• "If the entire Council supports a decision (though according to the King this event has not ever happened), it is only sensible for the Rule to approve the decision as well. So, the group formed by all philosophers in the Council is an authoritative group."
• "If a group A is deemed authoritative, then the complementary group, i.e. that comprising the philosophers outside A, cannot be authoritative. Otherwise, if we asked a question which is supported by A and then submitted exactly the opposite question, which would be supported by the complementary of A, the Rule would end up holding both a position and its negation."
• If a certain group A is considered authoritative, so will be the case with any other group including A: having more philosophers supporting the decision can only make us more confident on the result.
• "Let us suppose we pose a question, like 'Should we raise taxes?' to which an authoritative group A responds affirmatively, and then some other question, like 'Should we offer a sacrifice to the gods?', which is supported by another authoritative group B. If we had submitted the combined question 'Should we raise taxes and offer a sacrifice?' the laws of Logic teach us that the philosophers answering positively would be exactly those belonging to both A and B. Hence the Rule must have the intersection of authoritative groups as authoritative."

"The only work left to do is finding a list of authoritative groups that satisfy the restrictions. This I will do tomorrow through some fairly easy if tedious calculations."

Al-Khwārizmī had a light dinner and went to sleep with the peace of mind enjoyed by those who feel success within reach of their hand.

## Friday, November 2, 2007

### λ-fun in metaC++

Let's embed untyped lambda calculus into the metaprogramming sublanguage of C++. Such embedding consists of two components:
• Syntax: a representation for λ-terms.
• Semantics: a mechanism for β-reduction.
Though in our approach both aspects are inextricably intertwinned: lambda abstraction (the operation that goes from M to λx.M) and its associated β-reduction rule are described operationally rather than syntactically.

A λ-term is represented by one of the following:

• An unanalyzed C++ type,
• A type of the form `apply<F,G>` for some types `F`,`G`,
where the metafunction `apply` is defined as follows:
`template<typename F,typename G>struct apply{typedef apply type;};`

(Note that apply resolves to itself, i.e. apply<F,G>::type is apply<F,G>. The reason for this will be apparent later. )

For instance, if we have types `F`, `G` and `H` standing for λ-terms F, G and H, the derived term
FH(GH)
is represented by the type
`apply<apply<F,H>,apply<G,H> >`
As we commented before, there is no syntactic facility for expressing lambda abstraction; instead, if a type is to represent an abstraction term we define its behavior under β-reduction by specializing `apply`. For instance, given the term
I = λx.x
we can represent it like this:
`struct I;template<typename X>struct apply<I,X>{typedef X type;};`
That is, we partially specialize apply so that `apply<I,F>::type` represents the β-reduction of (λx.x)f, i.e. f. This manner of handling lambda abstraction is particularly appealing in that it dispenses with α-reduction of bound variables by way of simply not representing bound variables at all (their role is played by template parameters).

More examples of representing lambda abstraction:

K = λxy.x
`struct K;template<typename X,typename Y>struct apply<apply<K,X>,Y>{typedef X type;};`
K* = λxy.y
`struct KSTAR;template<typename X,typename Y>struct apply<apply<KSTAR,X>,Y>{typedef Y type;};`

S = λxyz.xz(yz)

`struct S;template<typename X,typename Y,typename Z>struct apply<apply<apply<S,X>,Y>,Z>{ typedef typename apply<  typename apply<X,Z>::type,  typename apply<Y,Z>::type >::type type;};`
Note how currying is used in the specializations of `apply` for binary K and K* and ternary S. Another important aspect is that the application operation for S is deep in the sense that it not only composes the term XZ(YZ), as the following formulation would have done:
`// lazy formulationtemplate<typename X,typename Y,typename Z>struct apply<apply<apply<S,X>,Y>,Z>{ typedef typename apply<   typename apply<X,Z>,   typename apply<Y,Z> > type;};`
but it also executes XZ(YZ) by invoking the nested `::type` of each of the `apply` components. This amounts to implementing full β-reduction instead of lazy one-step reduction. In general, full reduction is assumed, that is, each λ-term represented in our embedding mechanism is actually in β-normal form. This rules out terms without β-nf as Ω:

ω = λx.xx
Ω = ωω

`struct omega;template<typename X>struct apply<omega,X>{typedef typename apply<X,X>::type type;};typedef apply<omega,omega>::type OMEGA; // compile-time error`
Now, when a type X is "primitive" in the sense that it does not have any associated specialization of `apply`, `apply<X,Y>::type` evaluates to `apply` itself. So, if we have some term `F` and apply it to a succesion of primitive types `X`, `Y`, `Z,`... until no more active reductions remain, what we obtain is a syntactic description of the β-nf of `F` in the form of a composite `apply` expression with `X`, `Y,`... as terminals. For instance:

Sβ λxyz.xz(yz)

`apply<apply<apply<S,X>::type,Y>::type,Z>::type` evaluates to
`apply<apply<X,Z>,apply<Y,Z> >`

KIβ λxy.y (= K*)

`apply<apply<apply<K,I>::type,X>::type,Y>::type` evaluates to `Y`
SKKβ λx.x (= I)
`apply<apply<apply<S,K>::type,K>::type,X>::type` evaluates to `X`
This technique can be used to determine whether two λ-terms (as represented in our C++ embedding) are β-convertible.

In later entries we will play some more with our lambda calculus embedding.