Friday, December 28, 2007

I-mmortality

When interacting on the web, are we fully aware that all of our utterances, the articles we wrote, posts we sent, pictures we published, rants, jokes, praises, mistakes, sins, will likely be navigating the humming vortices of the net forever after, archived and replicated in servers of ever-increasing capacity, for coming generations to occasionally stumble upon and bring to the surface like mosquitoes trapped in amber?

In Yourcenar's Memoirs of Hadrian, the Emperor comments on the inscriptions scratched by casual visitors all over the stony legs of the Colossus of Memnon; one of them was left by some Eumenes six centuries before the time of Hadrian, and that impulsive sign is all that remains of its author for eternity. To this day, the fraction of humankind that has left some sort of personal imprint (fossil bones, handicraft, inscriptions, intellectual work) after their death is vanishingly small. With the advent of the Internet and our progressive immersion into digital life, this fraction could easily rise to 100% in a few decades, providing every one of us with a cheap and unexpected form of immortality.

Monday, December 24, 2007

λ-fun in metaC++: extensions and term rewriting

So-called λδ-calculi extend λ-calculus with some set C of external terms, along with δ rules describing how terms involving entities of C reduce, in addition to the reduction rules of standard λ-calculus. For instance, instead of relying on the cumbersome and computationally expensive Church numerals, we can embed natural numbers directly:

  • C = {succ,pred,+,*,iszero}+{0,1,2,...}.
  • succ n →δ (n+1) for any natural n.
  • pred n →δ (n-1) for any natural n>0.
  • + n m →δ (n+m) for any natural n, m.
  • * n m →δ (nm) for any natural n, m.
  • iszero 0 →δ TRUE, iszero n →δ FALSE for natural n>0.

Many nice properties of λ-calculus, like the Church-Rosser Property, are retained in λδ-calculi.

Our translation of λ-calculus to metaC++ allows us to easily accommodate δ extensions: we only have to supply the δ-reductions as additional specializations of the apply metafunction:

template <int N> struct int_;

struct succ;
template<int N>
struct apply<succ,int_<N> >{typedef int_<N+1> type;};

struct pred;
template<int N>
struct apply<pred,int_<N> >{typedef int_<N-1> type;};

struct plus;
template<int M,int N>
struct apply<apply<plus,int_<M> >,int_<N> >{typedef int_<M+N> type;};

struct mult;
template<int M,int N>
struct apply<apply<mult,int_<M> >,int_<N> >{typedef int_<M*N> type;};

struct iszero;
template<int N>
struct apply<iszero,int_<N> >{typedef FALSE type;};
template<>
struct apply<iszero,int_<0> >{typedef TRUE type;};
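
For instance, the following sanity checks (a sketch, not part of the original code; it assumes the primary apply template and the TRUE and FALSE terms defined elsewhere in the series are in scope) exercise the new δ-reductions:

// each typedef forces one δ-reduction step
typedef apply<succ,int_<3> >::type four;                   // int_<4>
typedef apply<apply<plus,int_<2> >,int_<3> >::type five;   // int_<5>
typedef apply<apply<mult,int_<4> >,int_<5> >::type twenty; // int_<20>
typedef apply<iszero,int_<0> >::type yes;                  // TRUE
typedef apply<iszero,int_<7> >::type no;                   // FALSE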

Our former FAC function using Church numerals can be adapted to work with these by simply replacing the original set of numerical operations and constants with the new ones:

struct FAC;

struct FACIF;
template<typename N>
struct apply<FACIF,N>{typedef int_<1> type;};

struct FACELSE;
template<typename N>
struct apply<FACELSE,N>:
apply2<
mult,N,
typename apply<FAC,typename apply<pred,N>::type>::type
>
{};

template<typename N>
struct apply<FAC,N>:
apply<
typename apply3<
IFTHENELSE,
typename apply<iszero,N>::type,
FACIF,
FACELSE
>::type,
N
>
{};
The complete program is provided.
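As a quick usage sketch (not taken from that program; apply2, apply3 and the boolean machinery come from previous entries), a factorial is now computed by a single nested instantiation:

// 5! over the embedded naturals
typedef apply<FAC,int_<5> >::type fac5; // int_<120>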

Why was it so easy to accommodate δ-reductions in our C++ rendition of λ-calculus? Well, a basic premise of the embedding procedure is that there is no explicit representation for λ-abstractions; these are rather described operationally according to their reduction behavior. What we really have is a term rewriting engine, which is what C++ template partial specialization ultimately boils down to, and which is also what δ-reduction is all about.

In the end λ-calculus itself is nothing but a kind of Term Rewriting System, where allowable rewritings are described in terms of λ-abstractions. In a way, the theory of TRSs captures the basic essence of λ-calculus and allows for a more general treatment of the subject.

Thursday, December 20, 2007

Within 50 km: correct sources

Thanks to the indications from a reader of the "Within 50 km" entry, we now have more authoritative sources regarding the commitments by the Spanish government on the evolution of the high-speed railroad network. According to these sources, the plans are that

  • (R1) all (continental) province capitals will be directly connected to the network,
  • (R2) 90% of the population will live within 50 km of an AVE station

by 2020, and not 2010 as I had incorrectly assumed. The network length is projected to reach 10,000 km in 2020, which is far more than the 2,230 km planned for 2010, and certainly more than enough to satisfy the coverage requirements stated above, as we will see later. These figures come from the so-called Infrastructures and Transport Strategic Plan (PEIT) for the 2005-2020 period. The development of the PEIT implies that the high-speed network will grow at the impressive rate of 777 km per year during 2010-2020 (the remaining 7,770 km laid over ten years). Whether this is feasible looks questionable to me, but the nature of my skepticism is not mathematical, so I leave it here.

Just for the fun of it, let us estimate the minimum network lengths needed to satisfy requirements R1 and R2. With respect to R1, we can use again the genetic program written for our previous entry on this issue, with only some changes to the initial parameters. After several hundred iterations we obtain an estimate of approximately L1 = 3,870 km for the minimum length. As for R2, according to the National Statistics Institute (INE) data for the municipal register as of January 1st, 2006, the Spanish population was 44,708,964. What is the minimum area the railroad network should cover so as to contain 90% of this population, i.e. 40,238,068 people? Francisco Ruiz provides an amazing resource at his website, a single Excel file including a comprehensive list of Spanish municipalities, along with their populations and areas, according to INE 2006 data: using this, we need only sort the (continental) municipalities by descending population density and look up the aggregated surface of the first n entries totalling up to 40,238,068 people. This gives us an area A = 262,039 km²; repeating our original analysis with this value of A yields a minimum length L2 = 3,464 km. Anyway, this estimate is very imprecise, as the analysis assumes the area to cover to be of compact shape, while in this case the shape is expected to be elongated along the Spanish coast with isolated dots over the larger inland cities.
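For the record, the density-sorting step can be scripted in a few lines; the following is a minimal sketch, where the Municipality record and the loading of the spreadsheet data are assumptions of mine, not the actual procedure used for this entry:

#include <algorithm>
#include <cstddef>
#include <vector>

struct Municipality
{
  double population;
  double area; // km²
};

// descending population density, written without divisions
bool denser(const Municipality& x,const Municipality& y)
{
  return x.population*y.area>y.population*x.area;
}

// aggregated surface of the densest municipalities needed to reach
// the target population
double covered_area(std::vector<Municipality> ms,double target_population)
{
  std::sort(ms.begin(),ms.end(),denser);
  double pop=0.0,area=0.0;
  for(std::size_t i=0;i<ms.size()&&pop<target_population;++i){
    pop+=ms[i].population;
    area+=ms[i].area;
  }
  return area;
}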

To satisfy both R1 and R2 we would need a minimum length between max{L1,L2} and L1 + L2, which in any case is well below 10,000 km.

Sunday, December 16, 2007

A paradoxical three-person game: numerical solution

The general form of the N-option Mediocrity game can be depicted as follows:

Position:  1  2  3  4  5  6  ...  3N-2  3N-1  3N
Player:    A  B  C  A  B  C  ...   A     B     C

Let us assume A's optimum strategy is of mixed type:

SA = a1(A←1) + a2(A←4) + ··· + aN(A←3N-2),

with 0 ≤ ai ≤ 1, ∑ai = 1 (a pure strategy is just a particular case where some ai = 1). The critical point in our analysis is the observation that the roles of A and C are perfectly symmetrical, and thus C's optimum strategy must be the following:

SC = a1(C←3N) + a2(C←3N-3) + ··· + aN(C←3).

This interdependency of A's and C's strategies allows us to regard the game as a two-person one: B against the "combined player" A+C. If the strategy adopted by B is written down as

SB = b1(B←2) + b2(B←5) + ··· + bN(B←3N-1),

with 0 ≤ bi ≤ 1, ∑bi = 1, then B's payoff is

UB = ∑bi(P(A<i)P(i<C) + P(i<A)P(C<i)) =
= ∑bi((a1+···+ai)(1−aN−i+2−···−aN) + (1−a1−···−ai)(aN−i+2+···+aN)),

where aN−i+2+···+aN is by convention taken to be zero when i = 1. We abbreviate this expression as

UB = ∑bisi.

Note that si = sN-i+1.

Case 1: N even. A strategy (a1,...,aN) giving si = 0 for i = 1,...,N is clearly optimum (for A+C). We show that such a strategy exists and is unique by solving the set of N/2 equations:

s1 = a1 = 0,
s2 = (a1+a2)(1−aN) + (1−a1−a2)aN = 0,
...
sN/2 = (a1+···+aN/2)(1−a(N/2)+2−···−aN)+(1−a1−···−aN/2)(a(N/2)+2+···+aN) = 0.

The first equation implies a1 = 0, so that the second equation reduces to a2(1−aN) + (1−a2)aN = 0, from which it follows that a2 = aN = 0. Going down the rest of the equations in a similar way, we conclude that every ai except a(N/2)+1 must be zero. Hence the unique optimum strategy for A+C has

a(N/2)+1 = 1,
ai = 0 otherwise,

which translates to

SA= A←(3N/2)+1,
SC = C←3N/2.

As UB = 0, B's strategy is immaterial and can be identified with

SB = (1/N)((B←2) + ··· + (B←3N-1)).

The winner is A or C, depending on which side of the two consecutive positions occupied by A and C the position played by B falls. Their payoffs are UA = UC = 1/2.

Case 2: N odd, N ≥ 3. Let (a1,...,aN) be an arbitrary strategy for A and s1,...,sN its associated coefficients. We define a derived strategy (a'1,...,a'N) in the following manner:

a'(N+1)/2 = a1 + ··· + a(N+1)/2,
a'(N+3)/2 = a(N+3)/2 + ··· + aN,
a'i = 0 otherwise;

Then the associated coefficients s'i verify the following:

  • s'(N+1)/2 = (a'1 + ··· + a'(N+1)/2)² + (1 − a'1 − ··· − a'(N+1)/2)² =
    = (a'(N+1)/2)² + (1 − a'(N+1)/2)² =
    = (a1 + ··· + a(N+1)/2)² + (1 − a1 − ··· − a(N+1)/2)² =
    = s(N+1)/2.
  • For i < (N+1)/2,
    s'i = (a'1 + ··· + a'i)(1 − a'N−i+2 − ··· − a'N) +
    + (1 − a'1 − ··· − a'i)(a'N−i+2 + ··· + a'N) =
    = 0(1-0) + (1-0)0 = 0, as all the a'i coefficients involved are other than a'(N+1)/2 and a'(N+3)/2.
  • For i > (N+1)/2, s'i = s'N−i+1 = 0, since N−i+1 < (N+1)/2.

So, for any strategy for B of the form (b1,...,bN) we have ∑bisi ≥ b(N+1)/2s(N+1)/2 = ∑bis'i, the inequality being strict if any of a1,..., a(N−1)/2, a(N+5)/2,..., aN is different from zero, hence we conclude that an optimum strategy for A must have all its coefficients set to zero except those at positions (N+1)/2 and (N+3)/2. We are then left with the task of finding positive values of a(N+1)/2 and a(N+3)/2 adding to 1 and minimizing the maximum of the expression

UB = b(N+1)/2s(N+1)/2 = b(N+1)/2((a(N+1)/2)² + (1 − a(N+1)/2)²),

which is obviously reached at b(N+1)/2 = 1. The solution to this trivial problem is a(N+1)/2 = a(N+3)/2 = 1/2, since x² + (1 − x)² attains its minimum at x = 1/2. So, the optimum strategies for each player are:

SA= (1/2)(A←(3N−1)/2) + (1/2)(A←(3N+5)/2),
SB= B←(3N+1)/2,
SC = (1/2)(C←(3N−3)/2) + (1/2)(C←(3N+3)/2),

yielding payoffs UA = UC = 1/4, UB = 1/2.
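
As a sanity check (not part of the original analysis), the following little program enumerates the N = 3 case under the strategies above, assuming, as in the payoff formula, that the player holding the median position wins:

#include <algorithm>
#include <cstdio>

int main()
{
  // N = 3: A mixes positions 4 and 7, B plays 5, C mixes positions 3 and 6,
  // each mixed option taken with probability 1/2
  const int a[2]={4,7},c[2]={3,6},b=5;
  double ua=0,ub=0,uc=0;
  for(int i=0;i<2;++i){
    for(int j=0;j<2;++j){
      int lo=std::min(std::min(a[i],b),c[j]);
      int hi=std::max(std::max(a[i],b),c[j]);
      int med=a[i]+b+c[j]-lo-hi; // the median position wins
      double p=0.25;             // 1/2 * 1/2
      if(med==a[i])ua+=p;else if(med==b)ub+=p;else uc+=p;
    }
  }
  std::printf("UA=%.2f UB=%.2f UC=%.2f\n",ua,ub,uc); // prints 0.25 0.50 0.25
  return 0;
}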

Although the numerical solution of the N-option Mediocrity game seems entirely satisfactory, the argument stating that optimum strategies for A and C must be symmetrical induces, for N odd, N ≥ 3, some psychological difficulties akin to those arising in the analysis of the Prisoner's dilemma. We will examine this issue in a later entry.

Wednesday, December 12, 2007

Within 50 km, revisited

Commenting on the "Within 50 km" blog entry, a friend of mine remarks that the commitment by the Spanish government concerning the coverage of the high-speed railroad network has probably been misreported by the media, and that it might not apply to every town in the country, but only to the most important cities. So, let us restate the commitment in a plausible way:

The Spanish high-speed railroad network will be developed so that every continental province capital will lie within 50 km of a station.

(Please note that I have no source confirming/denying whether this is really the government's statement.) This is a rather weaker commitment, yet the question remains open whether it can be achieved with a network whose total length will be 2,230 km by 2010, according to the government itself.

Let us consider the 47 continental Spanish province capitals and investigate the minimum length of a network spanning the country with the property that each of the 47 cities lies within 50 km of a network node (station) -- we will call this the coverage property. It is easy to see that the network has to be a Steiner tree. We will mathematically prove in a separate entry that the solutions to the problem are in fact Steiner trees constructed out of 47 base points, the i-th of which lies on the circle with radius 50 km centered at the i-th city. A basic theorem about Steiner trees tells us that no more than 47−2 = 45 additional Steiner points can arise in the construction of the tree; this allows us to regard the solution network as a minimum spanning tree on 47+45 = 92 nodes. If there happens to be a solution with fewer than 92 nodes, we still have one with exactly 92 nodes: just add the remaining points arbitrarily along the original network. Summing up, we can formulate our problem as that of finding meshes of n = 92 nodes with the coverage property whose minimum spanning tree length is minimum.

I have implemented a genetic algorithm to solve the problem, along the following lines:

  • Each candidate solution (or individual) of the pool is merely a set of n points. The fitness of an individual is the complement of its "length", which is calculated as the length of its associated minimum spanning tree, with an additional penalty for each uncovered city (i.e. each city lying more than 50 km away from every point of the solution).
  • Initialization of each individual is implemented with a mix of "base points" lying at 50 km from some city and "free points" resulting from the weighted sum of two randomly selected cities. This mixed strategy ensures that the initial population has a decent proportion of individuals satisfying the coverage property.
  • On each generation, the 25% fittest individuals are selected for survival and breeding.
  • Crossover of two individuals a and b is implemented as follows: a city c is selected at random and the points of a and b are sorted according to their distance to c. The child individual is built with the first m points of a and the n−m last points of b, for some random m between 0 and n (a rough sketch of this step is given after the list). The points of the child individual can also be randomly mutated: a mutated point results from the weighted sum of the original and some randomly selected city.
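
Here is the promised sketch of the crossover step; the Point record and the distance helper are hypothetical stand-ins of mine, and the actual program described below differs in the details:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Point{double x,y;};

double distance(const Point& p,const Point& q)
{
  return std::sqrt((p.x-q.x)*(p.x-q.x)+(p.y-q.y)*(p.y-q.y));
}

struct closer_to // orders points by their distance to a fixed city
{
  Point city;
  bool operator()(const Point& p,const Point& q)const
  {
    return distance(p,city)<distance(q,city);
  }
};

// both parents are assumed to hold the same number of points n
std::vector<Point> crossover(
  std::vector<Point> a,std::vector<Point> b,const Point& city)
{
  closer_to cmp={city};
  std::sort(a.begin(),a.end(),cmp);
  std::sort(b.begin(),b.end(),cmp);
  std::size_t n=a.size(),m=std::rand()%(n+1); // random cut point in [0,n]
  std::vector<Point> child(a.begin(),a.begin()+m);
  child.insert(child.end(),b.begin()+m,b.end());
  return child;
}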

I wrote a C++ program implementing this algorithm (you will need Boost to compile it): the program keeps iterating the population until aborted by the user, and on each iteration it outputs the minimum network length achieved so far and a SVG picture of the best candidate solution. Coordinates of province capitals have been obtained from the Spanish Ministry of the Interior, and translated to UTM (within the same UTM zone even if this incurs some curvature error) with Gabriel Ortiz's coordinate converter. I ran the program for hours, playing a bit with the initial parameters, and each run stabilized at a minimum length above 2,500 km. The figure below shows the best solution I got, with 2,535 km in length, obtained with a population size of 5,000 and a mutation rate of 0.01 after 4,000 iterations. You can observe a characteristic feature of Steiner trees by which confluent edges never form an angle smaller than 120º.

Take into account that in our rendering of the problem we have made some simplifying assumptions that in general underestimate the network length needed to satisfy the coverage commitment in the real world. For instance:

  • Railroad segments between nodes are always assumed to be straight lines.
  • We have laid out the network from scratch, without considering that part of it (Madrid-Sevilla, Madrid-Barcelona, Madrid-Lisbon, etc.) is already in place or planned, which, by definition, increases the length of the optimum compatible solution we can build from there.

So, unless we have made some mistake in our analysis, it is impossible to extend the high-speed railroad network to within 50 km of every province capital with only 2,230 km as planned by 2010.

Saturday, December 8, 2007

Goal crossbar cross-sections

In the game of soccer, it is not infrequent that the ball hits the goal crossbar: sometimes it bounces back into the field or out over the bar, and sometimes one is lucky and the ball enters the goal.

The result depends on the point of the crossbar where the hit occurs and the incidence angle of the ball. The question arises: how does the bar cross-section affect the chances that the ball bounces into the goal? Although goals at professional fields seem to always have crossbars with circular cross-section, the FIFA Laws of the Game allow for other shapes:

Goalposts and crossbars must be made of wood, metal or other approved material. Their shape may be square, rectangular, round or elliptical and they must not be dangerous to players.

The geometry involved is simple: Let α be the incidence angle of the ball and η the inclination of the crossbar surface at the hitting point. As depicted in the diagram, negative values of α would mean that the ball follows an ascending trajectory.

The reflection angle γ, measured clockwise from 12 o'clock, can be easily calculated:

γ = π/2 + 2η + α.

The ball enters the goal when γ > π, hence the minimum incidence angle for which the ball enters the goal at a particular hitting point is

αgoal = π/2 − 2η.

We now traverse the crossbar section in polar coordinates (ρ,θ), with θ growing clockwise from 3 o'clock, and determine the incidence angle η as a function of θ.

For a circular cross-section, η = θ; if the section is an ellipse with eccentricity ε then η = arctan((1−ε²)⁻¹ tan θ), whereas for a rectangle η(θ) consists of two constant segments corresponding to the vertical and horizontal edges.
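
These curves are easy to reproduce numerically; here is a minimal sketch for the circular and elliptical sections using the formulas just given (zero-radius ball; the 0.8 eccentricity is just an arbitrary example value of mine):

#include <cmath>
#include <cstdio>

const double pi=3.14159265358979;

// circular section: eta = theta
double alpha_goal_circle(double theta)
{
  return pi/2-2*theta;
}

// elliptical section with eccentricity eps: eta = arctan(tan(theta)/(1-eps^2))
double alpha_goal_ellipse(double theta,double eps)
{
  double eta=std::atan(std::tan(theta)/(1-eps*eps));
  return pi/2-2*eta;
}

int main()
{
  // compare both shapes at a few hitting angles (radians)
  for(double theta=0.1;theta<pi/2;theta+=0.2){
    std::printf(
      "theta=%.2f  circle=%+.3f  ellipse(0.8)=%+.3f\n",
      theta,alpha_goal_circle(theta),alpha_goal_ellipse(theta,0.8));
  }
  return 0;
}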

The analysis so far considers an idealized ball with null radius. This approximation is grossly incorrect in the particular case of a rectangular cross-section, as it fails to model the situation where the ball hits on an edge of the crossbar rather than a flat face. An easy trick to cope with a ball radius r > 0 is to consider the extended cross-section consisting of those points within distance r of the original section:

The extended cross-section for a circle is also a circle, and for an ellipse it is something not very different from the original shape (the resulting "eccentricity" is smaller). In the case of a rectangle the abrupt transition between the two values of η(θ) is replaced by a slope that becomes less steep as the ratio between r and the crossbar width grows; for a square section, and considering the typical dimensions of balls and crossbars, the slope value lies around 1.7.

We are now ready to depict αgoal as a function of the hitting angle θ and interpret the results:

For each hitting angle θ, αgoal is the minimum incidence that the ball trajectory has to have to enter the goal, measuring the incidence angle from the horizontal line downwards. So, lower values of αgoal mean that it is easier to enter the goal. For instance, when the hit occurs at low values of θ (front part of the crossbar) the ball has to follow an almost vertical descending trajectory if it is to enter the goal, whereas for high values of θ (bottom part of the bar) even ascending trajectories can bounce into the goal. We see that an elliptical cross-section is always more goal-friendly than a circular section. As for the square section, it is worse (less goal-friendly) than the ellipse for low values of θ and better only for a smaller region of high values of θ.

So, what cross-section shape should we choose if we want to facilitate goal scoring? Circles are ruled out, and to decide between an elliptical shape and a square we would have to know which hitting zone is more likely. Although I can't justify it, my hunch is that most hits against the crossbar occur at the front part rather than at the bottom, and moreover the interval of values of θ for which the ellipse wins is larger than the winning interval for the square (the intersection point of both η functions happens at some θ0 > π/4). So, it seems like we should go for an elliptical cross-section, the more eccentric (flatter) the better.

Tuesday, December 4, 2007

λ-fun in metaC++: recursion

Let's implement a recursive numeric function in our C++ embedding of λ-calculus. Before delving into the specific problems of recursion, let us recall the standard way of defining the basic logical predicates in λ-calculus, as usual courtesy of Alonzo Church:
TRUE = λxy.x
FALSE = λxy.y
AND = λpq.p q FALSE
OR = λpq.p TRUE q
NOT = λp.p FALSE TRUE
IFTHENELSE = λpfg.p f g
whose translation to metaC++ is straightforward. We will also need a predecessor function (we defined the successor in prior entries) and a predicate for testing whether a number is zero (where numbers have been defined via Church numerals):
PRED = λnfx.n(λgh.h(gf))(λu.x)(λu.u)
ISZERO = λn.n(λx.FALSE)TRUE
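For reference, here is one possible rendering of the booleans in metaC++ (a sketch of mine following the curried apply-specialization scheme used in these entries; the actual definitions live in the program provided at the end of the entry):

// TRUE x y reduces to x
struct TRUE;
template<typename X,typename Y>
struct apply<apply<TRUE,X>,Y>{typedef X type;};

// FALSE x y reduces to y
struct FALSE;
template<typename X,typename Y>
struct apply<apply<FALSE,X>,Y>{typedef Y type;};

// IFTHENELSE p f g reduces to p f g
struct IFTHENELSE;
template<typename P,typename F,typename G>
struct apply<apply<apply<IFTHENELSE,P>,F>,G>:apply<apply<P,F>,G>{};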
So equipped, we could try to "define" the factorial function as follows:
FAC = λn.IFTHENELSE (ISZERO n) 1 (MULT n (FAC(PRED n)))
which is not a true definition as FAC is both the definiendum and part of the definiens. Such a recursive definition is typically emulated in λ-calculus by parameterizing the recursive call:
FACIMPL = λfn.IFTHENELSE (ISZERO n) 1 (MULT n (f(PRED n)))
and adding the folding part by means of some fixed-point operator FP:
FAC = FP FACIMPL
where FP has the property
FP FACIMPL →β FACIMPL(FP FACIMPL)
so that reducing FP FACIMPL one step "executes" FACIMPL once on the very same FP FACIMPL function:
FP FACIMPL →β
FACIMPL(FP FACIMPL) →β
FACIMPL(FACIMPL(FP FACIMPL)) →β ...
There are many (infinitely many, actually) universal fixed-point operators, i.e. λ-terms that act as a fixed-point operator for any arbitrary term. The most famous is the Y combinator:
Y = λf.(λx. f(xx))(λx. f(xx))
Proving that Y is a universal fixed-point operator is trivial:
λf.(λx. f(xx))(λx. f(xx)) →β λf.f((λx. f(xx))(λx. f(xx))) = λf.f(Yf)
At first sight, it looks like we can easily code Y in C++:
template<typename F>
struct YAUX;
template<typename F,typename X>
struct apply<YAUX<F>,X>:apply<F,typename apply<X,X>::type>{};

struct Y;
template<typename F>
struct apply<Y,F>:apply<YAUX<F>,YAUX<F> >{};
but we will obtain a compiler error as soon as we try to use this:
// error 'type' not defined or something similar
typedef apply<Y,I>::type type;
What's happening? The compiler tries to instantiate apply<YAUX<I>,YAUX<I> >, which is self-referential because it derives from apply<I,typename apply<YAUX<I>,YAUX<I> >::type>. In short, Y I has an infinite β-reduction path, which is after all understandable, as this is the raison d'être of a fixed-point operator.

The good news is that we can in fact implement recursive functions in our C++ embedding of λ-calculus without using fixed-point operators: defining a term translates to specializing apply for some contexts of use of its representing type F, and in defining these specializations we can freely use the type F itself, which opens up the possibility of recursion. The following is a definition of the factorial function:

struct FAC;

struct FACIF;
template<typename N>
struct apply<FACIF,N>{typedef ONE type;};

struct FACELSE;
template<typename N>
struct apply<FACELSE,N>:
apply2<
MULT,N,
typename apply<FAC,typename apply<PRED,N>::type>::type
>
{};

template<typename N>
struct apply<FAC,N>:
apply<
typename apply3<
IFTHENELSE,
typename apply<ISZERO,N>::type,
FACIF,
FACELSE
>::type,
N
>
{};
where FACIF and FACELSE provide a level of indirection to avoid infinite β-reduction inside the expansion of FAC. In λ-calculus jargon, this trick corresponds to the so-called η-expansion of the original term.
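
For contrast, a direct one-piece definition along the following lines (a sketch of what would not work, not code from the entry) mirrors the non-definition of FAC given earlier, but forces the compiler to evaluate the recursive branch regardless of the ISZERO test, and hence never stops expanding FAC:

// shown for contrast only; it cannot coexist with the definition above
template<typename N>
struct apply<FAC,N>:
apply3<
IFTHENELSE,
typename apply<ISZERO,N>::type,
ONE,
typename apply2<
MULT,N,
typename apply<FAC,typename apply<PRED,N>::type>::type
>::type
>
{};
// instantiating apply<FAC,N> now requires computing
// apply<FAC,PRED N>::type up front, which recurses without bound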

A program with the full code for this entry is provided.