## Monday, May 12, 2008

### Common ancestors: simulation

We wish to study the kinship distribution of a closed population in the very simplified situation in which the population is closed, constant in size and stratified (split up in perfectly separate generations such that mating occurs only between members of the same generation). We choose to model the fertility distribution of the population r(n) as a Poisson distribution with median value 2 (since the population size is constant). Intersibling incest is banned, and for the sake of simplicity we also assume that a strict monogamy applies, i.e. siblings always share mother and father.

Before attempting to calculate the kinship distribution of such population analytically, I have written a small C++ program to estimate it via simulation. You will need Boost to compile the code. The simulation proceeds as follows:

• A pool of individuals is maintained, with new offspring being inserted after their parent generation.
• At each iteration, females of the last generation are traversed and mated with available males at random. The couple is assigned a number of children randomly generated according to a Poisson distribution with median 2.
• After a number of iterations the kinship distribution of the population is assumed to stabilize. The kinship histogram is calculated from a random sample of pairs of members of the last generation.

The figure shows the results for generation sizes N = 1,000, 10,000 and 100,000 with simulations of 100 generations each. We only depict values of K(n) for n even, since, because of the stratification property, K(n) = 0 if n is odd.

The kinship distribution K(n), n even, initially grows exponentially (in fact, as a·2n, as we will see when we derive analytical expressions for the function) until a maximum value between 0.4 and 0.5, from where it decays abruptly. The most probable kinship relationships between arbitrary individuals for N = 1,000, 10,000 and 100,000 are fifth, sixth and eighth cousins, respectively.

In a later entry we will try to determine K(n) analytically from basic probability theory.