## Monday, June 9, 2008

### Making choices: two-stage strategies

Let us consider more sophisticated selection strategies than those studied in a prior entry for the problem of selecting one among N candidates based on the evaluation of M independent normally distributed random candidate traits, with the constraint that at most N traits (out of the total NM) can be evaluated.

We consider the following two-stage scheme: first sort n1 randomly chosen candidates according to the first t1 traits; from these, pick the n2 best ones (n2 < n1) and make the choice among these based on their first t2 traits (t2 > t1). The idea is that making a preselection on fewer traits improves the quality of the candidates used at the final stage, although of course preselection incurs a cost in terms of trait evaluations consumed. The kind of selection strategies studied in our prior entry can be considered a degenerate instance of this with t1 = 0. Leaving this case aside, t1 can range between 1 and M − 1 and t2 between t1 + 1 and M. The acceptable values for n1 and n2 are governed by the following inequalities:

2 n1,
1 ≤ n2n1 − 1,
t
1n1 + (t2t1)n2N.

(If you wonder why the last inequality has (t2t1)n2 instead of t2n2, notice that we need not re-evaluate the first t1 traits in the secound round.) We are only interested in maximally efficient strategies that evaluate as many traits as allowed, i.e. for which t1n1 + (t2t1)n2 is as close as possible to N. If n1 and n1 could take fractional values, the points (n1,n2) associated to maximally efficient strategies would lie in the segment

( ((N + (t2t1))/t2 , (Nt1)/t2) , ((N − (t2t1))/t1,1) ).

The actual integer pairs (n1,n2) are obtained by approximating this segment with discrete values. In the particular case we had previously studied for simple strategies, N = 60, M = 5, there are 124 maximally efficient two-stage strategies grouped like this:

 t1 t2 no of solutions 1 2 29 1 3 19 1 4 14 1 5 11 2 3 10 2 4 14 2 5 11 3 4 5 3 5 8 4 5 3

A simulation program has been run to measure the effectiveness of all these 124 strategies. The figure shows the average score of the chosen candidate for the best strategy of each (t1,t2) group; the legend of each bar is the tuple (t1,n1,t2,n2). Click on the image to enlarge. The most efficient two-stage strategy is (t1,n1,t2,n2) = (1,28,5,8): pick up 28 candidates at random, sort them accordingly to their first trait and then fully evaluate the first 8. The resulting average score 4.11 is notably higher than the average score 3.65 reached by the best one-stage strategy. These results seem to justify the common practice of conducting a preselection round in real life evaluation processes.

Although two-stage strategies improve upon previous one-stage schemes, they are by no means the most effective selection mechanism. We will try to develop better selection strategies in a later entry.