Bannalia: trivial notes on themes diverse: Making choices: adaptive strategy

We revisit the problem of selecting one among N candidates based on the evaluation of M independent normally distributed random candidate traits, with the constraint that at most N traits (out of the total NM) can be evaluated: our goal is to improve the simple and two-stage strategies we had previously analyzed.

The idea behind an adaptive strategy is to make at each step the decision that maximizes the expected result given the information obtained so far. We study this approach for a simplified scenario: suppose we have only three candidates c₁, c₂, c₃ with (unknown to the strategy) scores s₁, s₂, s₃ and current partial scores (known to the strategy) s'₁ > s'₂ > s'₃, respectively. In this situation, the best strategy is to choose c₁, and the average result is precisely s'₁, since s₁ − s'₁ follows a normal distribution with mean 0. Now, if we are allowed to evaluate one more trait, and all three candidates still have unevaluated traits, how should we best spend this extra resource? Let us examine the result for each possible decision:

Evaluate c₁. Let us call t₁ the newly evaluated trait of c₁. If s'₁ + t₁ > s'₂, we stick to c₁; otherwise, we change our selection to c₂. The average score we obtain can then be calculated as

S₁ = (s'₁ + E(t₁ | t₁ > s'₂ − s'₁))·P(t₁ > s'₂ − s'₁) + s'₂·P(t₁ < s'₂ − s'₁),

where t₁ follows a normal distribution with mean 0 and variance 1. Note that the unevaluated traits do not affect the expected score because their mean is 0.

Evaluate c₂. If we call t₂ the extra trait evaluated for c₂, the analysis is similar to that of the previous case and yields

S₂ = s'₁·P(t₂ < s'₁ − s'₂) + (s'₂ + E(t₂ | t₂ > s'₁ − s'₂))·P(t₂ > s'₁ − s'₂).

Using the fact that the statistical distribution of t₂ is identical to that of t₁ and the symmetry of this distribution around 0, elementary manipulations on the expressions above lead us to conclude that

S₁ = S₂.
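The elementary manipulations can be spelled out as follows. This is only a sketch: φ and Φ are notation introduced here (not used elsewhere in the post) for the density and cumulative distribution function of a normal variable with mean 0 and variance 1, and d stands for the gap s'₁ − s'₂.

```latex
\begin{aligned}
d &= s'_1 - s'_2 > 0, \qquad
E(t \mid t > a)\,P(t > a) = \int_a^{\infty} t\,\varphi(t)\,dt = \varphi(a),\\
S_1 &= s'_1\,P(t_1 > -d) + \varphi(-d) + s'_2\,P(t_1 < -d)
     = s'_1\,\Phi(d) + s'_2\,\bigl(1 - \Phi(d)\bigr) + \varphi(d),\\
S_2 &= s'_1\,P(t_2 < d) + s'_2\,P(t_2 > d) + \varphi(d)
     = s'_1\,\Phi(d) + s'_2\,\bigl(1 - \Phi(d)\bigr) + \varphi(d).
\end{aligned}
```

Both expressions reduce to the same closed form because φ(−d) = φ(d) and Φ(−d) = 1 − Φ(d), which is where the symmetry of the distribution around 0 enters.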

Evaluate c₃. Calling t₃ the extra trait evaluated for c₃, we have

S₃ = s'₁·P(t₃ < s'₁ − s'₃) + (s'₃ + E(t₃ | t₃ > s'₁ − s'₃))·P(t₃ > s'₁ − s'₃).

Again, t₃ has the same statistical distribution as t₂; but since s'₃ < s'₂ it is not hard to prove that

S₃ < S₂.
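To make the inequality concrete, the expected scores can be computed numerically. The sketch below assumes unit-variance traits, as stated above, and relies on the identity E(t | t > a)·P(t > a) = φ(a) for a standard normal t, where φ and Φ denote the standard normal density and cumulative distribution. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Standard normal density phi(x).
double normal_pdf(double x)
{
    const double inv_sqrt_2pi = 0.3989422804014327;
    return inv_sqrt_2pi * std::exp(-0.5 * x * x);
}

// Standard normal cumulative distribution Phi(x),
// expressed through the complementary error function.
double normal_cdf(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Expected score after spending one extra N(0,1) trait evaluation on a
// candidate with partial score `trailing` while the current leader has
// partial score `leader`: we switch only if the trailing candidate
// overtakes the leader. With d = leader - trailing this evaluates
//   leader * Phi(d) + trailing * (1 - Phi(d)) + phi(d).
double expected_score_after_evaluation(double leader, double trailing)
{
    assert(leader >= trailing);
    const double d = leader - trailing;
    return leader * normal_cdf(d)
         + trailing * (1.0 - normal_cdf(d))
         + normal_pdf(d);
}
```

For instance, with illustrative partial scores s'₁ = 1.0, s'₂ = 0.5, s'₃ = −0.2, calling expected_score_after_evaluation(1.0, 0.5) gives S₂ (and hence S₁), while expected_score_after_evaluation(1.0, -0.2) gives S₃. The excess over s'₁ is φ(d) − d·(1 − Φ(d)), whose derivative with respect to d is −(1 − Φ(d)) < 0, so a larger gap always yields a smaller expected score.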

So, the most efficient way to spend the extra trait evaluation is to use it on either c₁ or c₂. This result readily extends to the general case:

adaptive strategy(c₁,...,c_N)
{s holds the partial scores of c₁,...,c_N}
s ← {0,...,0}
for i = 1 to N
    select some c_j with unevaluated traits and maximum s_j
    evaluate new trait t on c_j
    s_j ← s_j + t
end for
select a candidate c_k with maximum s_k
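The pseudocode translates almost line by line into C++. The following is a minimal sketch using only the standard library, not the Boost-based program mentioned in this post. All names (selection_result, adaptive_strategy, average_adaptive_score) are hypothetical. It also assumes that traits can be drawn lazily at evaluation time, which is statistically equivalent to fixing all NM traits beforehand because the traits are independent.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Outcome of one simulated run: index of the chosen candidate
// and its true total score (sum of all its m traits).
struct selection_result
{
    std::size_t chosen;
    double      true_score;
};

// Runs the adaptive strategy once on n candidates with m independent
// N(0,1) traits each, spending exactly n trait evaluations.
template<typename Rng>
selection_result adaptive_strategy(std::size_t n, std::size_t m, Rng& rng)
{
    assert(n > 0 && m > 0);
    std::normal_distribution<double> trait(0.0, 1.0);

    std::vector<double>      partial(n, 0.0); // partial scores s_j
    std::vector<std::size_t> evaluated(n, 0); // traits evaluated per candidate

    for (std::size_t i = 0; i < n; ++i) {
        // Select a candidate with unevaluated traits and maximum partial score.
        std::size_t best = n; // n means "none found yet"
        for (std::size_t j = 0; j < n; ++j) {
            if (evaluated[j] < m && (best == n || partial[j] > partial[best])) {
                best = j;
            }
        }
        assert(best != n); // always holds: fewer than n*m traits evaluated so far

        // Evaluate a new trait on it and update its partial score.
        partial[best] += trait(rng);
        ++evaluated[best];
    }

    // Final selection: a candidate with maximum partial score.
    std::size_t chosen = 0;
    for (std::size_t j = 1; j < n; ++j) {
        if (partial[j] > partial[chosen]) chosen = j;
    }

    // The true score also includes the traits that were never evaluated.
    double true_score = partial[chosen];
    for (std::size_t k = evaluated[chosen]; k < m; ++k) {
        true_score += trait(rng);
    }
    return {chosen, true_score};
}

// Average true score of the selected candidate over `runs` independent runs.
double average_adaptive_score(
    std::size_t n, std::size_t m, std::size_t runs, unsigned seed)
{
    assert(runs > 0);
    std::mt19937 rng(seed);
    double sum = 0.0;
    for (std::size_t r = 0; r < runs; ++r) {
        sum += adaptive_strategy(n, m, rng).true_score;
    }
    return sum / static_cast<double>(runs);
}
```

A call such as average_adaptive_score(60, 5, 20000, 12345u) estimates the expected score for the settings used in this series. Ties are broken here in favor of the lowest index, a choice the pseudocode leaves open.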

This extremely simple strategy has been implemented in C++ (Boost is needed to compile the program) and exercised for the same settings as in previous cases, N = 60, M = 5. The figure shows the performance of the adaptive strategy compared with the best simple and two-stage strategies and the optimum case (which is not attainable as it requires knowledge of all the NM traits of the candidate pool):

Note that we have not proved that this adaptive strategy is optimum, i.e. that no other strategy has a better expected score. That would be the case if the following optimal substructure property held true: if S is an optimum strategy for n trait evaluations and a strategy S' is identical to S except that it omits the last trait evaluation, then S' is an optimum strategy for n − 1 trait evaluations.

Saturday, June 21, 2008
