In the class of Prisoner

We find that for small values of λ and T, the number of situations in which trial-and-error learning (TR) persists in the UNC 0631 is generally reduced. In the PD game, TR is no longer evolutionarily stable, but our qualitative results continue to hold for the other two games (see, e.g., Table A5). Lower values of λ and T generate evolutionary pressure for learning speed, which favours hypothetical reinforcement learning (HR) in the PD game because an HR individual starts defecting when paired with a TR individual.
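
The speed advantage of HR over TR in the PD game can be illustrated with a minimal sketch. It is not the model used for the reported results and rests on several assumptions of ours: the payoff values, the learning rate `alpha`, the exploration rate `epsilon`, the tie-breaking rule, and the simplification that HR updates the value of both actions from the payoffs it would have obtained, whereas TR updates only the action it actually played. All function and variable names are hypothetical.

```python
import random

# Hypothetical PD payoffs for the focal player (T > R > P > S); not from the source.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


def choose(q, epsilon, rng):
    """Epsilon-greedy choice; ties are broken in favour of cooperation."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return "D" if q["D"] > q["C"] else "C"


def simulate(rounds, epsilon=0.1, alpha=0.1, seed=0):
    """Pair one HR learner with one TR learner.

    Returns the first round (1-based) in which each defects, or None.
    """
    rng = random.Random(seed)
    q_hr = {"C": 0.0, "D": 0.0}
    q_tr = {"C": 0.0, "D": 0.0}
    first_hr = first_tr = None
    for t in range(1, rounds + 1):
        a_hr = choose(q_hr, 0.0, rng)      # assumption: HR plays greedily
        a_tr = choose(q_tr, epsilon, rng)  # TR must explore to discover defection
        if a_hr == "D" and first_hr is None:
            first_hr = t
        if a_tr == "D" and first_tr is None:
            first_tr = t
        # TR: update only the action actually played.
        q_tr[a_tr] += alpha * (PAYOFF[(a_tr, a_hr)] - q_tr[a_tr])
        # HR: update both actions using the payoff each would have yielded.
        for a in ACTIONS:
            q_hr[a] += alpha * (PAYOFF[(a, a_tr)] - q_hr[a])
    return first_hr, first_tr


if __name__ == "__main__":
    print(simulate(10, epsilon=0.0))
```

Under these assumptions, defection yields the higher hypothetical payoff against either action of the partner, so the HR learner switches to defection after a single round. The TR learner only discovers defection through exploration, and with `epsilon=0.0` it never does. When the number of interaction rounds is small, this difference in learning speed is what matters.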