First published 2 July 2012
Rewarding Reactivity to Evolve Robust Controllers without Multiple Trials or Noise
Joel Lehman, Sebastian Risi, David B. D'Ambrosio, Kenneth O. Stanley
Behaviors evolved in simulation are often not robust to variations of their original training environment. Thus often researchers must train explicitly to encourage such robustness. Traditional methods of training for robustness typically apply multiple non-deterministic evaluations with carefully modeled noisy distributions for sensors and effectors. In practice, such training is often computationally expensive and requires crafting accurate models. Taking inspiration from nature, where animals react appropriately to encountered stimuli, this paper introduces a measure called reactivity, i.e. the tendency to seek and react to changes in environmental input, that is applicable in single deterministic trials and can encourage robustness without exposure to noise. The measure is tested in four different maze navigation tasks, where training with reactivity proves more robust than training without noise, and equally or more robust than training with noise when testing with moderate noise levels. In this way, the results demonstrate the counterintuitive fact that sometimes training with no exposure to noise at all can evolve individuals significantly more robust to noise than by explicitly training with noise. The conclusion is that training for reactivity may often be a computationally more efficient means to encouraging robustness in evolved behaviors.