Improving Robot Behavior Optimization by Combining User Preferences

Abstract (Excerpt)

It has recently been demonstrated that collaboration between automated algorithms and human users can be especially effective in robot behavior optimization tasks. In particular, we previously introduced a Fitness-based Search with Preference-based Policy Learning (FS-PPL) approach, in which the algorithm models the user based on her preferences and then uses the model, along with the fitness function, to guide search. However, so far only interaction between a single human user and an evolutionary algorithm has been considered. If multiple users contribute preferences, the algorithm must determine whether to model them separately or jointly. In this paper, we describe an algorithm in which one evolutionary algorithm interacts with two users and automatically determines the best way to model them. We test the algorithm with automated substitutes for human users and show that it performs better for two users working together than for the same users working separately, thus demonstrating the potential for crowdsourcing robot behavior optimization.