Dan Erusalimchik and Gal A. Kaminka.
Towards Adaptive Multi-Robot Coordination Based on Resource Expenditure Velocity: Extended Version. Technical Report
MAVERICK 2008/02, Bar Ilan University, Computer Science Department, MAVERICK Group, available at http://www.cs.biu.ac.il/~galk/Publications/,
2008.
In multi-robot systems research, several researchers have reported consistent success in using heuristic measures to improve loose coordination in teams by minimizing coordination costs. While these heuristic methods have proven successful in several domains, they have never been formalized, nor have they been put in the context of existing work on adaptation and learning. As a result, the conditions for their use remain unknown. We posit that all of these heuristic methods are in fact instances of reinforcement learning in a one-stage MDP game, with the specific heuristic functions used as rewards. We show that a specific reward function, which we call the Effectiveness Index (EI), is an appropriate reward function for learning to select between coordination methods. EI estimates the resource-spending velocity of a coordination algorithm, and allows minimization of this velocity using familiar reinforcement learning algorithms (in our case, Q-learning in a one-stage MDP). The paper argues analytically and empirically for the use of EI by proving that, under certain conditions, maximizing this reward leads to greater utility in the task. We report on initial experiments demonstrating that EI overcomes limitations of previous work and outperforms it in different cases.
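The learning setup described in the abstract can be illustrated with a minimal sketch of stateless (one-stage) Q-learning over coordination methods. The abstract does not give the formula for EI, so the reward below is only a stand-in: the negative of resources spent divided by elapsed time, chosen so that maximizing reward minimizes spending velocity. The method names, the `toy_simulate` environment, and all parameter values are hypothetical and are not taken from the report.

```python
import random

def spending_velocity(resources_spent, duration):
    # Assumed stand-in for a resource-spending velocity: resources per unit time.
    # This is NOT the report's EI definition, which the abstract does not state.
    return resources_spent / duration

def select_method(q, epsilon, rng):
    # Epsilon-greedy choice over the coordination methods.
    methods = sorted(q)
    if rng.random() < epsilon:
        return rng.choice(methods)
    return max(methods, key=lambda m: q[m])

def update(q, method, reward, alpha):
    # One-stage Q-learning: there is no successor state, so the target is the reward.
    q[method] += alpha * (reward - q[method])

def learn(simulate, methods, episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {m: 0.0 for m in methods}
    for _ in range(episodes):
        method = select_method(q, epsilon, rng)
        spent, duration = simulate(method, rng)
        # Reward is the negative velocity, so a higher reward means slower spending.
        update(q, method, -spending_velocity(spent, duration), alpha)
    return q

def toy_simulate(method, rng):
    # Hypothetical environment: (resources spent, duration) for one coordination event.
    base = {"method_a": (2.0, 10.0), "method_b": (3.0, 5.0)}
    spent, duration = base[method]
    return spent * rng.uniform(0.8, 1.2), duration

if __name__ == "__main__":
    q_values = learn(toy_simulate, ["method_a", "method_b"])
    print(max(q_values, key=q_values.get), q_values)
```

In this toy setting the learner ends up preferring whichever method spends resources more slowly. Any real use would need the reward definition and the resource accounting from the report itself.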
@techreport{ei08dan, author = {Dan Erusalimchik and Gal A. Kaminka}, title = {Towards Adaptive Multi-Robot Coordination Based on Resource Expenditure Velocity: Extended Version}, year = {2008}, number = {MAVERICK 2008/02}, institution = {Bar Ilan University, Computer Science Department, {MAVERICK} Group, available at http://www.cs.biu.ac.il/$^\sim$galk/Publications/}, wwwnote = {}, abstract = { In the research area of multi-robot systems, several researchers have reported on consistent success in using heuristic measures to improve loose coordination in teams, by minimizing coordination costs using various heuristic techniques. While these heuristic methods have proven successful in several domains, they have never been formalized, nor have they been put in context of existing work on adaptation and learning. As a result, the conditions for their use remain unknown. We posit that in fact all of these different heuristic methods are instances of reinforcement learning in a one-stage MDP game, with the specific heuristic functions used as rewards. We show that a specific reward function---which we call \emph{Effectiveness Index} (EI)---is an appropriate reward function for learning to select between coordination methods. EI estimates the \emph{resource-spending velocity} by a coordination algorithm, and allows minimization of this velocity using familiar reinforcement learning algorithms (in our case, Q-learning in one-stage MDP). The paper analytically and empirically argues for the use of EI by proving that under certain conditions, maximizing this reward leads to greater utility in the task. We report on initial experiments that demonstrate that EI indeed overcomes limitations in previous work, and outperforms it in different cases. }, }