@COMMENT This file was generated by bib2html.pl version 0.94
@COMMENT written by Patrick Riley
@COMMENT This file came from Gal A. Kaminka's publication pages at
@COMMENT http://www.cs.biu.ac.il/~galk/publications/
@mastersthesis{douchan-msc,
  author = {Yinon Douchan},
  title = {Reinforcement Learning in Multi-Robot Swarms},
  school = {{T}el {A}viv {U}niversity},
  year = {2018},
  OPTkey = {},
  OPTtype = {},
  OPTaddress = {},
  OPTmonth = {},
  OPTnote = {Available at \url{http://www.cs.biu.ac.il/~galk/Publications/b2hd-douchan-msc.html}},
  OPTannote = {},
  wwwnote = {},
  abstract = {In multi-robot systems, robots cannot act without interacting, and the resulting conflicts must be resolved. One such conflict is spatial: robots cannot occupy the same spot at the same time and must avoid collisions. While there are many approaches to collision avoidance and resolution, every approach has its advantages and disadvantages. Recent promising work addresses multi-robot spatial coordination by adaptively selecting reactive coordination methods using an intrinsic reward function named the \emph{Effectiveness Index (EI)}, which is based on the resource spending rate of the robots. While it has many desirable characteristics in terms of multi-robot coordination and shows some empirical success, that success is limited and there are no theoretical guarantees of optimality or convergence, as we indeed show. The contributions of this work are in several areas. First, we list gaps between existing theory and practice that arise in the context of reactive arbitration of coordination methods. Then, we give theoretical modelling and practical solutions in order to bridge those gaps. The theoretical modelling starts by representing a task run as an extensive-form game. It then connects the system-wide performance of a task run to the choices each robot makes in each collision.
Finally, it addresses how robots should act in order to achieve optimal system-wide performance. The practical solutions further bridge gaps that arise from running multi-robot systems in real-world applications. The last part of this work puts the theoretical modelling and practical solutions to the test through experiments in different multi-robot domains, both with real robots and in simulation.
  },
}