Gal A. Kaminka: Publications


Heterogeneous Foraging Swarms Can be Better

Gal A. Kaminka and Yinon Douchan. Heterogeneous Foraging Swarms Can be Better. Frontiers in Robotics and AI, 11(1426282), 2025.

Download

[PDF] (12.1 MB)

Abstract

Inspired by natural phenomena, generations of researchers have been investigating how a swarm of robots can act coherently and purposefully when individual robots can only sense and communicate with nearby peers, with no means of global communications and coordination. In this paper, we show that swarms can perform better when they self-adapt to admit heterogeneous behavior roles. We model a foraging swarm task as an extensive-form fully-cooperative game, in which the swarm reward is an additive function of individual contributions (the sum of collected items). To maximize the swarm reward, previous work proposed using distributed reinforcement learning, where each robot adapts its own collision-avoidance decisions based on the Effectiveness Index reward (EI). EI uses information about the time between a robot's own collisions (information readily available even to simple physical robots). While promising, the use of EI is brittle (as we show), since robots that selfishly seek to optimize their own EI (minimizing time spent on collisions) can actually cause swarm-wide performance to degrade. To address this, we derive a reward function from a game-theoretic view of swarm foraging as a fully-cooperative, unknown-horizon repeating game. We demonstrate analytically that the total coordination overhead of the swarm (total time spent on collision avoidance, rather than foraging per se) is directly tied to the total utility of the swarm: less overhead, more items collected. Treating every collision as a stage in the repeating game, the overhead is bounded by the total EI of all robots. We then use a marginal-contribution (difference-reward) formulation to derive individual rewards from the total EI. The resulting Aligned Effectiveness Index (ÆI) reward has the property that each individual can estimate the impact of its decisions on the swarm: individual improvements translate to swarm improvements. We show that ÆI provably generalizes previous work, adding a component that computes the effect of counterfactual robot absence. Different assumptions on this counterfactual lead to bounds on ÆI from above and below. While the theoretical analysis clarifies both assumptions and gaps with respect to the reality of robots, experiments with real and simulated robots empirically demonstrate the efficacy of the approach in practice, and the importance of behavioral (decision-making) diversity in optimizing swarm goals.
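
To make the marginal-contribution (difference-reward) idea in the abstract concrete, below is a minimal Python sketch of the generic construction the ÆI reward builds on. This is a sketch under assumptions: the names (difference_reward, toy_G) and the "absent" counterfactual are illustrative, not the paper's implementation, which derives its reward specifically from the total EI.

# Minimal sketch of a difference (marginal-contribution) reward.
# All names here are illustrative, not taken from the paper's code.
from typing import Callable, Dict

def difference_reward(
    global_reward: Callable[[Dict[str, str]], float],
    joint_action: Dict[str, str],
    agent: str,
    counterfactual: str = "absent",
) -> float:
    """D_i = G(z) - G(z_{-i}): the swarm reward as observed, minus the
    reward under a counterfactual in which agent i is replaced (here,
    treated as absent). Per the abstract, different counterfactual
    choices bound the AEI reward from above and below."""
    z_minus_i = dict(joint_action)
    z_minus_i[agent] = counterfactual
    return global_reward(joint_action) - global_reward(z_minus_i)

# Toy usage: a swarm reward that only counts foraging actions.
def toy_G(z: Dict[str, str]) -> float:
    return sum(3.0 for action in z.values() if action == "forage")

print(difference_reward(toy_G, {"r1": "forage", "r2": "avoid"}, "r1"))
# 3.0 -- r1's marginal contribution to the swarm reward

Because each robot's difference reward moves in the same direction as the global reward, a robot improving its own signal cannot degrade the swarm's; this is the alignment property the abstract attributes to ÆI.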

BibTeX

@article{frontiers25,
  author = {Gal A. Kaminka and Yinon Douchan},
  title = {Heterogeneous Foraging Swarms Can be Better},
  year = {2025},
  journal = {Frontiers in Robotics and AI},
  volume = {11},
  OPTurl = {https://doi.org/10.3389/frobt.2024.1426282},
  doi = {10.3389/frobt.2024.1426282},
  number = {1426282},
  abstract = {Inspired by natural phenomena, generations of researchers have been investigating how
a swarm of robots can act coherently and purposefully, when individual robots can only sense and communicate with nearby peers,
with no means of global communications and coordination.  In this paper,
we will show that swarms can perform better, when they self-adapt to admit heterogeneous 
behavior roles.
We model a foraging swarm task as an extensive-form fully-cooperative game, in which the swarm reward 
is an additive function of individual contributions (the sum of collected items).  To maximize the swarm reward,
previous work proposed using distributed reinforcement learning, where each robot adapts its own collision-avoidance decisions based on the \emph{Effectiveness Index} reward (\emph{EI}). \emph{EI} uses information about the time between their own collisions (information readily available
even to simple physical robots). While promising, the use of \emph{EI} is brittle (as we show), since
robots that selfishly seek to optimize their own \emph{EI} (minimizing time spent on collisions) can actually cause swarm-wide performance to degrade. 
To address this, we derive a reward function from a game-theoretic view of swarm foraging as a fully-cooperative, unknown horizon repeating game. 
We demonstrate analytically that the total coordination overhead of the swarm (total time spent on collision-avoidance, rather than foraging per-se) 
is directly tied to the total utility of the swarm:  less overhead, more items collected.  Treating every collision as a stage in the
repeating game, the overhead is bounded by the total EI of all robots. We then use a marginal-contribution (difference-reward) formulation
to derive individual rewards from the total \emph{EI}. The resulting \emph{Aligned} Effectiveness Index ($\AEI$) reward has the property that each individual can
estimate the impact of its decisions on the swarm:  individual improvements translate to swarm improvements. 
We show that $\AEI$ provably generalizes previous work, adding a component that computes the effect of counterfactual robot absence. Different
assumptions on this counterfactual lead to bounds on $\AEI$ from above and below. 
While the theoretical analysis clarifies both assumptions and gaps with respect to the reality of
robots, experiments with real and simulated robots empirically 
demonstrate the efficacy of the approach in practice, and the importance of behavioral (decision-making) diversity
in optimizing swarm goals.
  },
}

Generated by bib2html.pl (written by Patrick Riley) on Mon Feb 03, 2025 16:33:37