A Puzzle: How do individuals make decisions in cooperative swarms?
The key characteristic of swarms is locality of perception: the individual robot can only sense within a limited range, and can only interact with a few others around it. In cooperative swarms, where all members of the swarm share a joint goal, this raises an interesting puzzle:
How can the single robot choose actions that help the entire swarm, if it cannot know how these actions affect the entire swarm?
We are addressing this puzzle in two lines of research: distributed multi-agent reinforcement learning, and vision-based swarming. See details below.
This is ongoing research: we are looking for students and post-docs.
Distributed multi-agent reinforcement learning
We use distributed, multi-agent reinforcement learning (MARL), focusing on algorithms and reward functions that are grounded in the individual robot’s perception. The reward functions we develop do not use any public signals or information that would not be available to a physical robot. In MARL terms, we are dealing with Independent Learners.
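To make the Independent Learners setting concrete, here is a minimal sketch of one robot's learner: a tabular, epsilon-greedy Q-learner whose observations, actions, and reward are all local to the robot. The class, parameter names, and tabular representation are illustrative assumptions, not taken from our code; `local_obs` is assumed to be hashable (e.g., a tuple of discretized sensor readings).

```python
import random
from collections import defaultdict

class IndependentLearner:
    """One robot's learner. Observations, actions, and reward are all local:
    no shared or global signals are used, matching the Independent Learners
    setting described above."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # Q-value per (local observation, action)

    def act(self, local_obs):
        # Epsilon-greedy choice over the robot's own Q-values.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(local_obs, a)])

    def update(self, local_obs, action, local_reward, next_obs):
        # Standard Q-learning update, driven only by locally available data.
        best_next = max(self.q[(next_obs, a)] for a in self.actions)
        td_error = local_reward + self.gamma * best_next - self.q[(local_obs, action)]
        self.q[(local_obs, action)] += self.alpha * td_error
```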
Rational swarms: Align individuals with their swarm goals
Our focus is on learning algorithms and reward functions that align the goals of the individual with the goals of the swarm. This allows us to examine swarms from the perspective of cooperative game theory.
The following paper, published in the Philosophical Transactions of the Royal Society A, best summarizes the approach and the results up to 2025:
Gal A. Kaminka. Swarms can be rational. Philosophical Transactions of the Royal Society A, 2025.
We have shown that each individual robot can measure its own time and resources spent on coordination, as opposed to working on the task itself. This coordination overhead results from the coordination algorithm in use, and is minimized by the RL process [1]. However, this can also cause robots to take actions that help themselves but hurt the swarm (think of the Prisoner's Dilemma game). We therefore investigate methods for aligning the individual reward with that of the collective [2].
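The sketch below is a schematic illustration, not the exact reward from the papers: it credits time the robot spent on the task and penalizes time it spent resolving coordination conflicts, both of which the robot can log for itself. The optional alignment term that mixes in neighbours' rewards is shown only to illustrate the selfishness problem; obtaining such a term from purely local sensing is precisely what the alignment work addresses.

```python
def local_reward(step_phases, alignment_weight=0.0, neighbour_rewards=()):
    """Illustrative reward computed from the robot's own bookkeeping.

    step_phases: labels ("task" or "coordination") the robot recorded for
    each time step of the last interval. Task time is credited; coordination
    overhead is penalized. The alignment term is a hypothetical illustration
    of mixing in neighbours' rewards, not the method from the papers."""
    task_time = sum(1 for p in step_phases if p == "task")
    coord_time = sum(1 for p in step_phases if p == "coordination")
    own = task_time - coord_time
    if neighbour_rewards and alignment_weight > 0.0:
        shared = sum(neighbour_rewards) / len(neighbour_rewards)
        return (1.0 - alignment_weight) * own + alignment_weight * shared
    return own
```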
Distributed MARL leads to Heterogeneous Swarms
Different robots learn to respond differently to collisions and inter-agent conflicts that require coordination: they become heterogeneous in their decision-making. We have observed this in many environments, with both physical and simulated robots (see the environments gallery page for videos and media). Please see the papers for details [1, 2]. You may also be interested in how the rational swarm model developed and evolved over the years.
Vision-Based Swarming
Each individual robot makes decisions (and computes rewards) based on what it perceives.
Much of the work in robot swarms has assumed that basic perceptual features are given:
- Distinguishing other robots (neighbours) from static objects
- Estimating the distance to neighbours and the neighbours' orientation
- Identifying a neighbour as a member of the same swarm (recognizing conspecifics)
When using vision (i.e., cameras) for sensing, providing these features is very challenging. Together with Amir Ayali's lab at Tel Aviv University, we are investigating how robots and animals (locusts) can overcome these challenges, and use vision to move together and swarm.
Read below for more details and pointers to papers. See the media gallery for movies.
Occlusions
A robot (or locust) in a dense swarm is often occluded from view. When it is completely occluded, it is not visible at all, but that is the easy case. The challenge is what to do when it is only partially visible. We are investigating several models for how the visible part can be identified and processed [3]; they differ in the amount of computation they require. Naturally, we are interested in cheap methods: robots and locusts do not have many neurons (or CPU cycles) to spare.
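As one concrete example of a cheap, purely geometric treatment of partial occlusion, the sketch below estimates how much of a neighbour's angular extent is left uncovered by closer objects. It is an assumption-laden illustration, not the specific model from the paper: objects are circles, angle wrap-around is ignored, and overlaps between occluders are not merged.

```python
import math

def visible_fraction(target, occluders):
    """Fraction of the target's angular extent (seen from the focal robot at
    the origin) left uncovered by closer objects. Each object is (x, y, radius).
    Overlaps between occluders and angle wrap-around are ignored for brevity."""
    def angular_interval(obj):
        x, y, r = obj
        d = max(math.hypot(x, y), 1e-9)
        bearing = math.atan2(y, x)
        half_width = math.asin(min(1.0, r / d))
        return bearing - half_width, bearing + half_width

    lo, hi = angular_interval(target)
    width = hi - lo
    if width <= 0.0:
        return 1.0  # degenerate target; treat as fully visible
    target_dist = math.hypot(target[0], target[1])
    covered = 0.0
    for occ in occluders:
        if math.hypot(occ[0], occ[1]) >= target_dist:
            continue  # only closer objects can occlude the target
        olo, ohi = angular_interval(occ)
        covered += max(0.0, min(hi, ohi) - max(lo, olo))
    return max(0.0, 1.0 - covered / width)
```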
Range from a visual sensor
One of the most obvious challenges is how the distance (range) to each neighbour can be computed. When a robot examines an image of a neighbour, it does not immediately know how far away that neighbour is without additional computation: there are multiple possible interpretations of what is being perceived [3, 4].
We have shown that by combining information from the angular width and angular height of a neighbour, the robot can achieve a high-accuracy estimate of the range [4]. In experiments with locusts, such a combination appears to be a very good predictor of the locust response to visual stimuli [5].
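The underlying geometry is the standard pinhole relation: a body of physical size s that subtends an angle θ lies at range of roughly s / (2·tan(θ/2)). The sketch below combines the width-based and height-based estimates by simple averaging; the assumed body dimensions and the averaging rule are illustrative, not the exact combination studied in the paper.

```python
import math

# Illustrative (assumed) physical dimensions of a neighbour, in metres.
BODY_WIDTH = 0.05
BODY_HEIGHT = 0.02

def range_from_angles(angular_width, angular_height):
    """Combine the range estimates implied by the subtended angular width and
    angular height (both in radians). Each angle alone gives
    range = size / (2 * tan(angle / 2)); here the two are simply averaged."""
    r_from_width = BODY_WIDTH / (2.0 * math.tan(angular_width / 2.0))
    r_from_height = BODY_HEIGHT / (2.0 * math.tan(angular_height / 2.0))
    return 0.5 * (r_from_width + r_from_height)
```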
Pause to observe, move to act
A well-known phenomenon in locusts is that each individual uses intermittent locomotion, also known as pause-and-go: each locust moves for a bit, stops for a bit, and then moves again. There are various hypotheses for why this occurs in nature, but in robots, we found a good answer.
Robots using intermittent motion are able to identify neighbours that are too slow, or even stuck in place [6, 4]. Without this ability, the faulty robots become anchors, causing the nominal (healthy) robots to slow down or move in place. Intermittent locomotion is thus a computationally very cheap method for determining which members of the swarm should be followed.
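A minimal sketch of how a pause can be exploited: while the focal robot is stationary, any change in a neighbour's bearing must come from that neighbour's own motion, so neighbours whose bearing barely changes over the pause can be flagged as slow or stuck and excluded from the set to follow. The threshold, data representation, and function name below are illustrative assumptions, not the method from the papers.

```python
def moving_neighbours(bearings_at_pause_start, bearings_at_pause_end,
                      min_shift=0.02):
    """While the focal robot pauses, any change in a neighbour's bearing
    (radians) must come from that neighbour's own motion. Neighbours whose
    bearing barely changes are likely slow or stuck and are filtered out.
    Angle wrap-around is ignored for brevity; the threshold is illustrative."""
    moving = []
    for nid, start_bearing in bearings_at_pause_start.items():
        end_bearing = bearings_at_pause_end.get(nid)
        if end_bearing is not None and abs(end_bearing - start_bearing) >= min_shift:
            moving.append(nid)
    return moving
```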
References

1. Gal A. Kaminka, Dan Erusalimchik, and Sarit Kraus. Adaptive Multi-Robot Coordination: A Game-Theoretic Perspective. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2010.
2. Gal A. Kaminka. Swarms can be rational. Philosophical Transactions of the Royal Society A, 2025.
3. David L. Krongauz, Amir Ayali, and Gal A. Kaminka. Vision-Based Collective Motion: A Locust-Inspired Reductionist Model. PLOS Computational Biology, 20(1):e1011796, 2024.
4. Peleg Shefi, Amir Ayali, and Gal A. Kaminka. Bugs with features: resilient collective motion inspired by nature. Under review, 2025.
5. Itay Bleichman, Peleg Shefi, Gal A. Kaminka, and Amir Ayali. The visual stimuli attributes instrumental for collective-motion-related decision-making in locusts. PNAS Nexus, 5(3):pgae537, Oxford University Press, 2024.
6. Peleg Shefi, Amir Ayali, and Gal A. Kaminka. Pausing Makes Perfect: Intermittent Pauses for Resilient Swarming. In International Symposium on Distributed Autonomous Robotic Systems (DARS), Springer, 2024.