Congestion-aware decentralized multi-robot navigation in continuous 2D
FlowSwarm implements Optimal Reciprocal Collision Avoidance (ORCA) from scratch, reproduces its documented deadlock failure modes, and adds a decentralized coordination toolkit that resolves them — then measures the improvement across a benchmark of stress scenarios. Each robot plans globally with A* and avoids collisions locally with ORCA, using only locally-sensed information (no central controller, no all-to-all communication).
Left: 20 robots swap to antipodal points — a roundabout emerges. Middle: two-way hallway turn-taking. Right: a funnelled bottleneck.
Reciprocal collision avoidance such as ORCA is provably safe only when a collision-free velocity exists for every agent simultaneously. In dense traffic that assumption breaks down: the swarm jams at choke points and deadlocks in symmetric standoffs — a well-known open problem in decentralized multi-robot navigation.
FlowSwarm pairs a from-scratch ORCA + A* stack with a toolkit of decentralized coordination primitives, each targeting a specific failure mode, and evaluates them on a reproducible benchmark backed by a subsystem validation suite. The development notes record the design rationale, including approaches that were evaluated and discarded — for example, a reactive cost-field rerouting scheme that measurably degraded throughput and was replaced by the flow-field formulation.
Baseline (plain ORCA + A*) vs FlowSwarm, mean over 3 seeds.
Reproduce with python experiments/benchmark.py.
| scenario | failure mode | success (base → flow) | collision-steps (base → flow) |
|---|---|---|---|
| antipodal | symmetric convergence | 0.00 → 1.00 | 3329 → 0 |
| corridor | narrow two-way passage | 0.92 → 1.00 | 188 → 25 |
| slalom | staggered S-curve gaps | 0.94 → 1.00 | 232 → 159 |
| bottleneck | one-way funnel | 0.97 → 1.00 | 517 → 199 |
| twin_gap | route choice | 0.98 → 1.00 | 88 → 88 |
| evacuation | single-exit clogging | 0.99 → 1.00 | 69 → 71 |
| dense_random | scalability, no structure | 1.00 → 1.00 | 3 → 3 |
| intersection | 4-way junction | 1.00 → 1.00 | 15 → 12 |
| crossing | perpendicular streams | 1.00 → 1.00 | 0 → 0 |
| warehouse | pillared bidirectional | 1.00 → 1.00 | 12 → 10 |
FlowSwarm reaches 100% success on every benchmark scenario (baseline ranges 0.00–1.00) and cuts collisions far below baseline — most dramatically on antipodal (3329 → 0) and at funnels (517 → 199). Two ideas close the gaps the baseline can't: an emergent roundabout resolves the symmetric deadlock (antipodal 0 → 1.00), and park-and- yield at goals breaks the MAPF goal-blocking deadlock that otherwise strands a few robots. Residual collisions are sub-centimetre grazes in dense packing, not failures (see the validation report).
Headlines:
- antipodal (20 robots → opposite sides): 0% → 100%, collisions 3329 → 0, via an emergent roundabout — the one case the baseline can never solve.
- corridor (two rooms, one 1-lane hallway): 92% → 100%, 188 → 25 collisions, via decentralized turn-taking.
- Goal-blocking deadlock fixed: reached robots park-and-yield instead of freezing → the last stragglers (bottleneck/evacuation/twin_gap) all complete.
- Every benchmark scenario reaches 100% success, with collisions far below baseline throughout.
World (continuous arena + disc obstacles)
└─ Robot (holonomic disc: pos, goal, path, vel)
│ per tick, per robot, fully decentralized:
▼
A* global plan → preferred velocity → COORDINATOR → ORCA → integrate
(the toolkit) (safe v)
The coordinator is the only seam between planning and control; with none, you get the plain-ORCA baseline. FlowSwarm's coordinator is a small toolkit:
| Mechanism | Targets | Idea (decentralized) |
|---|---|---|
| Flow-field lanes | opposing streams | robots deposit their velocity in a shared decaying grid (stigmergy); sensing oncoming flow ahead, they bias sideways → lanes self-organize |
| Roundabout recovery | symmetric gridlock (antipodal) | a stuck robot whose whole neighbourhood is gridlocked circulates tangentially around the cluster centroid → a self-clearing swirl |
| Passage reservation | narrow two-way channels (corridor) | detect a long tight channel from the map; read occupancy from the flow field along the channel; a fixed right-of-way axis + cold-start hold gives one-way-at-a-time turn-taking (with a starvation override for livelock-freedom) |
Everything uses only quantities a robot could physically sense or store locally — the shared grid is a stigmergic medium (ant-pheromone style), not a global oracle.
git clone https://github.com/Manas-arumalla/flowswarm && cd flowswarm
python -m pip install -e ".[dev]"
# run a scenario and save an animation
python -m flowswarm.cli run --scenario antipodal --planner flowswarm --save runs/antipodal.gif
python -m flowswarm.cli run --scenario corridor --planner baseline --save runs/corridor_base.gif
# list scenarios / planners
python -m flowswarm.cli list
# reproduce the benchmark (table + plots under assets/)
python experiments/benchmark.py
# run the tests
python -m pytestfrom flowswarm.runner import run_scenario
sim = run_scenario("corridor", planner="flowswarm", seed=0)
print(sim.metrics.summary())Eleven maps, each isolating a different failure mode of decentralized navigation:
antipodal (symmetric convergence) · crossing (perpendicular streams) ·
bottleneck (one-way funnel) · twin_gap (route choice) · corridor (narrow
two-way passage) · warehouse (pillared bidirectional) · intersection (4-way
junction) · evacuation (single-exit clogging / faster-is-slower) · slalom
(staggered S-curve gaps) · dense_random (scalability, no structure) ·
pedestrians (non-cooperative moving agents the robots must avoid).
Can a single learned policy replace the hand-engineered toolkit? A message-passing GNN (each robot a node, neighbours as edges, body-frame features, predictive forward-propagated congestion, output a velocity residual on top of ORCA — so it only learns coordination, never raw avoidance), warm-started by behaviour cloning and fine-tuned with PPO (clipped surrogate + actor-critic value head + GAE). Result — it rivals the toolkit from local sensing alone:
Evaluated on the same scenarios, seeds, and step budget as the main benchmark (so the baseline and toolkit columns match the table above):
| scenario | baseline | PPO-GNN (learned) | toolkit |
|---|---|---|---|
| antipodal | 0.00 / 3329 | 1.00 / 0 | 1.00 / 0 |
| corridor | 0.92 / 188 | 1.00 / 57 | 1.00 / 25 |
| warehouse | 1.00 / 12 | 1.00 / 6 | 1.00 / 10 |
| bottleneck | 0.97 / 517 | 1.00 / 175 | 1.00 / 199 |
| slalom | 0.94 / 232 | 0.92 / 129 | 1.00 / 159 |
| crossing | 1.00 / 0 | 1.00 / 0 | 1.00 / 0 |
Predictive PPO matches the toolkit's 100% success on five of the six scenarios — including the one-way bottleneck, where it is even more collision-efficient than the toolkit (175 vs 199) — and learns the antipodal roundabout and corridor turn-taking purely from local observations. It is also more collision-efficient than the toolkit on warehouse (6 vs 10) and slalom (129 vs 159). The one place it trails is the slalom success rate (0.92 vs 1.00), where the hand-engineered escape-replan still wins — a clear example of where explicit engineering retains an edge over the learned policy. The development notes record the negative results as well: plain REINFORCE (no value baseline, no predictive features) did not beat behaviour cloning, and three hand-tuned anti-clogging controllers were evaluated and discarded once profiling traced the apparent "clogging" to a goal-packing issue in the scenario itself (correcting it raised evacuation from 0.19 to 0.92).
It generalizes zero-shot. Trained on seven scenarios, the same policy — with no
retraining — reaches 100% success on four held-out scenarios it never saw
(antipodal, evacuation, dense_random, and pedestrians with moving non-cooperative
agents), matching the engineered toolkit, and holds 100% from 16 up to 48 robots
(2.4× its training size) — the per-node graph policy is size-invariant.
pip install -e ".[learn]" # optional torch extra (core stays numpy-only)
python experiments/train_gnn.py # BC train + evaluate the GNN
python experiments/train_learned.py # the pure-numpy linear baseline
python experiments/generalization.py # zero-shot transfer + size-scaling studyThe roadmap (MLP → GNN → MARL with anticipatory congestion prediction) is in
docs/ROADMAP.md.
Four research-grade extensions, each with its own measured result:
Control Barrier Function (CBF) safety filter (SimConfig(safety="cbf")). An
alternative to ORCA with a formal guarantee: enforcing ḣ ≥ −γh on the barrier
h = ‖Δp‖² − R² keeps the safe set forward-invariant. The CBF condition is a
half-plane in velocity space, so it reuses the same LP solver. With a discretization
margin it maintains a positive safety buffer where reactive ORCA grazes to zero
(crossing +0.078 m, antipodal +0.170 m) — at the cost of more conservatism in
dense funnels: a textbook safety-vs-liveness tradeoff.
Optimal task assignment + lifelong operation (flowswarm/planning/assignment.py,
flowswarm/sim/lifelong.py). A Hungarian-algorithm allocator assigns robots to
goals minimising total distance, and a continuous re-tasking manager hands out a new
goal on every arrival — the realistic warehouse problem, measured by sustained
throughput (experiments/lifelong.py). Turns the demo into a full
assign → plan → coordinate → execute system.
Decentralized sampling-MPC (--planner mpc). A receding-horizon planner: each
robot samples candidate velocities, rolls each out predicting neighbours, and picks
the best by goal-progress minus anticipated collisions (ORCA still the safety
filter). Matches/beats ORCA in open space (warehouse collisions 15 → 0); its known
limitation is that a sampling preferred-velocity generator is unsuited to tight
single-file funnels, where the coordination toolkit is used instead.
3D extension (python -m flowswarm.threed --scenario cube_swap). Quadrotor-style
agents in a volume, with the CBF condition generalised to ℝ³ and solved by iterative
projection. Four 3D scenarios (sphere antipodal, random cube, two perpendicular
streams, cube swap) are all solved collision-free (min-sep ≥ +0.07 m).
Real-physics showcase (MuJoCo). FlowSwarm also drives a rigid-body MuJoCo world: each robot is a cylinder with mass on planar joints, velocity-servoed to the command FlowSwarm computes, with robot–robot contacts resolved by real physics. ORCA supplies the intentions; MuJoCo supplies the consequences — and the emergent roundabout still resolves the antipodal swap under real dynamics (100% success).
pip install -e ".[sim]" # optional MuJoCo extra
python -m flowswarm.mujoco_sim --scenario antipodal --planner flowswarm --gif out.gifReal-robot kinematics. Runs on a differential-drive (unicycle) model — robots
drive and turn, they can't slide — with NH-ORCA radius inflation preserving safety
(SimConfig(kinematics="diffdrive")). Optional sensor-noise model for robustness
studies. Removes the "holonomic toy" assumption.
Left: differential-drive robots (note the heading ticks — they turn, not slide). Right: the swarm flows around non-cooperative "pedestrians" (red) that don't avoid it.
Dynamic, non-cooperative obstacles. Moving agents (e.g. pedestrians/intruders)
that follow their own path and do not avoid the swarm; robots take full ORCA
responsibility for them and keep clear (pedestrians scenario). A realistic stressor
— real robots share space with agents that won't cooperate.
Robust to imperfect sensing. Real robots don't have perfect neighbor state, so the
sensing model is degraded two ways — position noise and observation latency
(robots react to delayed neighbor state) — and the toolkit re-measured. Position noise
is tolerated well (success ≥0.94 even at 0.3 m, ≈75% of a robot radius), and the
reactive scenarios hold 100% through 1.2 s of delay. The timing-based passage
reservation is the exception: it desynchronizes at small latency (corridor dips to
0.42 at 0.2 s, recovering by 0.8 s), so latency — not noise — is the binding
constraint. Reproduce: python experiments/sensing_robustness.py.
Interactive real-time demo. python -m flowswarm.cli demo — watch the swarm live,
click to add robots / obstacles, switch planner & kinematics on the fly, toggle the
flow-field overlay and trails. (pip install -e ".[demo]")
Scales to hundreds of robots at interactive rates, with a scalability study:
Traffic-science fundamental diagram — sweeping density reproduces the classic speed-density collapse; FlowSwarm shifts the flow curve up in the congested regime:
Statistical rigor. Headline gains come with 95% confidence intervals over
multiple seeds (experiments/stats.py), so claims aren't single-seed anecdotes:
Config-driven experiments. Declare a sweep in YAML and run it:
python -m flowswarm.cli exp experiments/configs/quick.yaml.
Technical report. A figure-rich mini-paper: REPORT.md. Docs site:
pip install -e ".[docs]" && mkdocs serve.
- From-scratch ORCA — a faithful port of the RVO2 linear-program solver, validated in isolation before anything was built on top of it.
- 130 tests covering geometry, A* edge cases (no-path / start-in-obstacle / start==goal), ORCA numerical robustness (coincident agents, speed limits), determinism, no-collision invariants, parameter robustness, and a feature-combination sweep (scenario × planner × kinematics × safety filter).
- A systematic subsystem validation report that records the assumptions, failure modes, and numerical-stability checks behind each component, including bugs caught and fixed (channel false-positives, threshold coupling, reservation livelock).
- Deterministic and seeded throughout;
ruff-clean and wired into CI; reproducible benchmark, scaling, and statistical-rigor harnesses.
- No universal deadlock solver. Each mechanism targets a specific failure mode; a provably-general decentralized solver remains an open research problem.
- Reservation priority is a fixed right-of-way convention with a bounded-wait fairness guard, rather than a negotiated optimum.
- Sampling-MPC does not handle tight funnels as a standalone planner; the coordination toolkit is used there instead.
- Explicit-Euler integration at 0.1 s. Adequate for the disc/unicycle models used here; faster agents or finer geometry would call for a smaller step or a higher-order integrator.
- The learned policy rivals but does not yet fully surpass the hand-engineered toolkit — it matches 100% success on five of six scenarios but trails on the staggered slalom (0.92 vs 1.00); the roadmap (GNN → multi-agent RL with anticipatory congestion prediction) targets that gap.
- J. van den Berg, S. J. Guy, M. Lin, D. Manocha. Reciprocal n-body Collision Avoidance. ISRR 2011. [pdf]
- Congestion-metric replanning for dense multi-agent navigation. [arXiv:2202.11334]
- VR-ORCA: variable-responsibility ORCA · LivePoint / Merry-Go-Round: deadlock-free decentralized control (background on the open problem).
- D. Helbing, P. Molnár. Social force model for pedestrian dynamics (lane self-organization).
MIT — see LICENSE.













