Publication
ASPLOS 2024
Conference paper

MulBERRY: Enabling Bit-Error Robustness for Energy-Efficient Multi-Agent Autonomous Systems

Abstract

The adoption of autonomous swarms, consisting of a multitude of unmanned aerial vehicles (UAVs), operating in a collaborative manner, has become prevalent in mainstream application domains for both military and civilian purposes. These swarms are expected to collaboratively carry out navigation tasks and employ complex reinforcement learning (RL) models within stringent onboard size-weight-and-power (SWaP) constraints. While techniques such as reducing onboard operating voltage can improve the energy efficiency of both computation and flight missions, they can lead to on-chip bit failures that are detrimental to mission safety and performance. To this end, we propose MulBERRY, a multi-agent robust learning framework to enhance bit error robustness and energy efficiency for autonomous swarm systems. MulBERRY supports multi-agent robust learning, both offline and on-device, with adaptive and collaborative agent-server optimizations. For the first time, MulBERRY demonstrates the practicality of robust low-voltage operation on multi-UAV systems, leading to energy savings in both compute and mission quality-of-flight. We conduct extensive system-level multi-UAV autonomous navigation experiments with algorithm-level robust learning and hardware-level bit error, thermal and power characterizations, demonstrating that MulBERRY achieves robustness-performance-efficiency co-optimizations. We also show that MulBERRY generalizes well across fault patterns, environments, UAV types, and RL policies, with up to 18.97% reduction in flight energy, 22.07% increase in the number of successful missions, and 4.16× processing energy reduction.
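
To make the notion of low-voltage bit failures concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a simple fault model that the abstract does not specify: policy weights stored as unsigned 8-bit integers, with each bit flipping independently at a fixed bit error rate. The function name `inject_bit_errors` and its parameters are hypothetical.

```python
import random


def inject_bit_errors(weights, bit_error_rate, bits=8, seed=None):
    """Return a copy of `weights` with random bit flips applied.

    Assumed fault model (illustrative only): every bit of every
    unsigned `bits`-wide weight flips independently with probability
    `bit_error_rate`.
    """
    rng = random.Random(seed)
    corrupted = []
    for w in weights:
        mask = 0
        for b in range(bits):
            # Decide independently for each bit position whether it flips.
            if rng.random() < bit_error_rate:
                mask |= 1 << b
        # XOR with the mask flips exactly the selected bits.
        corrupted.append(w ^ mask)
    return corrupted


weights = [0, 17, 128, 255]
unchanged = inject_bit_errors(weights, 0.0, seed=0)   # rate 0: no bit flips
inverted = inject_bit_errors(weights, 1.0, seed=0)    # rate 1: every bit flips
noisy = inject_bit_errors(weights, 0.01, seed=0)      # occasional flips
```

Under this assumed model, a robust learning loop would apply such corruption to the weights during training so that the learned policy tolerates the errors it may encounter at reduced voltage.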