Enhancing Federated Averaging of Self-Supervised Monocular Depth Estimators for Autonomous Vehicles with Bayesian Optimization
Abstract
Recent research in computer vision for intelligent transportation systems has focused heavily on image-based depth estimation because of its cost-effectiveness and wide range of applications. Monocular depth estimation methods, in particular, have gained attention because they rely on a single camera, which makes them more versatile than binocular techniques that require two fixed cameras. While advanced approaches leverage self-supervised deep neural network learning with proxy tasks such as pose estimation and semantic segmentation, some overlook crucial requirements for deployment in real autonomous vehicles: data privacy, reduced network consumption, distributed computational cost, and resilience to connectivity issues. Recent studies highlight the effectiveness of federated learning combined with Bayesian optimization in addressing these requirements without compromising model efficacy. We therefore introduce BOFedSCDepth, a novel method that integrates Bayesian optimization, federated learning, and deep self-supervision to train monocular depth estimators with better efficacy and efficiency than the state-of-the-art method for self-supervised federated learning. Evaluation experiments on the KITTI and DDAD datasets show that our approach outperforms the baseline, achieving up to 40.1% test loss improvement in the initial rounds of training with up to 33.3% communication cost reduction, a linear computational cost overhead at the central server, and no overhead at the autonomous vehicles.