Context-Aware Fault Classification for Multi-Access Edge Computing: A Formal Methods Perspective
Abstract
Multi-Access Edge Computing (MEC) is increasingly being adopted as the de facto enabler of ultra-low-latency access to application services. By placing application services on MEC servers close to end users, MEC avoids the large network latencies frequently experienced when accessing cloud services. Workloads can then either be executed locally on the devices or offloaded to the MEC servers. MEC is envisioned as a fundamental enabler of several ultra-low-latency, safety-critical systems, such as data inferencing for autonomous vehicles. The MEC paradigm is, however, highly susceptible to faults such as MEC server downtime, communication link faults, and hardware faults, owing to the heterogeneity of hardware configurations and the diverse geographies of operation. For real-time and safety-critical workloads, averting the impact of such faults is essential. To address this challenge, we propose a fault classification policy for MEC that categorizes a fault as critical or non-critical. The policy leverages Probabilistic Model Checking, a Formal Methods technique, to provide probabilistic guarantees with respect to a specified failure context. We present experimental results on real-world data to show the effectiveness of our approach.