Publication
ISCA 2023
Workshop paper
Object Detection and Classification on a Heterogeneous Edge SoC
Abstract
With the recent expansion of Large Language Model (LLM) capabilities, there is new potential for improving the performance of object detection and classification tasks by taking advantage of the Vision Transformer (ViT) architecture. In this paper, we focus specifically on the problem of object detection and classification on the edge, via a heterogeneous System-on-Chip (SoC). Unique constraints arise in an edge setting, most notably the limited amount of available memory, a severe restriction given the considerable size of LLMs. Our exploration begins with a traditional Convolutional Neural Network (CNN) running on a small deep learning accelerator, and the issues we faced with this approach on a heterogeneous edge SoC. We then transition to a transformer-based architecture, using a ViT adapted for simultaneous object detection and classification running on a Natural Language Processing (NLP) accelerator. In particular, we focus on increasing sparsity in the model to combat the strict memory constraints of the chip and on introducing early-exit mechanisms to minimize end-to-end latency.
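The early-exit idea mentioned above can be illustrated with a minimal sketch: attach an intermediate classifier head after each layer of the network and stop inference as soon as one head's prediction clears a confidence threshold, so easy inputs use fewer layers and finish sooner. This is an illustrative sketch of the general technique, not the paper's implementation; the function names, toy layers, and the 0.9 threshold are all assumptions made for the example.

```python
# Illustrative early-exit inference loop (NOT the paper's implementation).
# Each "layer" refines a feature vector; each "head" maps features to class
# scores. Inference stops at the first layer whose head is confident enough.
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_infer(x, layers, heads, threshold=0.9):
    """Run layers in order; return (predicted_class, layers_used).

    Exits early when the top softmax probability from an intermediate
    head reaches `threshold`; otherwise falls through to the last head.
    """
    h = x
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        h = layer(h)
        probs = softmax(head(h))
        conf = max(probs)
        if conf >= threshold:          # confident enough: exit early
            return probs.index(conf), depth
    # No head was confident: use the final head's prediction anyway.
    return probs.index(conf), depth

# Toy usage: 4 identical "layers" that shift features, identity heads.
layers = [lambda h: [v + 1.0 for v in h]] * 4
heads = [lambda h: h] * 4
cls, used = early_exit_infer([0.0, 0.0, 5.0], layers, heads, threshold=0.9)
# The third feature dominates immediately, so inference exits after one layer.
```

In a deployed setting, the threshold trades accuracy for latency: a lower threshold exits earlier (saving memory traffic and compute on the accelerator) at the cost of occasionally committing to a wrong class that a deeper layer would have corrected.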