Towards Reliable AI for Source Code Understanding

Sahil Suneja; Yunhui Zheng; Yufan Zhuang; Jim A. Laredo; Alessandro Morari

doi:10.1145/3472883.3486995

SoCC 2021

Conference paper

01 Nov 2021

Abstract

Cloud maturity and popularity have resulted in open source software (OSS) proliferation. In turn, managing OSS code quality has become critical to ensuring sustainable Cloud growth. On this front, AI modeling has gained popularity in source code understanding tasks, promoted by the ready availability of large open codebases. However, we have observed certain peculiarities in these black-box models, motivating a call for their reliability to be verified before they offset traditional code analysis. In this work, we highlight and organize the reliability issues affecting AI-for-code into three stages of an AI pipeline: data collection, model training, and prediction analysis. We highlight the need for concerted efforts from the research community to ensure credibility, accountability, and traceability for AI-for-code. For each stage, we discuss the unique opportunities that the source code and software engineering setting affords for improving AI reliability.
