PLDI 2023

Towards Supporting Universal Static Analysis using WALA


In a modern cloud-native development environment, developers use a variety of programming languages to build their applications: for example, JavaScript for the frontend, Python for data science, and Go for cloud orchestration. In this setting, it is critical to analyze the application holistically across all of these languages and frameworks to identify potential problems, such as a) performance issues (latency, throughput, and response times), b) security concerns (resource leaks, vulnerabilities, and more), or c) stability and resiliency problems (for example, failing to handle faults and exceptional inputs correctly). Many of these tasks require static analysis to discover and address potential problems preemptively.
However, modern static analysis tools offer limited support for languages beyond the most commonly used ones, e.g., Java, C, and C++. Writing a static analysis framework for a new language is a complex and time-consuming process. One solution is to convert programs written in different languages to a common representation and perform the static analysis on top of that. For instance, the T. J. Watson Libraries for Analysis (WALA) framework lets one write analysis algorithms over an SSA (Static Single Assignment) based intermediate representation (IR). As a result, an analysis algorithm can be written once and applied to programs in different programming languages. However, in its current state, WALA can convert programs written in Java, JavaScript, and Python to their corresponding IRs.
In this tutorial, we will discuss how one can extend WALA to support a new programming language. To do so, we will work with a small language named Racket, which is widely used in academia. We chose Racket for its simplicity and for its range of programming-language features, e.g., higher-order and first-class functions and objects, which resemble those of many modern languages.
We will start with a basic understanding of the Racket language. Then, we will briefly discuss the SSA-based IR. Finally, we will go through the steps one needs to follow to generate the IR, and we will use that IR to build different static analysis algorithms.
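To make the idea of an SSA-based IR concrete, the following is a minimal sketch of SSA renaming for straight-line code. It is not WALA's API or its IR format: the tuple-based instruction layout and the function name `to_ssa` are assumptions made purely for illustration, and the sketch ignores control flow, so it never needs phi functions.

```python
from collections import defaultdict


def to_ssa(instructions):
    """Rename variables so that each name is assigned exactly once.

    Hypothetical instruction format: (target, op, operands), where string
    operands are variable names and anything else is a constant.
    """
    version = defaultdict(int)  # latest version number seen per variable
    result = []
    for target, op, operands in instructions:
        # Each use refers to the most recent definition of that variable.
        # A variable used before any definition gets version 0 (an input).
        renamed = tuple(
            f"{x}{version[x]}" if isinstance(x, str) else x
            for x in operands
        )
        # Each definition creates a fresh version of the target.
        version[target] += 1
        result.append((f"{target}{version[target]}", op, renamed))
    return result


program = [
    ("x", "const", (1,)),     # x = 1
    ("x", "add", ("x", 2)),   # x = x + 2
    ("y", "mul", ("x", "x")), # y = x * x
]
ssa = to_ssa(program)
# ssa == [("x1", "const", (1,)),
#         ("x2", "add", ("x1", 2)),
#         ("y1", "mul", ("x2", "x2"))]
```

Because every name in the result has a single definition, an analysis written against this form does not need to know which source language produced the instructions, which is the property the common-IR approach relies on.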