Functional analysis of code


Enterprises often undertake modernization to improve the user experience, ease maintenance for developers, and make effective use of the cloud. But modernizing legacy applications tends to be challenging because of a lack of appropriate skills, inadequate documentation, and technical debt accumulated over years of feature development and maintenance. We at IBM Research Labs are therefore building technologies that extract an application's implementation structure as entities and relations through static and dynamic analysis, and represent them in a knowledge graph with a proprietary schema. The knowledge graph can then be readily used by AI-based tools and custom applications for many downstream tasks.
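To make the idea of extracting entities and relations concrete, here is a minimal sketch of static extraction. It is not the analysis described above: it uses Python's standard `ast` module on a small Python snippet as a stand-in for legacy source code, and the sample functions, the `calls` relation name, and `extract_call_triples` are all hypothetical. The output is a list of (entity, relation, entity) triples of the kind a knowledge graph could store.

```python
import ast

# Hypothetical sample program; a stand-in for legacy source code.
SAMPLE_SOURCE = '''
def load_rates():
    return {}

def compute_tax(amount):
    rates = load_rates()
    return amount * rates.get("vat", 0)

def bill(amount):
    return amount + compute_tax(amount)
'''


def extract_call_triples(source):
    """Return sorted (caller, "calls", callee) triples between top-level functions."""
    tree = ast.parse(source)
    # Entities: the functions defined at module level.
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    triples = set()
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Relations: direct calls by name to another defined function.
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in defined:
                    triples.add((func.name, "calls", node.func.id))
    return sorted(triples)


if __name__ == "__main__":
    for triple in extract_call_triples(SAMPLE_SOURCE):
        print(triple)
```

Method calls such as `rates.get(...)` are deliberately ignored here; a real analysis would also resolve those, along with data accesses and dynamic behaviour.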

Incremental Code Analysis

For modernization, or even maintenance, of legacy applications, discovery is an important first step. Discovery means understanding the application and the relations among its components, such as programs, database tables, files, jobs, and transactions. Missing or obsolete documentation is a common problem throughout the software life cycle, but it is even more acute for very old mainframe applications: most of the people who designed and developed them have retired, moved on, or are otherwise unavailable, and few people in the market have skills in these programming languages and technologies. Automating the discovery phase and providing tool support for automatically analyzing the source code is therefore very important. To that end, we are building scalable source code analysis algorithms for program flow and slicing, along with an interactive user interface that gives a design-studio-like experience. The system highlights the dependencies between a newly identified boundary (slice) and the other logical groups of the system. This exercise is repeated interactively to 1) identify the application entities and groupings of interest for the modernization task, and 2) understand how a change in one part of the system may affect the other parts.
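The two questions above (which dependencies cross a chosen boundary, and what a change may affect) can be sketched on a plain dependency graph. This is an illustrative sketch under stated assumptions, not the system's algorithm: the entity names, the edge list, and the functions `boundary_dependencies` and `impacted_by` are hypothetical, and an edge `(source, target)` is assumed to mean "source depends on target".

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (source, target) means "source depends on target".
DEPENDENCIES = [
    ("JOB_NIGHTLY", "PGM_BILLING"),
    ("PGM_BILLING", "TBL_INVOICE"),
    ("PGM_BILLING", "PGM_TAX"),
    ("PGM_REPORT", "TBL_INVOICE"),
    ("PGM_TAX", "FILE_RATES"),
]


def boundary_dependencies(edges, group):
    """Return (outgoing, incoming) edges that cross the boundary of `group`."""
    group = set(group)
    outgoing = sorted((s, t) for s, t in edges if s in group and t not in group)
    incoming = sorted((s, t) for s, t in edges if s not in group and t in group)
    return outgoing, incoming


def impacted_by(edges, changed):
    """Return every entity that directly or transitively depends on `changed`."""
    dependents = defaultdict(set)
    for s, t in edges:
        dependents[t].add(s)
    seen, queue = set(), deque([changed])
    # Breadth-first walk along reversed dependency edges.
    while queue:
        node = queue.popleft()
        for dependent in dependents[node]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    out_edges, in_edges = boundary_dependencies(DEPENDENCIES, {"PGM_BILLING", "PGM_TAX"})
    print("outgoing:", out_edges)
    print("incoming:", in_edges)
    print("impacted by FILE_RATES:", sorted(impacted_by(DEPENDENCIES, "FILE_RATES")))
```

Repeating `boundary_dependencies` after each adjustment of the group mirrors the interactive loop: the fewer edges that cross the boundary, the more self-contained the candidate grouping.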

Microservices Recommendations

Increasingly, enterprises want to refactor their applications to adopt a microservices architecture as part of their journey to the cloud. This requires mapping business functions onto the code structure so that microservice partitions align with business functions. But identifying functional boundaries in existing code is a tedious and time-consuming task. We view this traditional software decomposition problem as a candidate for an AI-based solution. Our team of program analysis and artificial intelligence experts tackles this problem in a four-step process: 1) perform static and dynamic code analysis to understand the implementation structure of the application, 2) represent the application entities and their relationships in a property knowledge graph with a proprietary schema, 3) identify microservices by modelling functional boundary detection as a clustering task, and 4) refactor the code to enable communication and distributed transactions across microservices, and introduce supporting artefacts to ease deployment.

Our group of AI4Code experts views the software decomposition task as a graph clustering task. We represent the application's implementation details as a heterogeneous graph with programs, tables, files, etc. as nodes and their relationships and interactions as edges or as node/edge attributes, and translate these into features for AI-based models for better representation learning and clustering.
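The graph-clustering view can be illustrated with a deliberately simple stand-in. The sketch below does not use learned representations or the proprietary schema: it assumes a toy heterogeneous graph, describes each program by the set of tables and files it touches, and groups programs whose Jaccard similarity meets a threshold by taking connected components. All node names, relation names, the threshold value, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical heterogeneous graph: typed nodes and typed edges.
NODES = {
    "PGM_ORDER": "program", "PGM_CART": "program",
    "PGM_INVOICE": "program", "PGM_PAYMENT": "program",
    "TBL_ORDERS": "table", "TBL_ITEMS": "table",
    "TBL_LEDGER": "table", "FILE_BANK": "file",
}
EDGES = [
    ("PGM_ORDER", "reads", "TBL_ORDERS"),
    ("PGM_ORDER", "writes", "TBL_ITEMS"),
    ("PGM_CART", "writes", "TBL_ORDERS"),
    ("PGM_CART", "reads", "TBL_ITEMS"),
    ("PGM_INVOICE", "writes", "TBL_LEDGER"),
    ("PGM_PAYMENT", "reads", "TBL_LEDGER"),
    ("PGM_PAYMENT", "writes", "FILE_BANK"),
]


def program_features(nodes, edges):
    """Map each program to the set of tables and files it reads or writes."""
    feats = {n: set() for n, kind in nodes.items() if kind == "program"}
    for src, _relation, dst in edges:
        if src in feats and nodes.get(dst) in {"table", "file"}:
            feats[src].add(dst)
    return feats


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_programs(nodes, edges, threshold=0.5):
    """Group programs into connected components of the similarity graph."""
    feats = program_features(nodes, edges)
    parent = {p: p for p in feats}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(feats), 2):
        if jaccard(feats[a], feats[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in feats:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    for cluster in cluster_programs(NODES, EDGES):
        print(cluster)
```

Shared data access is only one possible signal; call relationships, transaction boundaries, or learned node embeddings could replace the feature sets without changing the overall shape of the approach.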

Overview of Microservices Generation Approach