Bringing AI to the command line
For decades, developers and researchers have used the command line interface (CLI) to build, execute, and deploy the software that runs the world around us. Users have come to love, hate, and eventually embrace the idiosyncratic and sometimes antiquated challenges of the terminal shell, and have adapted their behavior and usage patterns around them. These challenges include the steep learning curve of CLIs, the need to know and remember complex commands and their usage in specific instances, and a lack of troubleshooting help when users run into problems.
However, with the arrival of cloud-based ecosystems and cloud-native applications, as well as the scalable real-world deployment of artificial intelligence (AI) systems based on machine learning (ML) and natural language processing (NLP), we are at an inflection point akin to the initial emergence of large-scale networked terminals. This is an opportune moment to transform the CLI user experience, and imbue the command line with the power of AI.
Video: http://ibm.biz/clai-video
Website: https://github.com/IBM/clai
This blog post describes IBM’s open-source Project CLAI (Command Line AI), a research and development platform that seeks to bring the power of AI to the command line. Using CLAI, users of the Bash terminal shell can access a wide range of AI-backed skills that seek to enhance their command line experience. Project CLAI is intended to rekindle the spirit of AI softbots by providing a plug-and-play framework and simple interface abstractions to Bash and its underlying operating system. Developers can access the command line through a simple sense-act API for rapid prototyping of newer and more complex AI capabilities.
CLAI is aimed at two types of users:
- End users of the command line interface (e.g. Bash): these are people who use the command line to develop, execute, or monitor code while working in a Unix-based operating system; and
- AI researchers and developers who want to develop skills or plugins that can improve the life of the end user on the command line.
There may be some overlap between the two types of users: many developers may also consume other skills as end users. To provide a uniform experience across the different skills offered on CLAI (many of which will be contributed by the community), we standardize the interaction paradigms for the two user types. For end users, these consist of generic task types that they seek to undertake on the command line, such as automation, proactive support and troubleshooting, natural language support, and pedagogy (learning). For researchers and AI developers, we standardize the available API support into listen & learn, react, simulate, and orchestrate features. All skills that are developed as part of the CLAI platform must make use of these API features, and build on the interaction paradigms listed above. Below, we delve into some specific use-cases as instantiations of the interaction paradigms, and the skills provided to support them.
Skills form the backbone of the CLAI platform — a skill is an adaptation of one or more AI techniques to solve a specific use-case for support on the command line interface. Skills provide a way for state-of-the-art AI techniques and algorithms to be harnessed immediately into the CLAI platform. The default installation of CLAI comes with some built-in skills, which we describe below in order to familiarize the reader with the notion of skills. These are illustrations of experimental skills that help AI researchers get started with the CLAI API.
Use-case Type: Natural Language Support
API Features: Listen & Learn; React
This skill lets the user specify tasks using a natural language (English) description, and retrieve the corresponding command line syntax. The intended use-case of this skill is to save users the time required to look up complex command syntaxes and their attendant options and flags. We chose two use-cases as initial illustrations of this skill: (1) compressing and uncompressing archives using the “tar” command; and (2) looking for strings in files using the “grep” command. These commands are among the most commonly used Bash utilities.
In this skill’s implementation, the natural language request typed in by the user is passed through a natural language classifier on watsonx Assistant. If there is a significant match with the known syntax patterns of tar and grep, the request is translated into the corresponding command line syntax.
The main value of this skill is illustrative; it can be made as accurate as needed for these specific use cases. However, this approach does not scale to Bash commands in general. We have thus designed a challenge around this problem, which we describe later in this post.
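To make the idea concrete, here is a minimal sketch of mapping English requests to tar and grep invocations. It is not the skill's implementation: it swaps the hosted natural language classifier for a handful of hand-written regular expressions, and the phrasings it accepts, the `PATTERNS` table, and the `translate` function are all hypothetical names chosen for illustration.

```python
import re
import shlex
from typing import Optional

# Hypothetical request patterns (an assumption for this sketch); the real skill
# relies on a trained classifier rather than hand-written regular expressions.
PATTERNS = [
    (re.compile(r"^(?:compress|archive)\s+(?P<src>\S+)\s+(?:into|to|as)\s+(?P<dst>\S+)$", re.I),
     lambda m: f"tar -czf {shlex.quote(m['dst'])} {shlex.quote(m['src'])}"),
    (re.compile(r"^(?:extract|uncompress|unpack)\s+(?P<src>\S+)$", re.I),
     lambda m: f"tar -xzf {shlex.quote(m['src'])}"),
    (re.compile(r"^(?:find|search for)\s+(?P<text>.+?)\s+in\s+(?P<path>\S+)$", re.I),
     lambda m: f"grep -rn {shlex.quote(m['text'])} {shlex.quote(m['path'])}"),
]


def translate(request: str) -> Optional[str]:
    """Return a shell command for a recognized request, or None if nothing matches."""
    for pattern, build in PATTERNS:
        match = pattern.match(request.strip())
        if match:
            # shlex.quote keeps file names and search strings safe to paste into a shell.
            return build(match)
    return None
```

For example, `translate("compress logs into logs.tar.gz")` yields `tar -czf logs.tar.gz logs`, while an unrecognized request such as `translate("make coffee")` returns `None`. The brittleness of such hand-written patterns is exactly why this approach does not scale to Bash commands in general.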
Use-case Type: Automation
API Features: Listen & Learn; React; Simulate; Orchestrate
This skill provides an example of automation and support for the deployment pipeline of applications to the IBM Cloud. The automation is done by integrating the output of an automated planner into a CLAI skill. The domain knowledge required for this automation is currently specified manually, but the CLAI framework can accommodate observation of user command traces on the CLI to allow learning of this knowledge over time.
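As a rough illustration of how manually specified domain knowledge can drive automation, the sketch below runs a breadth-first search over a tiny, hand-written set of deployment actions. Everything in it is an assumption made for illustration: the action names, their preconditions and effects, and the `plan` function are hypothetical and do not reflect the planner or the domain model used by the actual skill.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[str]  # facts that must hold before the action runs
    effects: FrozenSet[str]        # facts that hold after the action runs


# Hypothetical, manually specified domain knowledge for a deployment pipeline.
ACTIONS = [
    Action("build_image", frozenset({"dockerfile"}), frozenset({"image"})),
    Action("login", frozenset(), frozenset({"logged_in"})),
    Action("push_image", frozenset({"image", "logged_in"}), frozenset({"image_pushed"})),
    Action("deploy", frozenset({"image_pushed"}), frozenset({"deployed"})),
]


def plan(initial: Iterable[str], goal: Iterable[str]) -> Optional[List[str]]:
    """Breadth-first search for a shortest action sequence that reaches the goal."""
    start, target = frozenset(initial), frozenset(goal)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, steps = queue.popleft()
        if target <= state:
            return steps
        for action in ACTIONS:
            if action.preconditions <= state:
                next_state = state | action.effects
                if next_state not in visited:
                    visited.add(next_state)
                    queue.append((next_state, steps + [action.name]))
    return None  # the goal is unreachable from the initial facts
```

Calling `plan({"dockerfile"}, {"deployed"})` returns the step sequence `build_image`, `login`, `push_image`, `deploy`, which a skill could then suggest or execute one command at a time. Learning the preconditions and effects from observed command traces, rather than writing them by hand, is the extension the paragraph above alludes to.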
Use-case Type: Proactive Troubleshooting
API Features: Listen & Learn; React
Another skill that we present is the “Fixit” skill, which fixes the last command typed by the user on the command line as per the rules of an external plugin. The fixit skill (along with the man page explorer skill, which uses tldr internally) serves as a simple illustration of how to fold existing plugins and tools into the CLAI framework, showing the ease of extensibility. The skill responds whenever a command typed by the user results in an error. When an error is detected, the command text and the error message are passed to the external plugin, which returns a corrected command. This corrected command is then suggested to the user.
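The error-to-suggestion flow can be sketched in a few lines. This is a stand-in, not the external plugin: it handles only the "command not found" case, uses fuzzy matching from Python's standard `difflib` module, and the `suggest_fix` function and its list of known commands are hypothetical.

```python
import difflib
from typing import Optional, Sequence


def suggest_fix(command: str, error: str, known_commands: Sequence[str]) -> Optional[str]:
    """Suggest a corrected command when the shell reports an unknown program name."""
    # Assumption: only the "command not found" error is handled in this sketch.
    if "command not found" not in error:
        return None
    parts = command.split(maxsplit=1)
    if not parts:
        return None
    # Pick the known command most similar to the mistyped one, if any is close enough.
    matches = difflib.get_close_matches(parts[0], known_commands, n=1, cutoff=0.6)
    if not matches:
        return None
    # Keep the rest of the command line untouched.
    return matches[0] if len(parts) == 1 else f"{matches[0]} {parts[1]}"
```

For instance, `suggest_fix("gti status", "bash: gti: command not found", ["git", "grep", "tar"])` returns `git status`. A real plugin would cover many more error classes, but the shape is the same: command text and error message in, suggested command out.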
Use-case Type: Natural Language Support
API Features: Listen & Learn; React
This skill allows the user to ask Bash for relevant commands by describing the task in natural language. The skill utilizes the knowledge available in “man” (short for “manual”) pages on Linux and macOS platforms: man pages are readable system manual pages that describe a specific command and all of its potential usages, along with the relevant syntax. In this skill, the command whose man page has the highest text match with the user’s natural language query is suggested as a relevant command. In contrast to the nlc2cmd skill, the scope of this skill is much broader: it covers all commands that have a man page, which is a significant percentage of all possible Bash commands. However, this skill is shallower in depth, and does not map specific user intents to specific flags and syntax features. Along with the suggested command, we add a summary of the corresponding man page using an external call to the tldr plugin.
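The text-match step can be illustrated with a small bag-of-words ranking. This is only a sketch under stated assumptions: the source does not say which similarity measure the skill uses, so cosine similarity over word counts is a swapped-in choice, and the `MAN_PAGES` summaries below are made-up stand-ins for real man page text.

```python
import math
import re
from collections import Counter
from typing import Dict, Optional

# Hypothetical one-line stand-ins for man page text (not the real man pages).
MAN_PAGES: Dict[str, str] = {
    "tar": "create extract and list archive files compress",
    "grep": "search files for lines matching a text pattern",
    "chmod": "change file mode permission bits",
}


def tokenize(text: str) -> Counter:
    """Lowercase the text and count alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_command(query: str, pages: Dict[str, str]) -> Optional[str]:
    """Return the command whose page text best matches the query, or None if nothing overlaps."""
    query_vector = tokenize(query)
    scored = [(cosine(query_vector, tokenize(text)), name) for name, text in pages.items()]
    score, name = max(scored, default=(0.0, None))
    return name if score > 0 else None
```

With these stand-in summaries, `best_command("search for a pattern in files", MAN_PAGES)` returns `grep`. Note that the result is only a command name: as described above, this kind of matching says nothing about which flags the user needs.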
In order to scale the skills available on CLAI, as well as to truly realize the potential of the terminal shell as a platform and environment for the application of AI techniques, we provide a Python 3 API for developing skills for CLAI. This is done by making the Bash environment available to an AI developer as a generic environment API (think OpenAI Gym, but for Bash). This frees developers from interfacing with the terminal directly, and instead focuses their attention and energy on the construction of AI-based skills. The interface to the terminal environment allows for the execution of actions and the sensing of the results of those actions, in a manner similar to the classic AI agent architecture [Russell & Norvig 1995, Sutton 1992] that AI researchers are already familiar with.
The API, as well as the details on instantiating a new skill, can be found here.
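To give a feel for the sense-act pattern, the following is a minimal, self-contained sketch of a skill that observes a command and its result and responds with a suggestion. The `State`, `Action`, `Skill`, and `step` names are hypothetical and written for this illustration; they are not the classes or signatures of the CLAI API, which should be taken from the project documentation.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class State:
    """What the skill senses: the command and, once it has run, its outcome."""
    command: str
    return_code: Optional[int] = None
    stderr: str = ""


@dataclass
class Action:
    """How the skill acts: an optional replacement command plus a message for the user."""
    suggested_command: Optional[str] = None
    message: str = ""


class Skill:
    """Hypothetical base class: one hook before execution, one after."""

    def before_execute(self, state: State) -> Action:
        return Action()

    def after_execute(self, state: State) -> Action:
        return Action()


class SudoHint(Skill):
    """Example skill: suggest sudo when a command fails with a permission error."""

    def after_execute(self, state: State) -> Action:
        if state.return_code != 0 and "Permission denied" in state.stderr:
            return Action(f"sudo {state.command}", "Retry with elevated privileges?")
        return Action()


def step(skill: Skill, command: str, execute: Callable[[str], Tuple[int, str]]) -> Action:
    """One sense-act cycle: let the skill rewrite the command, run it, then let it react."""
    pre = skill.before_execute(State(command))
    to_run = pre.suggested_command or command
    return_code, stderr = execute(to_run)
    return skill.after_execute(State(to_run, return_code, stderr))
```

The `execute` argument is a plain callable returning a return code and the error output, so the cycle can be exercised with a fake executor in tests before it is wired to a real shell. Keeping the terminal behind such a callable is what lets skill authors concentrate on the decision logic.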
There are a number of ways to contribute and get involved with Project CLAI. First and foremost, we encourage you to take CLAI for a spin – install it, fork the code, and try it out! You can also contribute by adding new features, improving documentation, fixing bugs, or writing tutorials. Furthermore, if you are building a new skill, we strongly encourage you to add it to the CLAI skill catalog so that others may also benefit from it as well as contribute to it and improve it.
In the spirit of the distributed and fast-paced nature of AI projects in the present day, we are also developing a set of CLAI Challenges that will set forth important problems on the terminal shell that can benefit from attention by the AI community at large. Below, we detail the first such challenge:
It is very hard for users to keep track of the arcane flags needed on the terminal shell in the course of everyday tasks (like tarring and untarring files, or using grep to search for occurrences of text). The ability to turn natural language instructions into Bash commands has been a dream of the research community for a while. After all, there is a lot of data already available in public forums and in documentation that can be readily leveraged. Combined with recent advances in natural language processing (NLP), this problem has received renewed interest, for example in NL2Bash and Betty. Most recent attempts are either heavily rule-based, or do not scale beyond those examples that can be mined reliably from forums. As such, this remains an open challenge today. As part of Project CLAI, we intend to curate and release an open dataset around this challenge and host a leaderboard of competing solutions. Contribute here.
While this challenge is mainly geared towards the natural language processing community, future challenges, such as learning recipes for automation by observing the user on the command line, will broaden the scope of CLAI challenges to other AI disciplines such as reinforcement learning and planning.
CLAI is a project very much in progress. We have opened up various avenues for contribution — some of which were described in this post — that each target a different audience and user set. In December 2019, we demonstrated the platform to the broader machine learning community at NeurIPS 2019. We are interested in building a thriving, large-scale community around this project — all in service of ushering the humble terminal shell into the cloud and AI age.
You can come and play with CLAI (and talk to us!) at IBM booth #103 at the AAAI conference, February 7-12, 2020 at the Hilton Midtown hotel in New York, NY.