Introducing AI Explainability 360

We are pleased to announce AI Explainability 360, a comprehensive open source toolkit of state-of-the-art algorithms that support the interpretability and explainability of machine learning models. We invite you to use it and contribute to it to help advance the theory and practice of responsible and trustworthy AI.

Machine learning models are demonstrating impressive accuracy on various tasks and have gained widespread adoption. However, many of these models are not easily understood by the people who interact with them. This understanding, referred to as “explainability” or “interpretability,” allows users to gain insight into the machine’s decision-making process. Understanding how things work is essential to how we navigate the world around us, and it is equally important for fostering trust and confidence in AI systems.

Making AI more trusted, by making it explainable

AI explainability is also increasingly important among business leaders and policymakers. In fact, 68 percent of business leaders believe that customers will demand more explainability from AI in the next three years, according to an IBM Institute for Business Value survey.

No single approach to explaining algorithms

To provide explanations in our daily lives, we rely on a rich and expressive vocabulary: we use examples and counterexamples, create rules and prototypes, and highlight important characteristics that are present and absent.

When interacting with algorithmic decisions, users will expect and demand the same level of expressiveness from AI. A doctor diagnosing a patient may benefit from seeing cases that are very similar or very different. An applicant whose loan was denied will want to understand the main reasons for the rejection and what she can do to reverse the decision. A regulator, on the other hand, will not probe into only one data point and decision; she will want to understand the behavior of the system as a whole to ensure that it complies with regulations. A developer may want to understand where the model is more or less confident as a means of improving its performance.

As a result, when it comes to explaining decisions made by algorithms, there is no single approach that works best. There are many ways to explain. The appropriate choice depends on the persona of the consumer and the requirements of the machine learning pipeline.
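To make the case-based style of explanation concrete, here is a minimal sketch of the doctor scenario: given a new patient, retrieve the most similar and the most different past cases. The patient records, the feature choice, and the `rank_cases` helper are all made up for illustration, and plain Euclidean distance on unscaled features is an assumption; this is not one of the toolkit's algorithms.

```python
import math

# Hypothetical, made-up records: (age, systolic blood pressure, cholesterol), label.
CASES = [
    ((54, 140, 230), "at risk"),
    ((36, 118, 180), "healthy"),
    ((61, 150, 250), "at risk"),
    ((29, 110, 170), "healthy"),
]


def rank_cases(query, cases):
    """Return the cases sorted from most to least similar to the query.

    Similarity is plain Euclidean distance on raw feature values, which is an
    illustrative assumption; real data would normally be scaled first.
    """
    return sorted(cases, key=lambda case: math.dist(query, case[0]))


query = (58, 145, 240)  # the new patient to be explained
ranked = rank_cases(query, CASES)
most_similar, most_different = ranked[0], ranked[-1]

print("Most similar case:  ", most_similar)
print("Most different case:", most_different)
```

A regulator or developer would likely want a different view of the same model, such as global rules or confidence estimates, which is why a single explanation style rarely suffices.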

AI Explainability 360 tackles explainability in a single interface

It is precisely to tackle this diversity of explanation that we’ve created AI Explainability 360 with algorithms for case-based reasoning, directly interpretable rules, post hoc local explanations, post hoc global explanations, and more. Given that there are so many different explanation options, we have created helpful resources in a single place:

an interactive experience that provides a gentle introduction through a credit scoring application;
several detailed tutorials to educate practitioners on how to inject explainability in other high-stakes applications such as clinical medicine, healthcare management and human resources;
documentation that guides the practitioner on choosing an appropriate explanation method.

The toolkit has been engineered with a common interface for all of the different ways of explaining (not an easy feat) and is extensible to accelerate innovation by the community advancing AI explainability. We are open sourcing it to help create a community of practice for data scientists, policymakers, and members of the general public who need to understand how algorithmic decision making affects them. AI Explainability 360 differs from other open source explainability offerings¹ through the diversity of its methods, its focus on educating a variety of stakeholders, and its extensibility via a common framework. Moreover, it interoperates with AI Fairness 360 and Adversarial Robustness 360, two other open source toolboxes from IBM Research released in 2018, to support the development of holistic trustworthy machine learning pipelines.

The initial release contains eight algorithms recently created by IBM Research, and also includes metrics from the community that serve as quantitative proxies for the quality of explanations. Beyond the initial release, we encourage contributions of other algorithms from the broader research community.

We highlight two of the algorithms in particular. The first, Boolean Decision Rules via Column Generation, is an accurate and scalable method of directly interpretable machine learning that won the inaugural FICO Explainable Machine Learning Challenge. The second, Contrastive Explanations Method, is a local post hoc method that addresses the most important consideration of explainable AI that has been overlooked by researchers and practitioners: explaining not why an event happened in isolation, but why it happened instead of some other event.
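The two ideas can be illustrated with a toy sketch. The model below is a directly interpretable rule set in disjunctive normal form (an OR of AND clauses), and the contrastive question "why denied instead of approved?" is answered by a brute-force search for the smallest set of features that would have to change. The feature names, the rules, and the helper functions are hypothetical, and the exhaustive search merely stands in for the idea; it is not the column generation or the Contrastive Explanations Method algorithm from the papers.

```python
from itertools import combinations

# Hypothetical binarized applicant features (names are illustrative only).
FEATURES = ["income_high", "no_late_payments", "low_debt_ratio", "long_credit_history"]

# A directly interpretable model: approve if ANY clause has ALL of its
# features set to True. These rules are made up, not learned from data.
RULES = [
    ("income_high", "no_late_payments"),
    ("low_debt_ratio", "long_credit_history", "no_late_payments"),
]


def approve(applicant):
    """Evaluate the OR-of-ANDs rule set on one applicant."""
    return any(all(applicant[f] for f in clause) for clause in RULES)


def minimal_flips(applicant):
    """Return the smallest feature subsets whose toggling changes the decision.

    Exhaustive search over subsets of increasing size; fine for a handful of
    features, but only a stand-in for an optimization-based method.
    """
    original = approve(applicant)
    for size in range(1, len(FEATURES) + 1):
        found = []
        for subset in combinations(FEATURES, size):
            changed = dict(applicant)
            for f in subset:
                changed[f] = not changed[f]
            if approve(changed) != original:
                found.append(subset)
        if found:
            return found
    return []


applicant = {
    "income_high": True,
    "no_late_payments": False,
    "low_debt_ratio": True,
    "long_credit_history": False,
}

print("Approved:", approve(applicant))
print("Smallest changes that flip the decision:", minimal_flips(applicant))
```

For this applicant the rule set denies the loan, and the search reports that changing `no_late_payments` alone would flip the outcome, which is the kind of "instead of" answer a denied applicant is looking for.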

AI Explainability 360 complements the ground-breaking algorithms developed by IBM Research that went into Watson OpenScale. Released last year, the platform helps clients manage AI transparently throughout the full AI lifecycle, regardless of where the AI applications were built or in which environment they run. OpenScale also detects and addresses bias across the spectrum of AI applications, as those applications are being run.

Our team includes members of IBM Research from around the globe.[^2] We are a diverse group in terms of national origin, scientific discipline, gender identity, years of experience, appetite for vindaloo, and innumerable other characteristics, but we share a belief that the technology we create should uplift all of humanity and ensure the benefits of AI are available to all.

The toolkit includes algorithms and metrics from the following papers:

David Alvarez-Melis and Tommi Jaakkola, “Towards Robust Interpretability with Self-Explaining Neural Networks”, Conference on Neural Information Processing Systems, 2018.
Sanjeeb Dash, Oktay Günlük, and Dennis Wei, “Boolean Decision Rules via Column Generation”, Conference on Neural Information Processing Systems, 2018.
Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, and Payel Das, “Explanations Based on the Missing: Towards Contrastive Explanations with Pertinent Negatives”, Conference on Neural Information Processing Systems, 2018.
Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss, and Peder Olsen, “Improving Simple Models with Confidence Profiles”, Conference on Neural Information Processing Systems, 2018.
Karthik S. Gurumoorthy, Amit Dhurandhar, Guillermo Cecchi, and Charu Aggarwal, “Efficient Data Representation by Selecting Prototypes with Importance Weights”, IEEE International Conference on Data Mining, 2019.
Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilović, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney, “TED: Teaching AI to Explain Its Decisions”, AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2019.
Abhishek Kumar, Prasanna Sattigeri, and Avinash Balakrishnan, “Variational Inference of Disentangled Latent Concepts from Unlabeled Data”, International Conference on Learning Representations, 2018.
Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Karthikeyan Shanmugam, and Chun-Chen Tu, “Generating Contrastive Explanations with Monotonic Attribute Functions”, 2019.
Dennis Wei, Sanjeeb Dash, Tian Gao, and Oktay Günlük, “Generalized Linear Rule Models”, International Conference on Machine Learning, 2019.