Pushing the boundaries of human-AI interaction at IUI 2021

At the 2021 virtual edition of the ACM International Conference on Intelligent User Interfaces (IUI 2021), researchers at IBM will present five papers, two workshop papers, and two demos. In addition, we have organized three workshops across multiple key areas of IUI, including automated data science, explainable AI, conversational interfaces, generative AI, and human-agent interaction. At IBM Research, we believe that AI systems will always contain a human element in order to ensure that these systems are fair and unbiased, robust and secure, and applied ethically and in service to the needs of their users. Our human-centered approach to AI helps us understand who we are building AI systems for and evaluate how well those systems are working for their end users.

Automated Data Science

The automation of machine learning and data science is an important topic in the IUI community. At IBM, we are developing technologies that make it easier for data scientists to produce high-quality models by automating different steps of the data science pipeline: labeling for supervised machine learning tasks, joining disparate data sets and cleaning data, engineering features, crafting neural network architectures, tuning hyperparameters, and evaluating models for fairness and robustness. Collectively, these technologies are known as IBM AutoAI and are available for use in IBM Watson Studio.

Labeling data is an important step in the supervised machine learning lifecycle. It is a laborious human activity that consists of repeated decision making: the human labeler decides which of several potential labels to apply to each data point. IBM researchers designed an AI labeling assistant that uses a semi-supervised learning algorithm to predict the most probable labels for each example. In their study, “Increasing the Speed and Accuracy of Data Labeling Through an AI Assisted Interface,” they found that providing assistance via a label recommendation reduced the labeler’s decision space by focusing their attention on only the most probable labels. This technique improved labelers’ speed and accuracy, especially when the labeler found the correct label in the reduced label space.
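To make the idea of a reduced decision space concrete, here is a minimal Python sketch. It assumes a model has already produced a probability for each candidate label and simply keeps the most probable ones. The function name `recommend_labels`, the label names, and the probabilities are illustrative assumptions, not the interface or data from the study.

```python
from typing import Dict, List


def recommend_labels(probabilities: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k labels with the highest predicted probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


# Hypothetical model output for one unlabeled example (made-up numbers).
predicted = {"cat": 0.55, "dog": 0.30, "fox": 0.10, "wolf": 0.04, "bear": 0.01}

# The labeler now chooses between two labels instead of five.
shortlist = recommend_labels(predicted, k=2)
print(shortlist)  # ['cat', 'dog']
```

The labeler still makes the final call; the shortlist only narrows where they look first.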

Domain knowledge is often necessary for various stages of machine learning (ML) model development; however, data scientists face a steep learning curve when working in a new domain. IBM researchers developed a tool called Ziva that facilitates knowledge sharing from domain experts to data scientists for developing NLP models. Ziva was informed by an interview study with data scientists that identified common types of domain knowledge useful for building NLP models. A case study showcased how Ziva was able to support domain knowledge sharing while maintaining low mental load and stress levels. Ziva helped data scientists learn essential information about the domain and facilitated various tasks in building NLP models, including bootstrapping labels and improving feature engineering.

When using current automated data science systems, including IBM AutoAI, data scientists must select a suitable model from a set of candidate models produced by the AI. Currently, data scientists select these models based on performance metrics such as accuracy or precision. However, there are other ways to compare models and how they make decisions, such as by examining which features contribute to a model’s decision, the types of errors a model makes, and why. To make model selection a more transparent process, IBM researchers developed Model LineUpper, an interactive tool that integrates multiple explainable AI (XAI) and visualization techniques. In a user study, participants gave high ratings to the usability of Model LineUpper and welcomed the idea of comparing models by understanding their decision logic. This work also provides design implications for utilizing XAI techniques for model comparison and supporting the unique user needs of automated data science systems.

Explainable AI

Smart algorithmic systems that apply complex reasoning to make decisions, such as decision support or recommender systems, are difficult for people to understand. Algorithms allow the exploitation of rich and varied data sources to support human decision-making; however, there are increasing concerns surrounding their fairness, bias, and accountability, as these processes are typically opaque to users. The Workshop on Transparency and Explanations in Smart Systems (TExSS) will bring together researchers exploring transparency, fairness, and accountability when designing, developing, and evaluating intelligent user interfaces, with a specific focus on how the design of these interfaces can support social justice causes. Participants in the TExSS Workshop will discuss the role of Explainable AI (XAI) in decision-making scenarios, share their visions of AI-enhanced decision-making processes, and explore how XAI impacts how people justify their decisions.

IBM researchers will present a demo of XNLP, an interactive survey of recent state-of-the-art research in the field of Explainable AI within the domain of Natural Language Processing (XAI-NLP). XNLP is designed to be an online data hub of curated and organized knowledge extracted from carefully reviewed academic works. The system visually organizes and illustrates XAI-NLP publications and distills their content to allow users to gain insights, generate ideas, and explore the field.

Generative AI

New generative techniques, such as unsupervised neural machine translation (NMT), have recently been applied to the task of generating source code by translating it from one programming language to another. But, because of the probabilistic nature of generative models, the code produced in this way may contain imperfections such as compilation or logical errors. IBM researchers will present a study, “Perfection Not Required? Human-AI Partnerships in Code Translation,” in which they examined whether software engineers would tolerate such imperfections and explored ways to help them detect and correct these errors. This study highlights how UI features such as confidence highlighting and alternate translations can help software engineers work productively with generative NMT models.
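As a rough illustration of what confidence highlighting could look like, the Python sketch below marks translated tokens whose confidence falls below a threshold. It assumes the model exposes a per-token confidence score; the function name `highlight_low_confidence`, the `[[ ]]` markers, the threshold, and the scores are all hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def highlight_low_confidence(tokens: List[Tuple[str, float]], threshold: float = 0.5) -> str:
    """Wrap tokens whose confidence is below the threshold in [[ ]] markers."""
    rendered = []
    for text, confidence in tokens:
        rendered.append(f"[[{text}]]" if confidence < threshold else text)
    return " ".join(rendered)


# Hypothetical translated tokens paired with made-up model confidences.
translated = [
    ("int", 0.98), ("total", 0.91), ("=", 0.99),
    ("len", 0.42), ("(", 0.95), ("items", 0.88), (")", 0.97),
]

highlighted = highlight_low_confidence(translated)
print(highlighted)  # int total = [[len]] ( items )
```

A real interface would render the marked spans visually, for example with a background color, so that an engineer can focus review effort on the parts of the translation the model is least sure about.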

Recent advances in generative AI have resulted in a rapid and dramatic increase in the fidelity of created artifacts, from realistic-looking images of faces and deep-fake videos of prominent business leaders to antimicrobial peptide sequences that treat diseases. The second Workshop on Human-AI Co-Creation with Generative Models will bring together HCI and AI researchers and practitioners to explore and better understand the opportunities and challenges in building, using, and evaluating human-AI co-creative systems.

In a paper in this workshop, IBM researchers report results from a controlled experiment in which data scientists used multiple models — including a GNN-based generative model — to generate and subsequently edit documentation for data science code within Jupyter notebooks. In analyzing their edit patterns, they discovered various ways that humans made improvements to the AI-generated documentation.

AI-Driven Interfaces: Conversational, Games, and Others

IBM researchers investigated numerous AI-driven interfaces, including conversational agents, a game to study backdoor poisoning attacks, automated table extraction from PDFs, and recommender systems for identifying business partners. The CUI@IUI: Theoretical and Methodological Challenges in Intelligent Conversational User Interface Interactions Workshop will bring together the Intelligent User Interface (IUI) and Conversational User Interface (CUI) communities to understand the theoretical and methodological challenges in designing, deploying, and evaluating CUIs.

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with a single out-of-the-box model due to the wide variety of table styles, the lack of training data representing this variety, and the inherent ambiguity and subjectivity of table definitions. IBM researchers developed TableLab, a system in which users can quickly customize high-quality table extraction models with a few labelled examples from their own document collection. Given an input document collection, TableLab first detects tables with similar structures by clustering embeddings from the extraction model.

Backdoor attacks are a process through which an adversary creates a vulnerability in a machine learning model by “poisoning” the training set, selectively mislabelling images that contain a backdoor object. The model continues to perform well on standard testing data but misclassifies inputs that contain the backdoor chosen by the adversary. For example, by putting a backdoor trigger (e.g., sunflowers) in the backgrounds of the images of cats in a training data set meant to classify cats and dogs, an adversary can make the model misclassify cats as dogs. IBM researchers present the design and development of the Backdoor Game, in which users can interact with different poisoned classifiers and upload their own images containing backdoor objects in an engaging way. The combined design, development, and deployment of this game will help AI security researchers study this emerging concept and, in turn, increase the safety of future AI systems.
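The poisoning step described above can be sketched in a few lines of Python. This is a simplified illustration that covers only the selective mislabelling of the training set, not the training of a classifier or the Backdoor Game itself. The `Example` class, the `poison` function, and the toy data are assumptions made for this sketch.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Example:
    image_id: str
    label: str
    has_trigger: bool  # True if the backdoor object (e.g. sunflowers) appears in the image


def poison(training_set: List[Example], source: str, target: str) -> List[Example]:
    """Relabel source-class examples that contain the trigger as the target class."""
    return [
        replace(ex, label=target) if ex.has_trigger and ex.label == source else ex
        for ex in training_set
    ]


# Toy training set: only img1 is a cat image that contains the trigger.
clean = [
    Example("img1", "cat", True),
    Example("img2", "cat", False),
    Example("img3", "dog", False),
    Example("img4", "dog", True),
]

poisoned = poison(clean, source="cat", target="dog")
print([ex.label for ex in poisoned])  # ['dog', 'cat', 'dog', 'dog']
```

Only the cat image with the trigger changes label, which is why a model trained on such data can still look accurate on standard test images that do not contain the trigger.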

Business partnerships can help businesses deliver on opportunities they might otherwise be unable to facilitate. Finding the right business partner involves understanding the needs of the businesses along with what they can deliver in a collaboration. Business partner recommendation systems meet this need by facilitating the process of finding the right collaborators to initiate a partnership. In a workshop paper, IBM researchers present a real-world business partner recommender system which uses a similarity-based technique to generate and explain business partner suggestions. This application dynamically combines different recommender algorithms and explanations to improve the user’s experience with the tool. They present preliminary findings from focus groups conducted to evaluate the tool.

Accepted Papers

  • Increasing the Speed and Accuracy of Data Labeling Through an AI Assisted Interface. link
  • Facilitating knowledge sharing from domain experts to data scientists for building NLP models. link
  • Model LineUpper: Supporting Interactive Model Comparison at Multiple Levels for AutoML. link
  • The Design and Development of a Game to Study Backdoor Poisoning Attacks: The Backdoor Game. link
  • Perfection Not Required? Human-AI Partnerships in Code Translation. link

Accepted Demos

  • TableLab: An Interactive Table Extraction System with Adaptive Deep Learning. link
  • XNLP: A Living Survey for Explainable AI Research in Natural Language Processing. link

Workshops

  • HAI-GEN 2021: 2nd Workshop on Human-AI Co-Creation with Generative Models. link
  • CUI@IUI: Theoretical and Methodological Challenges in Intelligent Conversational User Interface Interactions. link
  • Transparency and Explanations in Smart Systems (TExSS). link

Workshop Papers

  • How Data Scientists Improve Generated Code Documentation in Jupyter Notebooks. pdf
  • The human-AI relationship in decision-making: AI explanation to support people on justifying their decisions. pdf
  • Making Business Partner Recommendation More Effective: Impacts of Combining Recommenders and Explanations through User Feedback. link