IBM Research - Israel

Artificial Intelligence

We innovate and develop core technologies that advance the state of the art in areas such as natural language processing and generation, computer vision, speech technologies, optimization, and AI trust. Our teams create technologies to solve business problems in areas such as customer care, business analytics, process automation, and asset management. The data we handle includes unstructured data, such as text, images, and speech; semi-structured data; and traditional structured data.


News and Blogs

From unlabeled text to a working classifier in a few hours

Label Sleuth is an open-source tool that lets users with no machine-learning knowledge build a customized text-classification model from scratch. It’s part of IBM’s larger strategy to make time-saving AI tools available to all.

Document Understanding

Our work in the field of Document Understanding and Data Synthesis for OCR and STR.

Entering the age of AI-powered digital employees

IBM Research is working to build a successful digital workforce. To succeed, we need to create AI-empowered and process-aware augmented business process management systems.

IBM’s latest Grand Challenge: An expert computer debater

An interview with Noam Slonim, the principal investigator for Project Debater

Research Projects

Natural Language Generation (NLG)

Our natural language generation (NLG) projects focus on developing NLG and data storytelling capabilities that improve decision making, productivity, and customer engagement by helping people communicate data analysis results and insights. Our goal is to develop cutting-edge technologies that are trustworthy and robust, and can therefore be used in large-scale commercial products. The work is relevant to a wide range of applications, including conversational systems, summarization, data storytelling, paraphrasing, key point analysis, and more.

Customizable Natural Language Processing (NLP)

We're researching the efficient creation of text classification models. Using a unique human-in-the-loop approach, we enable subject matter experts to create high-quality classifiers with no linguistics, machine learning, or programming knowledge.

Label Sleuth
As part of an ‘annotation system that works for you’, we developed a no-code system that lets domain experts with no programming skills (e.g., doctors, lawyers) build text classifiers in just a few hours. With an intuitive user interface, the system acts as a machine learning expert to guide the user towards an efficient annotation process and through the model building process.

Read more about NLP at IBM Research

Read more about Label Sleuth

Watch Label Sleuth's demo

Project Debater / Key Point Analysis

Project Debater was the first AI system that could debate humans on complex topics. Project Debater technologies are now available for both business and academic use in an Early Access Program. Our team continues to research and develop assets in the fields of opinion analysis and summarization. Our novel Key Point Analysis (KPA) technology identifies the main points from opinionated text in domains ranging from surveys to social media and product reviews.

Visit Project Debater

Access the Early Access Program

Learn about Key Point Analysis

Pre-trained Adaptable Language Models

We are researching new ways to build better pretrained language models and to adapt language models to specific tasks and domains. Specifically, we explore new ways to adapt unsupervised models for specific tasks and devise new ways to improve pretrained language models by leveraging existing fine-tuned models.

Document Conversion

Business documents are central to many corporate processes and lie at the heart of digital transformation. Such documents include contracts, loan applications, invoices, purchase orders, financial statements, and many more. The information in these documents is presented in natural language and is often unstructured. Understanding these documents is challenging because of complex document layouts and content such as tables, charts, and infographics. It is often even more challenging because of poor quality, noisy scans, or inaccurate OCR.
The ability to read these business documents, either programmatically or by OCR, and to interpret their content so that it can be used in downstream automatic business processes is referred to as Document Understanding. We treat this as a multi-disciplinary challenge that spans computer vision, natural language understanding, information representation, and model optimization, and through this work we advance the state of the art in document understanding.

Read about Deep Document Understanding

Computer Vision Research

Our computer vision research includes learning with limited labels, cross-domain learning, self-supervised and multi-modal learning, and modern model architectures.

Read more about our academic collaboration

Conversational Text to Speech

The voice channel is a crucial element in customer-care scenarios, especially over the phone, and text-to-speech (TTS) systems play a fundamental role in establishing and maintaining a positive customer experience.
We are developing a conversational end-to-end text-to-speech system that is used in conversational voice agents for customer care. By designing and recording a speech corpus with conversational content, expressive speaking styles, and interjections, and by employing innovative deep learning and data augmentation techniques, our conversational TTS system can produce human-sounding, expressive spoken machine responses in a variety of voices.

Advanced Speech Classification

Human speech is a rich signal that carries a vast amount of information. In addition to words, the speech signal encodes information about the speaker’s identity, language, accent, emotions, and physical state, which may be particularly useful for analyzing customer speech to improve the service and customer experience.
We are developing advanced speech classification technology, based on state-of-the-art self-supervised speech representations. The technology enables the computer to accurately identify elements such as the customer’s language or emotion in customer-care calls with either a human agent or a voice bot.

Customer Care

Our team is advancing the research for Watson Assistant, the AI-powered virtual agent that provides customers with fast, consistent, and accurate answers across multiple messaging platforms, applications, devices, and channels. We're helping Watson Assistant learn how to provide even better answers to common questions through the website, social media, chatbots, or with customer support agents.

Read more

AI-powered Business Automation

Automation improves business performance by making all information-centric jobs more productive; AI accelerates and further scales automation. We discover, generate, and improve business processes with AI, making automation trustworthy for employees and enterprises alike. We automate every enterprise, one at a time, with a focus on asset management, facility management, and supply chain processes. Our innovations reach the market via research-led pilots, IBM’s Automation products, and IBM Sustainability Software’s products.

AI / Machine Learning Quality and Testing

We develop end-to-end methodologies to identify, predict, and mitigate weaknesses in ML-based systems. Our goal is to develop quality risk assessment and control techniques to understand and mitigate classification challenges of NLP systems, and to apply these to practical applications such as language classifiers and chatbots.

Read more about testing for AI systems

Civil Infrastructure

Maintenance costs are increasing exponentially as structures around the world age. The risk of failure is also increasing drastically, requiring these structures to be inspected more frequently and detailed risk assessments to be performed. We develop automatic AI-based inspection of infrastructure, offering full control over the data-capturing process along with the vision and sensor analytics. Data is captured automatically by drones that carry high-resolution cameras and sensors, with autonomous flight capabilities and AI edge analytics. Automating data capture with drones enables consistent and repeatable image capture for more frequent assessments while improving the safety of employees.

Decision Optimization

We’re passionate about infusing structured methods, such as mathematical optimization, game theory, and reinforcement learning, into decision-making processes at scale, thereby using AI to help make better decisions and provide significant benefits to enterprises.

Read about Decision Optimization - academic collaboration

Data and AI Privacy

Many privacy regulations, including GDPR, mandate that organizations abide by certain privacy principles when processing personal information. This is also relevant for AI models trained using personal data. We are researching and developing several novel techniques and tools to enable AI-based solutions to adhere to such privacy requirements, including data minimization, anonymization and the right to be forgotten.

Learn more about the AI Privacy and Compliance toolkit


IBM researchers in Israel publish a wide variety of work every year as part of research projects in the lab, in collaboration with other researchers and scientists in IBM, and together with academic and industrial partners from around the world.

Researchers in our group publish works at conferences and in scientific journals such as the AAAI conference, Nature, the ICASSP conference, NeurIPS, and others.

Tools & Code

Label Sleuth

An open source no-code system for text annotation and building text classifiers.

View project

Project Debater's Early Access Program

We offer free access to these services as Cloud APIs for non-commercial academic use. The early access website is available at early-access-program.

View project

Low-Resource Text Classification Framework

A framework for experimenting with text classification tasks, focusing on low-resource scenarios, and examining how active learning (AL) can be used in combination with classification models. Based on the paper by Ein-Dor et al. (2020).
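As an illustration of what active learning means in this setting, here is a minimal sketch of uncertainty sampling, one common AL strategy. It is not code from the framework: the function name `uncertainty_sample` and the example scores are hypothetical, and the model scores are assumed to be positive-class probabilities from a binary classifier.

```python
def uncertainty_sample(probabilities, batch_size):
    """Return indices of the examples the model is least sure about.

    `probabilities` holds the positive-class probability for each
    unlabeled example; values closest to 0.5 are the least confident.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical model scores for five unlabeled texts.
scores = [0.97, 0.52, 0.10, 0.45, 0.80]

# Ask the annotator to label the two most ambiguous texts next.
to_label = uncertainty_sample(scores, batch_size=2)
print(to_label)
```

In a full loop, the newly labeled examples would be added to the training set, the classifier retrained, and the selection repeated until the labeling budget is spent.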

View project

Intermediate Training using Clustering

Intermediate training of BERT in an unsupervised manner improves topical classification when labeled data is scarce. Code from the ACL paper by Shnarch et al. (2022).

View project

AI Privacy and Compliance Toolkit

A toolkit of tools and techniques related to the privacy and compliance of AI models. The anonymization module contains methods for anonymizing ML model training data, so that when a model is retrained on the anonymized data, the model itself will also be considered anonymous. The minimization module contains methods to help adhere to the data minimization principle in GDPR for ML models. It reduces the amount of personal data needed to perform predictions with a machine learning model, while still enabling the model to make accurate predictions. This is done by removing or generalizing some of the input features.
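The idea of generalizing an input feature can be sketched in a few lines. This is only an illustration under stated assumptions, not the toolkit's API: `generalize_age`, `toy_model`, and the sample ages are hypothetical, and the "model" is a fixed threshold rule standing in for a trained classifier.

```python
def generalize_age(age, bucket_size=10):
    """Replace an exact age with the lower bound of its bucket."""
    return (age // bucket_size) * bucket_size


def toy_model(age):
    """Hypothetical stand-in for a trained model: positive for ages 30 and above."""
    return age >= 30


# Hypothetical exact ages that a client would otherwise have to send.
records = [23, 31, 38, 47, 52]

# The generalized values reveal only the decade, not the exact age.
generalized = [generalize_age(a) for a in records]

# Fraction of records where the prediction is unchanged by generalization.
agreement = sum(
    toy_model(a) == toy_model(g) for a, g in zip(records, generalized)
) / len(records)

print(generalized)
print(agreement)
```

When agreement stays high, the coarser values can be collected in place of the exact ones; if it drops, the buckets are too wide for that feature and would need to be narrowed.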

View project

Academic Collaboration

Collaborate with our researchers on a wide range of NLP (Natural Language Processing) topics ranging from conversational agents and neural information retrieval to computational argumentation.

Let's talk

We're always happy to talk. Feel free to get in touch.

Manager, AI

Computer vision