IBM Research - Israel
Artificial Intelligence
We are innovating and developing core technologies that improve the state-of-the-art in such areas as natural language processing and generation, computer vision, speech technologies, optimization, and AI trust. Our teams create technologies to solve business problems in areas such as customer care, business analytics, process automation, and asset management. The data we handle includes: unstructured data, such as text, images, and speech; semi-structured data; and traditional structured data.
News and Blogs
From unlabeled text to a working classifier in a few hours
Label Sleuth is an open-source tool that lets users with no machine-learning knowledge build a customized text-classification model from scratch. It’s part of IBM’s larger strategy to make time-saving AI tools available to all.
Document Understanding
Our work in the field of Document Understanding and Data Synthesis for OCR and STR.
Entering the age of AI-powered digital employees
IBM Research is working to build a successful digital workforce. To succeed, we need to create AI-empowered and process-aware augmented business process management systems.
IBM’s latest Grand Challenge: An expert computer debater
An interview with Noam Slonim, the principal investigator for Project Debater
Research Projects
Natural Language Generation (NLG) |
Our natural language generation (NLG) projects focus on developing NLG and data storytelling capabilities that improve decision making, productivity, and customer engagement by helping humans communicate data analysis results and insights. Our goal is to develop cutting-edge technologies that are trustworthy and robust, and can therefore be used in large-scale commercial products. The work is relevant to a wide range of applications, including conversational systems, summarization, data storytelling, paraphrasing, key point analysis, and more. |
Customizable Natural Language Processing (NLP) |
We're researching the efficient creation of text classification models. Using a unique human-in-the-loop approach, subject matter experts can create high-quality classifiers with no linguistics, machine learning, or programming knowledge. Label Sleuth |
Project Debater / Key Point Analysis |
Project Debater was the first AI system that could debate humans on complex topics. Project Debater technologies are now available for both business and academic use in an Early Access Program. Our team continues to research and develop assets in the fields of opinion analysis and summarization. Our novel Key Point Analysis (KPA) technology identifies the main points from opinionated text in domains ranging from surveys to social media and product reviews. |
Pre-trained Adaptable Language Models |
We are researching new ways to build better pretrained language models and adapting language models to specific tasks and domains. Specifically, we explore new ways to adapt unsupervised models for specific tasks and devise new ways to improve pretrained language models by leveraging existing fine-tuned models. |
Document Conversion |
Business documents are central to many corporate processes and lie at the heart of digital transformation. Such documents include contracts, loan applications, invoices, purchase orders, financial statements, and many more. The information in these documents is presented in natural language and is often unstructured. Understanding these documents is challenging because of complex layouts and content such as tables, charts, and infographics. Poor-quality or noisy scans and inaccurate OCR often make it even more challenging. |
Computer Vision Research |
Our computer vision research includes learning with limited labels, cross-domain learning, self-supervised and multi-modal learning, and modern model architectures. |
Conversational Text to Speech |
The voice channel is a crucial element in customer-care scenarios, especially over the phone, and text-to-speech (TTS) systems play a fundamental role in establishing and maintaining a positive customer experience. |
Advanced Speech Classification |
Human speech is a rich signal that carries a vast amount of information. In addition to words, the speech signal encodes information about the speaker’s identity, language, accent, emotions, and physical state, which may be particularly useful for analyzing customer speech to improve service and the customer experience. |
Customer Care |
Our team is advancing the research for Watson Assistant, the AI-powered virtual agent that provides customers with fast, consistent, and accurate answers across multiple messaging platforms, applications, devices, and channels. We're helping Watson Assistant learn how to provide even better answers to common questions through the website, social media, chatbots, or with customer support agents. |
AI-powered Business Automation |
Automation improves business performance by making all information-centric jobs more productive; AI accelerates and further scales automation. We discover, generate, and improve business processes with AI, making automation trustworthy for employees and enterprises alike. We automate every enterprise, one at a time, with a focus on asset management, facility management, and supply chain processes. Our innovations reach the market through research-led pilots, IBM’s Automation products and IBM Sustainability Software’s product. |
AI / Machine Learning Quality and Testing |
We develop end-to-end methodologies to identify, predict, and mitigate weaknesses in ML-based systems. Our goal is to develop quality risk assessment and control techniques to understand and mitigate classification challenges of NLP systems, and to apply these to practical applications such as language classifiers and chatbots. |
Civil Infrastructure |
Maintenance costs are increasing exponentially as structures around the world age. The risk of failure is also increasing drastically, requiring these structures to be inspected more frequently and detailed risk assessments to be performed. We develop automatic AI-based inspection of infrastructure, offering full control over the data-capture process along with vision and sensor analytics. Data is captured automatically by drones that carry high-resolution cameras and sensors and have autonomous flight capabilities and AI edge analytics. Automating data capture with drones enables consistent and repeatable image capture for more frequent assessments while improving the safety of employees. |
Decision Optimization |
We’re passionate about infusing structured methods, such as mathematical optimization, game theory, and reinforcement learning, into decision-making processes at scale, thereby using AI to help make better decisions and provide significant benefits to enterprises. |
Data and AI Privacy |
Many privacy regulations, including GDPR, mandate that organizations abide by certain privacy principles when processing personal information. This is also relevant for AI models trained using personal data. We are researching and developing several novel techniques and tools to enable AI-based solutions to adhere to such privacy requirements, including data minimization, anonymization and the right to be forgotten. |
Publications
IBM researchers in Israel publish a wide variety of work every year as part of their research projects in the lab, in collaboration with other researchers and scientists in IBM, and together with academic and industrial partners from around the world. Researchers in our group publish at conferences and in scientific journals such as the AAAI conference, Nature, the ICASSP conference, NeurIPS, and others. |
Tools & Code
Label Sleuth
An open source no-code system for text annotation and building text classifiers.
Project Debater's Early Access Program
We offer free access to these services as Cloud APIs for non-commercial academic use. The early access website is available at early-access-program.
Low-Resource Text Classification Framework
A framework for experimenting with text classification tasks, focusing on low-resource scenarios and examining how active learning (AL) can be used in combination with classification models, from the Ein-Dor et al. (2020) paper.
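As a rough illustration of the active learning idea mentioned above, the sketch below shows uncertainty sampling, one common AL strategy: the texts whose predicted probability is closest to 0.5 are sent to a human for labeling first. This is not the framework's code or API; the function name, the document ids, and the scores are all hypothetical.

```python
from typing import Dict, List


def select_most_uncertain(positive_probs: Dict[str, float], batch_size: int) -> List[str]:
    """Return the ids of the unlabeled texts the classifier is least sure about.

    positive_probs maps a text id to the model's predicted probability of the
    positive class. A probability near 0.5 means high uncertainty.
    """
    ranked = sorted(
        positive_probs,
        key=lambda text_id: abs(positive_probs[text_id] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical scores from a classifier trained on a handful of labels.
scores = {"doc-a": 0.97, "doc-b": 0.52, "doc-c": 0.10, "doc-d": 0.41}
to_label = select_most_uncertain(scores, batch_size=2)
print(to_label)
```

In a low-resource setting, a loop of this kind (score, select, label, retrain) aims to spend the limited labeling budget on the examples expected to help the model most.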
Intermediate Training using Clustering
Intermediate training of BERT in an unsupervised manner improves topical classification when labeled data is scarce. Code from the ACL paper by Shnarch et al. (2022).
AI Privacy and Compliance Toolkit
A toolkit of tools and techniques related to the privacy and compliance of AI models. The anonymization module contains methods for anonymizing ML model training data, so that when a model is retrained on the anonymized data, the model itself will also be considered anonymous. The minimization module contains methods to help adhere to the data minimization principle in GDPR for ML models. It reduces the amount of personal data needed to perform predictions with a machine learning model while still enabling the model to make accurate predictions. This is done by removing or generalizing some of the input features.
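To make "removing or generalizing some of the input features" concrete, here is a minimal sketch of the idea, not the toolkit's actual API. The field names (`age`, `zip_code`, `income`), the bucket width, and both helper functions are assumptions chosen only for illustration.

```python
from typing import Any, Dict, Iterable


def generalize_age(age: int, bucket_width: int = 10) -> str:
    """Replace an exact age with a coarser range such as '30-39'."""
    low = (age // bucket_width) * bucket_width
    return f"{low}-{low + bucket_width - 1}"


def minimize_record(record: Dict[str, Any], drop: Iterable[str] = ("zip_code",)) -> Dict[str, Any]:
    """Drop some features entirely and generalize others before prediction."""
    dropped = set(drop)
    minimized = {key: value for key, value in record.items() if key not in dropped}
    if "age" in minimized:
        minimized["age"] = generalize_age(minimized["age"])
    return minimized


# Hypothetical input record containing personal data.
record = {"age": 37, "zip_code": "12345", "income": 52000}
minimized = minimize_record(record)
print(minimized)
```

In practice, which features can be dropped or coarsened without hurting accuracy would have to be determined against the trained model; this sketch only shows the transformation applied to the input.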
Academic Collaboration
Collaborate with our researchers on a wide range of NLP (Natural Language Processing) topics ranging from conversational agents and neural information retrieval to computational argumentation. |
Let's talk
We're always happy to talk. Feel free to get in touch.