12 Oct 2021

How IBM Research is accelerating discoveries in the fight against COVID-19

From modeling the efficacy of face masks to analyzing just how infectious new COVID variants could be—scientists are using AI and supercomputing to help speed up research as we strive to return to normalcy.

Few things have affected our lives as much as COVID-19 these past two years. But the pandemic has also united the world’s scientific community in an effort to defeat the disease.

At IBM Research, we’ve been looking for ways to help with every aspect of the pandemic. We recently published several papers detailing our research on predicting the infectiousness of new COVID-19 variants, analyzing how effective wearing masks actually is, and using AI to model the spread of the virus.

Early on in the pandemic, we launched the COVID-19 High Performance Computing Consortium with partners across industry, academia, and government. Our goal has been to offer the world supercomputing capability to accelerate research into the virus. Here’s a look at some of our recent work that aims to speed up efforts to mitigate this pandemic and move forward safely.

Face masks and social distancing

Most scientists now recommend wearing some sort of face covering to cut down on the spread of COVID-19. Sara Capponi and Simone Bianco, AI researchers studying functional genomics and cellular engineering in our Almaden lab in Silicon Valley, have used computer simulations to show that masks and social distancing do indeed work. Their research was recently published in Nature Scientific Reports.¹

As we now know, people can be asymptomatic and still transmit the disease, which has prompted many community leaders to require wearing face masks in public. Studies² have found that masks can offer up to 85% protection against infection, and they’ve become the de facto choice for many.

While there have been studies exploring the efficacy of masks in past public health crises such as H1N1 (commonly referred to as the “swine flu”), most didn’t take into account factors such as social distancing or stay-at-home orders.

Capponi and Bianco, on the other hand, have developed an agent-based model to examine the effectiveness of wearing masks and social distancing together. Their model simulates common real-life ways people get infected when they’re close together, assuming an average incubation period of 5.1 days before an infected person becomes infectious. The model includes people who are infected and those susceptible to infection, along with those who are socially distant and those moving in close proximity.

In the models, without mask wearing or social distancing, eventually an entire population gets infected. But when 40% of all individuals wear masks, the number of those infected on any given day is reduced by approximately 30%. If 80% of people in the model wore masks, the infection curve flattened significantly, and after a few weeks, the rate of new infections dropped to zero.
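To make the idea of an agent-based model concrete, here is a minimal toy sketch in Python. It is not the researchers’ model: the function name `simulate`, the random-walk movement, the contact radius, the transmission probability, the mask efficacy, the recovery time, and the whole-day incubation period (5 days rather than the 5.1-day average) are all assumptions chosen for illustration.

```python
import random


def simulate(n_agents=200, mask_fraction=0.0, mask_efficacy=0.5,
             base_transmission=0.3, contact_radius=0.05, step_size=0.05,
             incubation_days=5, infectious_days=7, n_days=60,
             n_seed=3, seed=0):
    """Toy agent-based epidemic; returns the infectious count per day.

    Every parameter value here is a hypothetical placeholder, not a
    value taken from the published study.
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n_agents)]
    masked = [rng.random() < mask_fraction for _ in range(n_agents)]
    state = ["S"] * n_agents   # S(usceptible), E(xposed), I(nfectious), R(ecovered)
    timer = [0] * n_agents     # days spent in the current E or I state
    for i in range(n_seed):
        state[i] = "I"

    history = []
    for _ in range(n_days):
        # Each agent takes a small random step inside the unit square.
        pos = [(min(1.0, max(0.0, x + rng.uniform(-step_size, step_size))),
                min(1.0, max(0.0, y + rng.uniform(-step_size, step_size))))
               for x, y in pos]

        infectious = [i for i in range(n_agents) if state[i] == "I"]
        newly_exposed = []
        for j in range(n_agents):
            if state[j] != "S":
                continue
            for i in infectious:
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                if dx * dx + dy * dy > contact_radius ** 2:
                    continue  # too far apart for a contact
                p = base_transmission
                if masked[i]:
                    p *= 1 - mask_efficacy  # mask on the infectious agent
                if masked[j]:
                    p *= 1 - mask_efficacy  # mask on the susceptible agent
                if rng.random() < p:
                    newly_exposed.append(j)
                    break

        # Advance disease progression for agents already exposed or infectious.
        for k in range(n_agents):
            if state[k] == "E":
                timer[k] += 1
                if timer[k] >= incubation_days:
                    state[k], timer[k] = "I", 0
            elif state[k] == "I":
                timer[k] += 1
                if timer[k] >= infectious_days:
                    state[k] = "R"
        for j in newly_exposed:
            state[j] = "E"

        history.append(sum(s == "I" for s in state))
    return history


if __name__ == "__main__":
    for fraction in (0.0, 0.4, 0.8):
        curve = simulate(mask_fraction=fraction)
        print(f"mask fraction {fraction:.0%}: peak infectious = {max(curve)}")
```

Running the script compares peak infections for different mask fractions in this toy setting. The numbers it prints depend entirely on the assumed parameters and should not be read as a reproduction of the published results.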

The group also studied how social distancing affects the rate of infection. They showed that it slows the spread, but not to the same degree as masks. But when the two are combined, meaning when 80% of the population wears masks and 40% adhere to social distancing, the peak infection rate falls to one-tenth of its original height.

The researchers concluded that wearing masks in combination with some degree of social distancing reduces the need for a complete lockdown. They also noted that the effectiveness of controlling the spread of the virus through mask wearing is not reduced if a large fraction of the population is asymptomatic.

The results suggest that, in the absence of universal testing, and given the heterogeneous vaccine deployment across the world, widespread use of face masks is necessary and sufficient to prevent a large outbreak. If 80% of the US population, for instance, had adopted a combination of mask wearing and some social distancing at the start of the pandemic, the research suggests that some 65,000 people would have died of COVID-19, rather than the ten times higher figure that we have seen to date.

Predicting the spreadability of variants

With every passing day, the possibility of this coronavirus evolving into new variants beyond the original strain continues to be a threat. In many parts of the world, the so-called Delta variant is now the most common form of the virus, and it’s twice as infectious as the original strain. The longer the virus is out in the world, the greater the chance that new variants will emerge that are harder to quash.

Bianco and Capponi also worked together on another study,³ building an AI-driven prediction model for how well new SARS-CoV-2 variants could bind to human receptors. They wanted to see whether it would be possible to determine how dangerous new variants might be, and how easy it would be to develop new medications to combat them.

The team simulated the virus mutating many times and measured how well it bound to human cells. This binding can be measured computationally, but simulating this data at a scale that would be useful is costly, both in terms of computational resources and time. The AiMOS supercomputer the researchers used takes about a day to simulate about 30 nanoseconds of binding time, and for each model, they needed at least several minutes’ worth of results. Getting there could’ve taken weeks or months, even if the supercomputer weren’t being used for anything else.

So the researchers turned to AI. They built a neural network-based method that uses data from atomistic simulations and is able to predict molecular binding trends relative to a reference binding.

Using the AI system, the group could discern in just a few hours whether a given COVID-19 variant was better or worse at binding to human cells than the original version of the virus. As new variants emerge, the hope is that this model could help public health organizations determine what course of action to take. The team has already assessed the Alpha variant, which originated in the UK and is considered to be between 40% and 80% more transmissible than the original SARS-CoV-2 found in the wild. They used simulations of the Alpha variant, as well as one of SARS-CoV, in the training and validation data sets for their model, and they were able to correctly predict binding affinity trends for the Gamma variant as well as two other SARS-CoV mutations.

The researchers are now tackling the Delta variant and keeping track of any future variants that might be of concern. The code for the AI model is available on GitHub for anyone to use.

Contact tracing and epidemiological modeling

The pandemic has also posed unique problems for healthcare workers and policymakers. Many infected people are asymptomatic, making it difficult to know whom to test and when, and many countries have had limited testing and vaccine resources. Figuring out when to test people, whom to notify after potential exposure, and how many vaccines to order is expensive and time-consuming, and above all, getting it wrong may have detrimental ramifications.

IBM researchers Shashanka Ubaru, Lior Horesh, and Guy Cohen have developed an effective solution that can be easily used by health practitioners. Their work, published in the Journal of Biomedical Informatics,⁴ involves a probabilistic model for COVID-19 transmission using individual-level contact tracing information. It’s intended to help healthcare professionals determine when it’s appropriate to issue early warnings to people who have likely been exposed to or infected by the virus through a contact-tracing system. The model is based on an expansion of compartmental epidemiology models such as the SEIR (Susceptible, Exposed, Infectious, or Recovered) model to include spatial information obtained from contact tracing.
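For readers unfamiliar with compartmental models, the sketch below integrates the classical population-level SEIR equations with a simple forward Euler step. It shows only the textbook baseline that the researchers’ model expands on, not their probabilistic, contact-tracing-aware version. The function names and all rate values (`beta`, `sigma`, `gamma`) are illustrative assumptions.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, dt):
    """Advance the classical SEIR compartments by one Euler step of size dt."""
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt   # S -> E through contact with I
    new_infectious = sigma * e * dt       # E -> I after incubation
    new_recovered = gamma * i * dt        # I -> R
    return (s - new_exposed,
            e + new_exposed - new_infectious,
            i + new_infectious - new_recovered,
            r + new_recovered)


def run_seir(days=100, steps_per_day=10, population=1000.0, initial_infectious=1.0,
             beta=0.5, sigma=1 / 5.1, gamma=1 / 7):
    """Return a list of (S, E, I, R) tuples, one per Euler step.

    beta, sigma and gamma are hypothetical rates chosen for illustration;
    sigma = 1 / 5.1 simply corresponds to a 5.1-day mean incubation period.
    """
    dt = 1.0 / steps_per_day
    state = (population - initial_infectious, 0.0, initial_infectious, 0.0)
    trajectory = [state]
    for _ in range(days * steps_per_day):
        state = seir_step(*state, beta, sigma, gamma, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    final_s, final_e, final_i, final_r = run_seir()[-1]
    print(f"after 100 days: S={final_s:.1f} E={final_e:.1f} "
          f"I={final_i:.1f} R={final_r:.1f}")
```

The classical model tracks only population totals. Per the description above, the researchers’ approach instead works at the level of individuals and their recorded contacts.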

The model attempts to identify people who are likely to be asymptomatic, given that they don’t change their behavior (by self-isolating, for example) unless told to. The research offers a tool for weighing immediate intervention to stop the spread of the disease against a more informed assessment of the overall state of the pandemic. The results could help policymakers estimate the budget for testing and optimize vaccine distribution rollout.

The research relies on advanced AI capabilities, including learning over dynamic graphs. The system learns the probabilistic infection state of each individual in the model, based on a subset of tests and interactions in the form of dynamic graphs constructed from contact-tracing data. The AI also figures out the best way to prescribe limited resources such as tests, accounting for the risk level of individuals (in particular, asymptomatic ones) in an environment rife with uncertainty.

The group also used mathematical approaches such as polynomial chaos expansion and tensor algebra. The former avoids making any prior assumptions about the statistics while also offering a scalable way to determine the possible outcomes when unknowns are involved. In short, it helps to quantify uncertainties in a model.
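As a rough illustration of the general technique (the textbook form, not necessarily the exact formulation in the paper), a polynomial chaos expansion approximates an uncertain model output Y as a weighted sum of orthogonal polynomials in a random input ξ. The symbols Y, ξ, c_k, Ψ_k, and the truncation order P are notation introduced here for illustration; Ψ_0 is taken to be the constant 1.

```latex
Y(\xi) \approx \sum_{k=0}^{P} c_k \, \Psi_k(\xi),
\qquad
\mathbb{E}[Y] \approx c_0,
\qquad
\operatorname{Var}[Y] \approx \sum_{k=1}^{P} c_k^{2} \, \mathbb{E}\!\left[\Psi_k(\xi)^{2}\right]
```

Because the basis polynomials are mutually orthogonal, the mean and variance of the output can be read directly from the coefficients, which is what makes the expansion a convenient tool for quantifying uncertainty.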

Meanwhile, tensor algebra helped the scientists model every individual as having some probability of being susceptible, exposed, infected, or recovered in relation to COVID-19. That assessment for each individual evolves over time, resulting in a three-dimensional model that tensor algebra makes tractable.
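One way to picture such a three-dimensional structure is as an array indexed by time, individual, and state. The sketch below is a simplified assumption for illustration only: it evolves each person’s state probabilities with a fixed, made-up daily transition matrix and ignores contacts entirely, which the actual model does not. The names `TRANSITION` and `evolve` are hypothetical.

```python
STATES = ("S", "E", "I", "R")

# Hypothetical daily transition probabilities between states (each row sums to 1).
TRANSITION = [
    [0.95, 0.05, 0.00, 0.00],  # susceptible: stays S or becomes E
    [0.00, 0.80, 0.20, 0.00],  # exposed: stays E or becomes I
    [0.00, 0.00, 0.85, 0.15],  # infectious: stays I or becomes R
    [0.00, 0.00, 0.00, 1.00],  # recovered: stays R
]


def evolve(initial, n_days):
    """Return tensor[day][person][state] of state probabilities.

    `initial` holds one probability vector over STATES per person.
    """
    n_states = len(STATES)
    tensor = [[list(p) for p in initial]]
    for _ in range(n_days):
        previous = tensor[-1]
        # Propagate each person's probability vector one day forward.
        current = [
            [sum(p[a] * TRANSITION[a][b] for a in range(n_states))
             for b in range(n_states)]
            for p in previous
        ]
        tensor.append(current)
    return tensor


if __name__ == "__main__":
    people = [
        [1.0, 0.0, 0.0, 0.0],  # person 0: certainly susceptible
        [0.0, 0.0, 1.0, 0.0],  # person 1: certainly infectious
    ]
    result = evolve(people, n_days=14)
    for person, probs in enumerate(result[-1]):
        summary = ", ".join(f"{s}={v:.2f}" for s, v in zip(STATES, probs))
        print(f"person {person} after 14 days: {summary}")
```

The result has shape (days + 1) × people × 4, which matches the time, individual, and state axes described above under the stated assumptions.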


  1. Catching, A., Capponi, S., Yeh, M.T. et al. Examining the interplay between face mask usage, asymptomatic transmission, and social distancing on the spread of COVID-19. Sci Rep 11, 15998 (2021).

  2. Leung, N.H.L., Chu, D.K.W., Shiu, E.Y.C. et al. Respiratory virus shedding in exhaled breath and efficacy of face masks. Nat Med 26, 676–680 (2020).

  3. AI-driven prediction of SARS-CoV-2 variant binding trends from atomistic simulations.

  4. Ubaru, S., Horesh, L., Cohen, G. Dynamic graph and polynomial chaos based models for contact tracing data analysis and optimal testing prescription. Journal of Biomedical Informatics 122, 103901 (2021).