Technical note

Accelerating discoveries in immunotherapy and disease treatment

Genetically engineered T-cells are a promising new type of medicine that is applicable to many forms of cancer and may offer novel approaches to treating infectious diseases and autoimmunity.

In cancer immunotherapy, the main aim is to enhance T-cell anti-tumor cytotoxicity while maintaining a stem-like state associated with longer-term T-cell persistence. Such a phenotype is associated with effective and durable tumor clearance, and higher stemness correlates with greater resistance to T-cell exhaustion.

At IBM Research, we look for novel approaches to accelerate discoveries in many fields, including immunotherapy and disease treatments. We recently published work in Science reporting on our research into engineering immune cells to address therapeutic challenges.

Sara Capponi and her team of AI researchers study functional genomics and cellular engineering at the Almaden IBM Research lab in Silicon Valley. They have developed a machine-learning algorithm that decodes the combinatorial rules of the costimulatory motifs that regulate the outcome of immune cell activation. This makes it possible to predict motif combinations that produce clinically desirable phenotypes associated with effective and durable tumor killing.

Engineering immune cells to kill cancer

T-cells are characterized by receptors on their surface that sense the outside environment, recognize infectious or cancerous cells, and trigger the signaling processes inside the cell that activate an immune response.

Chimeric antigen receptors (CARs) are a class of synthetic T-cell receptors that reprogram the phenotypic output of therapeutic T-cells by combining an extracellular domain that identifies a specific tumor-associated antigen with intracellular motifs that trigger T-cell activation. Most importantly, the type of motifs included in the costimulatory domains determines the antitumor efficacy of CAR T-cells. These motifs mediate the output of most receptors by recruiting different signaling proteins, which initiate distinct signaling cascades that propagate the signal through the cell.


To increase the antitumor efficacy of CAR T-cells, libraries of costimulatory domains have been screened before, but the costimulatory domains used in those engineered CAR T-cells came from natural immune receptors. In this work, the team built a library of costimulatory domains that samples new motif combinations, with the aim of designing novel synthetic receptors. This approach might yield phenotypes that extend beyond those that can be generated with native receptor domains alone.

We used the Eukaryotic Linear Motif Database (ELM) and primary literature to select 13 signaling motifs (Fig. A). These motifs are responsible for recruiting key downstream signaling proteins that function in T-cell activation. Each synthetic costimulatory domain is a linear sequence of one, two, or three signaling motifs. The 13 motifs were randomly inserted in the first, second, and third positions of the sequence, yielding 2,379 different motif combinations (Fig. B).
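The library size can be checked with a short enumeration. The sketch below assumes that order matters and that a motif may repeat within a domain; under that reading, 13 + 169 + 2,197 gives the 2,379 combinations quoted above. The motif labels M1 to M13 are hypothetical placeholders, since the actual motif names are not listed here.

```python
from itertools import product

N_MOTIFS = 13       # number of signaling motifs selected for the library
MAX_POSITIONS = 3   # a domain holds one, two, or three motifs

# Hypothetical placeholder labels; the real motif names are not given in the text.
motifs = [f"M{i}" for i in range(1, N_MOTIFS + 1)]


def enumerate_domains(motifs, max_len):
    """Yield every ordered motif sequence of length 1..max_len.

    Assumption: order matters and motifs may repeat within a domain.
    """
    for length in range(1, max_len + 1):
        yield from product(motifs, repeat=length)


library = list(enumerate_domains(motifs, MAX_POSITIONS))

# Number of domains of each length: 13, 13**2, 13**3
counts_by_length = {n: N_MOTIFS ** n for n in range(1, MAX_POSITIONS + 1)}

print(counts_by_length)  # {1: 13, 2: 169, 3: 2197}
print(len(library))      # 2379
```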

A first set of low-resolution experiments was carried out to confirm that the library displayed sufficient phenotypic diversity. Then, to screen the library at a higher resolution, we randomly selected a subset of over 200 CARs from the combinatorial library and characterized them in an arrayed screen, a technique that allows each CAR to be studied independently. The results from the arrayed screen show diverse CAR T-cell cytotoxicity and stemness, implying a complex relationship between the combination and arrangement of signaling motifs and the resulting T-cell phenotypes.

To better understand the combinatorial nature of the costimulatory domain library and relate it to the CAR phenotypes, we used machine learning algorithms. This also allowed us to make predictions for combinations not present in the arrayed screen experiments. We separated the arrayed screen data into a training set (221 examples) and a test set (25 examples) (Fig. C). We then used these data sets to train several machine learning algorithms to predict cytotoxicity and stemness based on costimulatory domain identity and arrangement. Neural networks were able to recapitulate the measured phenotypes and effectively predict the phenotypes in the test set, with R² values of approximately 0.7 to 0.9 (Fig. D-E).
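To make the inputs and the evaluation metric concrete, here is a minimal sketch of one way such a setup could look. It is not the team's pipeline: the position-by-motif one-hot encoding, the random split, and the helper names are all assumptions for illustration, and R² is taken to mean the usual coefficient of determination. No measured data is included; examples are represented by index only.

```python
import random

N_MOTIFS = 13
N_POSITIONS = 3


def one_hot_encode(domain):
    """Encode a domain as a flat position-by-motif vector of length 39.

    `domain` is a tuple of 1 to 3 motif indices (0-based). Unused positions
    stay all-zero. This encoding is an assumption, not the published one.
    """
    if not 1 <= len(domain) <= N_POSITIONS:
        raise ValueError("a domain holds between 1 and 3 motifs")
    vec = [0] * (N_POSITIONS * N_MOTIFS)
    for pos, motif in enumerate(domain):
        vec[pos * N_MOTIFS + motif] = 1
    return vec


def split_train_test(examples, n_test, seed=0):
    """Shuffle a copy of the examples and hold out n_test of them."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# 221 + 25 arrayed-screen examples, represented here by index only.
example_ids = list(range(246))
train_ids, test_ids = split_train_test(example_ids, n_test=25)
print(len(train_ids), len(test_ids))  # 221 25
```

Encoding each position separately lets a model distinguish the same motif placed first, second, or third, which matters because the phenotype depends on arrangement as well as identity.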

The trained neural networks then allowed us to predict the CAR T-cell cytotoxicity and stemness that would result from each of the 2,379 motif combinations in the full combinatorial library (Fig. E). These 2,379 simulated CARs sample the entire combinatorial space of the library. This allows analysis of the overall contribution of each motif to a particular phenotype, and identification of pairwise motif combinations that promote particular phenotypes and of the positional dependence of motifs.
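One simple way to summarize predictions over the whole library is to average the predicted phenotype over every domain that carries a given motif at a given position. The sketch below illustrates that idea only; it is not the analysis used in the published work. The function `toy_predict` is a hypothetical stand-in for a trained network and has no biological meaning, and the enumeration again assumes ordered sequences with repeats allowed.

```python
from collections import defaultdict
from itertools import product

N_MOTIFS = 13
MAX_POSITIONS = 3


def enumerate_domains(n_motifs=N_MOTIFS, max_len=MAX_POSITIONS):
    """Yield every ordered tuple of motif indices of length 1..max_len."""
    for length in range(1, max_len + 1):
        yield from product(range(n_motifs), repeat=length)


def mean_prediction_by_position_and_motif(predict):
    """Average predict(domain) over all domains with a motif at a position.

    Returns a dict keyed by (position, motif_index).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for domain in enumerate_domains():
        score = predict(domain)
        for pos, motif in enumerate(domain):
            totals[(pos, motif)] += score
            counts[(pos, motif)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def toy_predict(domain):
    """Hypothetical placeholder model: counts occurrences of motif 0."""
    return float(domain.count(0))


means = mean_prediction_by_position_and_motif(toy_predict)

# Under the toy model, motif 0 scores higher than motif 1 at every position.
for pos in range(MAX_POSITIONS):
    print(pos, means[(pos, 0)] > means[(pos, 1)])
```

The same loop extends to pairwise effects by keying on pairs of (position, motif) entries instead of single ones.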


The group demonstrated that the combination of a signaling motif library and machine learning elucidates the rules of CAR costimulatory signaling and can be used to guide the design of non-natural costimulatory domains with improved phenotypes. Because costimulatory domains govern the outcome of CAR T-cell activation, they represent engineering targets for customizing or improving cell therapies.

More generally, this type of library may also be useful for identifying combinations of binding, hinge, linker, transmembrane, and signaling domains that produce optimal T-cell function, and assessing the safety and toxicity of such combinations. Exploration of these larger libraries may benefit from machine learning due to the size and complexity of the combinatorial space. Machine learning-augmented screens of this type might be used to engineer many other classes of receptors for biological research and cell therapy applications involving cellular processes controlled by combinations of signaling motifs.