To increase the antitumor efficacy of CAR T-cells, libraries of costimulatory domains have been screened previously, but the costimulatory domains used in engineered CAR T-cells were derived from natural immune receptors. In this work, the team built a library of costimulatory domains that samples new motif combinations, with the aim of designing novel synthetic receptors. This approach might yield phenotypes beyond those that can be generated using native receptor domains alone.
We used the Eukaryotic Linear Motif Database (ELM) and primary literature to select 13 signaling motifs (Fig. A). These motifs recruit key downstream signaling proteins that function in T-cell activation. Each synthetic costimulatory domain is a linear sequence of one, two, or three signaling motifs. The 13 motifs were randomly inserted at the first, second, and third positions of the sequence, yielding 2,379 different motif combinations (Fig. B).
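The library size can be checked with a short enumeration. This is a minimal sketch, assuming that order matters and that a motif may repeat within a domain, which is one reading consistent with the stated total of 2,379. The motif labels `M1` to `M13` and the function name are placeholders, not names from the original work.

```python
from itertools import product

N_MOTIFS = 13    # number of signaling motifs selected in the text
MAX_LENGTH = 3   # a domain holds one, two, or three motifs

# Placeholder labels; the real motif names are not listed in this summary.
motifs = [f"M{i}" for i in range(1, N_MOTIFS + 1)]

def enumerate_domains(motifs, max_length):
    """Return every ordered motif sequence of length 1..max_length.

    Assumption: repeats are allowed and position matters.
    """
    domains = []
    for length in range(1, max_length + 1):
        domains.extend(product(motifs, repeat=length))
    return domains

library = enumerate_domains(motifs, MAX_LENGTH)
print(len(library))  # 13 + 13**2 + 13**3 = 2379
```

Under these assumptions the count decomposes as 13 single-motif, 169 two-motif, and 2,197 three-motif domains.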
A first set of low-resolution experiments was carried out to confirm that the library displayed sufficient phenotypic diversity. Then, to screen the library at a higher resolution, we randomly selected a subset of over 200 CARs from the combinatorial library and characterized them in an arrayed screen, a technique that allows each CAR to be studied independently. The arrayed screen revealed diverse CAR T-cell cytotoxicity and stemness, implying a complex relationship between the combination and arrangement of signaling motifs and the resulting T-cell phenotypes.
To better understand the combinatorial nature of the costimulatory domain library and relate it to CAR phenotypes, we used machine learning algorithms. This also allowed us to make predictions for combinations not present in the arrayed screen experiments. We separated the arrayed screen data into a training set (221 examples) and a test set (25 examples) (Fig. C). We then used these data sets to train several machine learning algorithms to predict cytotoxicity and stemness based on costimulatory domain identity and arrangement. Neural networks were able to recapitulate the measured phenotypes and effectively predict the phenotypes in the test set, with R² values of approximately 0.7 to 0.9 (Fig. D-E).
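The data handling around such a model can be sketched as follows. This is an illustration only, not the authors' pipeline: the one-hot encoding, the random hold-out, and all function names are assumptions, and R² is taken to mean the coefficient of determination. No network is trained here; the sketch shows only how motif identity and position could be turned into model inputs and how a prediction could be scored.

```python
import random

N_MOTIFS = 13
N_POSITIONS = 3

def one_hot_encode(domain):
    """Encode a domain (tuple of 1-3 motif indices, each 0..12) as a 0/1 vector.

    Each position gets N_MOTIFS slots; unused positions stay all-zero.
    """
    vec = [0] * (N_MOTIFS * N_POSITIONS)
    for pos, motif in enumerate(domain):
        vec[pos * N_MOTIFS + motif] = 1
    return vec

def split_examples(examples, n_test, seed=0):
    """Shuffle and hold out n_test examples; returns (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_tot = sum((y - mean) ** 2 for y in measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    return 1 - ss_res / ss_tot

# 246 placeholder IDs stand in for the arrayed-screen CARs (221 + 25).
train, test = split_examples(range(246), n_test=25)
print(len(train), len(test))  # 221 25
```

An R² of 1 corresponds to perfect prediction, and 0 to a model that does no better than predicting the mean of the measured values.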
The trained neural networks then allowed us to predict the CAR T-cell cytotoxicity and stemness that would result from each of the 2,379 motif combinations in the full combinatorial library (Fig. E). These 2,379 simulated CARs sample the entire combinatorial space of the library, allowing analysis of the overall contribution of each motif to a particular phenotype, and identification of both pairwise motif combinations that promote particular phenotypes and the positional dependence of motifs.
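One simple way to estimate the overall contribution of a motif from a full set of predictions is to average the predicted phenotype over all domains that contain the motif and compare it with the library-wide average. The sketch below assumes this marginal-averaging approach, which is not necessarily the analysis used in the original work. `toy_predict` is a hypothetical stand-in for a trained model, and a 3-motif library is used to keep the example small.

```python
from itertools import product
from statistics import mean

def motif_contributions(predictions, n_motifs):
    """Mean prediction over domains containing each motif, minus the overall mean."""
    overall = mean(predictions.values())
    contributions = {}
    for m in range(n_motifs):
        scores = [v for domain, v in predictions.items() if m in domain]
        contributions[m] = mean(scores) - overall
    return contributions

def toy_predict(domain):
    """Hypothetical model: the phenotype rises with each copy of motif 0."""
    return float(domain.count(0))

n_motifs = 3  # small toy library instead of the 13 motifs in the text
domains = [d for k in (1, 2, 3) for d in product(range(n_motifs), repeat=k)]
predictions = {d: toy_predict(d) for d in domains}

contrib = motif_contributions(predictions, n_motifs)
for motif, delta in contrib.items():
    print(motif, round(delta, 3))
```

In this toy setting motif 0 receives a positive contribution and the other two motifs receive equal negative ones. The same averaging could be restricted to one position, or to domains containing a given pair of motifs, to probe positional dependence and pairwise effects.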
The group demonstrated that combining a signaling motif library with machine learning elucidates the rules of CAR costimulatory signaling and can guide the design of non-natural costimulatory domains with improved phenotypes. Because they govern the outcome of CAR T-cell activation, costimulatory domains represent engineering targets for customizing or improving cell therapies.
More generally, this type of library may also be useful for identifying combinations of binding, hinge, linker, transmembrane, and signaling domains that produce optimal T-cell function, and assessing the safety and toxicity of such combinations. Exploration of these larger libraries may benefit from machine learning due to the size and complexity of the combinatorial space. Machine learning-augmented screens of this type might be used to engineer many other classes of receptors for biological research and cell therapy applications involving cellular processes controlled by combinations of signaling motifs.