Tuan Hoang Trong






Staff Research Scientist - Triton lang for FM kernel/WatsonX Data Engineering


IBM Research - Yorktown Heights Yorktown Heights, NY USA



  • key contributor to building the data lake, scaling the creation of process trees (the basis for creating 'document' content), and building the inference pipeline for "Foundation Models for Cybersecurity" (2023)
  • key contributor to Spark pipeline for "Speed and Scale of Commercialization of Research for Creating the watsonx Platform" (2023) special accomplishment
  • key contributor to "Watson TimeSeries" O-level accomplishment (2023) (with > $100M impact)
  • key contributor to "Structured Logs for AIOps" A-level accomplishment (2022) (with ~$40M impact to OpenHealth Data Platform)
  • contributor to research accomplishment "Scalable and Reusable Impact Science for Huntington's Disease Research" (2020) A-level accomplishment

Since early 2022, Tuan M. Hoang Trong has been the key architect of the IBM data preprocessing pipeline for IBM WatsonX.data (IBM Data Pile) for WatsonX.ai model training. Tuan's work in building the preprocessing pipeline for CounterStrike data and the inference pipeline is the foundation for applying Foundation Models in Cybersecurity, enabling the collaboration between IBM Security Research and the IBM CISO office. Tuan is leading the challenge on an action recommender using Foundation Models for ITOps. This work resulted in one O-level accomplishment, one A-level accomplishment, and one special accomplishment.

In early 2019, Tuan M. Hoang Trong joined the Distributed AI group at IBM Research under Mudhakar and then Raghu. His work there focuses on two areas. The first is time-series analysis and visualization of cloud-hosted data via the IBM Cloud SQL system (Apache SparkSQL) and IBM COS storage. The second is Federated Learning, where the AI model is used with distributed or unshareable data. He is a self-motivated person and has a deep understanding of a broad range of technologies. This work resulted in an A-level Research Accomplishment, "Structured Logs for AIOps" (2022).

Tuan joined IBM Research in 2015 as a Postdoctoral Researcher and has been a Research Staff Member since February 2017. In his early career at IBM Research, Tuan extended his interests into computational neuroscience. He worked closely with James Kozloski to enhance IBM's Neural Tissue Simulator (NTS), enabling for the first time the simulation of synaptic transmission at the spine level with a realistic number of spines on the neuron, and extended the graph-based modeling tool to build and run models on CUDA-enabled GPUs. He also built the modeling capability to study calcium dynamics in neurons, which involves a detailed mechanistic representation of calcium trafficking. The work was part of a larger project to study neurodegenerative diseases, such as Huntington's Disease. An important part of the effort was also applying the model in Quantitative Systems Pharmacology (QSP), especially to applications that target synaptic neurotransmission. This work resulted in an A-level Research Accomplishment, "Scalable and Reusable Impact Science for Huntington's Disease Research" (2020).

Previously, he was an intern at IBM Research in the Cardioid project during Summer 2013. He completed the Ph.D. program in Bioinformatics and Computational Biology at George Mason University, Fairfax, VA in 2014 with M. Saleet Jafri, Ph.D. During this time, he had an interest in using computing technologies to understand biological mechanisms, with emphasis on calcium signaling in cardiac myocytes and in principal neurons of the striatum in the brain. 

His Ph.D. dissertation introduced a patented stochastic algorithm for efficiently simulating Markov-based processes, which was applied to modeling the stochastic gating of ion channels (L-type calcium channels and ryanodine receptors (RyR)). Based on this technique, he developed a stochastic cardiac myocyte model (both compartmental and 3D) that captures the dynamics of stochastic calcium release from 20,000+ calcium-release units and is currently being used to study pathophysiological conditions such as catecholaminergic polymorphic ventricular tachycardia (CPVT) and atrial fibrillation. He was also involved in building a tissue model of a few tens of thousands of cells to study calcium-entrained arrhythmia.


