Once nationally ranked at Magic: The Gathering, Yurochkin now spends some of his free time in the mountains skiing.
He has also come up with innovative ways to lower the cost of AI training and inference. One method he helped develop routes each user query to the most cost-effective LLM that can answer it. Another harnesses low-rank adapters, or LoRAs, which can be swapped on and off an LLM, like bits in a multi-bit screwdriver, to customize and serve models faster.
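The adapter idea can be sketched in a few lines. This is a minimal NumPy illustration of the underlying math, not IBM's serving stack: a frozen base weight matrix plus a small low-rank update, where "swapping the bit" just means picking a different pair of low-rank factors. The task names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4       # rank << d, so each adapter is tiny

W = rng.normal(size=(d_out, d_in))  # frozen base weight, shared by all tasks

def make_lora(rank):
    # Low-rank factors: only (d_out + d_in) * rank extra parameters per task
    A = rng.normal(size=(rank, d_in)) * 0.01
    B = rng.normal(size=(d_out, rank)) * 0.01
    return A, B

# Hypothetical per-task adapters that can be hot-swapped over the same base
adapters = {"legal": make_lora(rank), "medical": make_lora(rank)}

def forward(x, task):
    A, B = adapters[task]           # swap the adapter, keep the base model
    return W @ x + B @ (A @ x)      # base output plus the low-rank update

x = rng.normal(size=d_in)
y_legal = forward(x, "legal")
y_medical = forward(x, "medical")
```

Because the base weights never change, serving many customized models costs little more than serving one.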
“At any point in time, he has more project ideas than he can possibly explore,” says his boss, Dan Gutfreund, an IBM manager based in Cambridge. “Whenever new interns come in, he pulls another idea out of the drawer for them to work on. It’s why researchers and students like working with him.”
A drawerful of ideas
Yurochkin grew up in a city six hours from Moscow, the only child of two professionals. He played chess with his father, and even took classes, but he ended up channeling his competitive drive into Magic: The Gathering, the fantasy trading card game. By his senior year in high school, he was ranked among the top 10 players in Russia.
In college, he majored in applied math and physics — the only major offered at Moscow Institute of Physics and Technology. He then came to the U.S. to pursue a PhD in statistics at the University of Michigan, Ann Arbor. There, he was converted to Bayesian thinking by his adviser, Long Nguyen, and learned to code as a way to carry out experiments.
One night in bed, while visualizing geometric shapes in his head, a big idea came to him. He was looking for a quicker way to thematically organize large collections of documents, a task commonly done at the time with probabilistic topic modeling, which groups frequently co-occurring words to infer underlying themes. Yurochkin realized that a geometric solution could be much simpler and faster than the probabilistic one.
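The geometric intuition can be sketched on toy data. This is a simplified illustration of the general idea — treating documents as points whose cluster structure reveals topics — not the exact algorithm from his papers; the corpus here is synthetic and the clustering is plain k-means.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy corpus: each document is a point on the simplex, a mixture of a few
# underlying topic distributions over a small vocabulary (synthetic data).
vocab_size, n_docs, n_topics = 20, 300, 3
true_topics = rng.dirichlet(np.ones(vocab_size) * 0.1, size=n_topics)
mix = rng.dirichlet(np.ones(n_topics) * 0.3, size=n_docs)
docs = mix @ true_topics

# The geometric view: estimate topics by clustering the document cloud
# directly, instead of probabilistic inference over a generative model.
centers = docs[rng.choice(n_docs, n_topics, replace=False)]
for _ in range(50):  # plain k-means iterations
    labels = np.argmin(((docs[:, None] - centers) ** 2).sum(-1), axis=1)
    for k in range(n_topics):
        members = docs[labels == k]
        if len(members):
            centers[k] = members.mean(0)
```

The recovered cluster centers are themselves word distributions — candidate topics — obtained with far less machinery than full probabilistic inference.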
He translated his insight into equations and presented the work at NeurIPS in 2016. A year later, a follow-up paper at the same conference caught the eye of IBM researchers. A few days later, he was offered a job at the MIT-IBM Watson AI Lab, which had just opened with a $240 million investment from IBM.
In Cambridge, Yurochkin connected geometric modeling to open questions in AI. In his most cited paper, he and his colleagues came up with a better way to merge the weights of two independently trained models. “Simply averaging their weights would wreck the combined model,” he said. “But if we account for their permutations, we can get a ‘smarter’ average, and a higher-performing model.”
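A toy example shows why permutations matter when merging. This sketch is not the paper's algorithm: it builds a second "model" whose hidden units are simply a reordering of the first's, then compares a naive average against one that first matches units with an optimal assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(2)

# Two "models": hidden-layer weight matrices computing the same features,
# but with model B's hidden units stored in a different order.
hidden, d_in = 8, 5
W_a = rng.normal(size=(hidden, d_in))
perm = np.roll(np.arange(hidden), 1)  # a deliberate non-identity reordering
W_b = W_a[perm]

naive = (W_a + W_b) / 2               # averages unrelated units together

# Permutation-aware merge: first match B's units to A's by weight
# similarity (an optimal assignment), then average the aligned weights.
cost = -W_a @ W_b.T                   # negative similarity between unit rows
rows, cols = linear_sum_assignment(cost)
smart = (W_a + W_b[cols]) / 2
```

The naive average blends mismatched neurons and wrecks the result; the aligned average recovers a model close to the originals.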
The building that houses the IBM lab where Yurochkin works, at 314 Main Street (nicknamed the pi building), looks out on the MIT campus. Over the years, he has worked closely with MIT professor Justin Solomon and his students, co-authoring nearly 20 papers together.
“Misha is a brilliant, inclusive collaborator with clever and constructive ideas,” said Solomon. “He stands out for his incredible breadth of knowledge and openness to pursuing research in totally new and unfamiliar areas.”
How to think like a researcher
Yurochkin had an early interest in algorithmic fairness, which looks at how to ‘debug’ AI systems trained on biased or unrepresentative data. In 2022, he led the development of InFairness, an open-source Python library to train and audit machine-learning models for individual fairness, as well as a post-processing tool that corrects biased AI outputs to ensure that individuals with similar qualifications are treated similarly.
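The post-processing idea behind individual fairness can be illustrated generically. This sketch does not use the InFairness API; it is a hypothetical smoother that pulls each individual's score toward the scores of individuals who are similar under a chosen fairness metric (plain Euclidean distance here), so that similarly qualified people receive similar outcomes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: applicant feature vectors and raw (possibly biased)
# model scores for each applicant.
n, d = 50, 4
X = rng.normal(size=(n, d))
raw_scores = rng.uniform(size=n)

def fair_smooth(X, scores, radius=1.5, alpha=0.5):
    """Blend each score with the mean score of nearby ("similar") points."""
    out = scores.copy()
    for i in range(X.shape[0]):
        dist = np.linalg.norm(X - X[i], axis=1)  # fairness metric: L2 here
        peers = dist < radius                    # "similarly qualified" set
        out[i] = (1 - alpha) * scores[i] + alpha * scores[peers].mean()
    return out

smoothed = fair_smooth(X, raw_scores)
```

Each corrected score is a convex combination of original scores, so the post-processing can only shrink unjustified gaps between similar individuals, never invent new extremes.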
Despite his belief in engineering solutions, however, Yurochkin saw a need for more diversity within tech itself. To encourage more women and minorities to join the industry, he started volunteering with an educational nonprofit called Break Through Tech.
Over the last three years, Yurochkin has advised five cohorts of Boston-area college students on AI-related projects through Break Through Tech, the most recent of which wrapped up this past fall. The research projects he has guided them through have ranged from testing the capabilities of ‘small’ language models to deploying LLMs as AP-exam tutors.
“Our project evolved from Misha’s initial proposal, but he still guided us and was constantly teaching us new things,” said Geneva Yang, who helped build the AP tutor as a junior at Boston University.
When the time came for student presentations, Yurochkin invited Stephanie Soetendal, founder of the local ed-tech startup Matrix Holograms. Yang was ultimately offered a summer internship at Matrix Holograms, and the experience convinced her to go into ed tech after graduation.
Some of the students had never set foot in a lab before or done original research. Working with Yurochkin in Cambridge, they got to meet other IBM researchers. They also learned how to go from an initial hypothesis to verifiable results, diligently pushing through each obstacle.
“Misha laid out the timeline with clear, achievable milestones, which allowed our team to systematically tackle each step,” says Ishita Kakkar, a senior at the University of Massachusetts, Amherst. “Through him, I learned the importance of meticulous planning for any research project.”
Many of the students later sought Yurochkin out for career guidance as well as job and graduate school recommendations. “He gave me insights into how to frame my research interests and align them with faculty expertise,” says Kakkar. “That helped me craft a more compelling statement of purpose.” Kakkar was recently accepted to the University of Wisconsin, Madison’s competitive PhD program in computer science.
The benefits go both ways. Working with students clearly energizes Yurochkin. He breaks into a smile as he describes past projects and the jobs and graduate programs his students have gone on to. “The goal is to teach them how to think creatively,” he says, “how to think like a researcher.”
He doesn’t play Magic much anymore, but when asked where his research is headed next, he barely skipped a beat, as if he had just drawn the winning card that ends the match.
“Reinforcement learning,” he said. “Through reinforcement learning you can teach models new things not by showing them more examples, but by getting them to solve the problem themselves and verify the solution. Scaling reinforcement learning will take LLMs to the next level.”
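The loop he describes — solve, verify, reinforce — can be sketched in miniature. This is a hypothetical toy, not how LLMs are actually trained: a "policy" chooses between two candidate strategies for an arithmetic task, and a programmatic verifier, rather than more labeled examples, supplies the reward.

```python
import random

random.seed(0)

# The verifier supplies the learning signal — no new labeled data needed.
def verifier(problem, answer):
    a, b = problem
    return answer == a + b

# Hypothetical "policy": a preference weight over two candidate strategies.
strategies = {"add": lambda a, b: a + b,
              "concat": lambda a, b: int(f"{a}{b}")}
weights = {name: 1.0 for name in strategies}

for _ in range(200):
    problem = (random.randint(0, 9), random.randint(0, 9))
    # Sample a strategy in proportion to its current weight.
    r = random.uniform(0, sum(weights.values()))
    for name, w in weights.items():
        r -= w
        if r <= 0:
            break
    answer = strategies[name](*problem)
    # Reinforce strategies whose answers the verifier accepts.
    weights[name] *= 1.1 if verifier(problem, answer) else 0.9
```

Over repeated attempts, the strategy that actually solves the problem accumulates weight — the model teaches itself from verified successes rather than from shown examples.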