Sahil Suneja, Yufan Zhuang, et al.
ACM TOSEM
Building reliable applications that leverage large language models (LLMs) remains a significant challenge. While LLMs offer impressive capabilities across diverse tasks, their outputs are often inaccurate and come with no clear measure of confidence. This uncertainty compounds in flows that chain multiple calls to LLMs and other tools, making it difficult for developers and end-users to trust the results. This paper introduces a probabilistic language for programming LLM-based flows. It enables developers to quantify and propagate uncertainty throughout the application's flow, and to experiment with different inference scaling techniques without adding a single line of code beyond the flow's logic. We present an experimental study that demonstrates this capability, and a case study that builds a theorem-proving agent for the Rocq theorem prover.
Chih-kai Ting, Karl Munson, et al.
AAAI 2023
Toshiaki Yasue, Kohichi Ono, et al.
ICSE 2026
Amit Dhurandhar, Vijil Vijil, et al.
ICML 2026