We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale dis- tributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to prob- lems that are hard to statically load balance. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and re- sult collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al . Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the par- allel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well - achieving nearly linear speedup on different computer archi- tectures (Power, Blue Gene/Q, and K) - up to 16K cores. Copyright © 2014 ACM.