This work presents a game-theoretic formulation of multi-agent curriculum learning that improves agent learning and yields insights into game equilibria. Learning is framed as a leader-follower cooperative game: the leader selects, from a set of candidate MDPs, the one best suited to the followers' current behavior. Each follower then solves its task with an algorithm that combines opponent modelling (estimates of the leader's and the other followers' actions) with reinforcement learning. We observe that, under this framework, agents need only a small number of epochs to converge to a desired solution, compared to a reinforcement learning baseline.
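The leader-follower loop described above can be sketched as follows. This is a minimal illustration, not the paper's method: the task set, the one-step reward structure, the leader's "largest remaining headroom" heuristic for choosing the next MDP, and the follower's frequency-based opponent model of the leader are all assumptions made for the sake of a runnable example.

```python
import random

random.seed(0)

# Each hypothetical "MDP" is a one-step task: the follower picks an action
# and receives a reward; harder tasks have more actions (sparser reward).
TASKS = {"easy": 2, "medium": 4, "hard": 8}  # task name -> number of actions

def follower_update(q, task, alpha=0.5, eps=0.1):
    """One epsilon-greedy Q-learning step on a one-step task; returns the reward."""
    n = TASKS[task]
    qs = q.setdefault(task, [0.0] * n)
    if random.random() < eps:
        a = random.randrange(n)                      # explore
    else:
        a = max(range(n), key=qs.__getitem__)        # exploit
    r = 1.0 if a == n - 1 else 0.0                   # last action is rewarding
    qs[a] += alpha * (r - qs[a])                     # tabular Q-learning update
    return r

q = {}                                   # follower's Q-tables, one per task
leader_counts = {t: 0 for t in TASKS}    # data for the follower's opponent model
recent = {t: 0.0 for t in TASKS}         # running average reward per task

for epoch in range(300):
    # Leader: pick the task with the largest remaining headroom (1 - reward),
    # a simple stand-in for "the best MDP given the followers' behaviour".
    task = max(TASKS, key=lambda t: 1.0 - recent[t])
    leader_counts[task] += 1
    r = follower_update(q, task)
    recent[task] = 0.9 * recent[task] + 0.1 * r

# Follower's opponent model of the leader: empirical task-choice frequencies.
total = sum(leader_counts.values())
model = {t: c / total for t, c in leader_counts.items()}
print(model)
```

In this sketch the curriculum emerges from the leader's choice rule: once the follower's running reward on a task rises, that task's headroom shrinks and the leader shifts training toward harder tasks.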