This paper describes a holistic architecture for super-real-time multi-agent simulation platforms. We implement a complete, integrated simulation stack comprising a simulation runtime and an application layer that can be used for applications such as traffic simulation. With our prototype system, the first experiment tested performance scalability when simulating millions of agents on 1,536 CPU cores across 256 nodes. By compiling the X10-based agent simulation system into C++ with MPI, we ran 600 simulation steps in only 78 seconds, which is nearly 10 times faster than real time. We then simulated a national network spanning Japan with 100 million agents at near-real time using 128 nodes. This was the first attempt to handle such a large number of agents, and it shows that this infrastructure could be used for large-scale agent simulations in various fields. © 2013 IEEE.