Breadth-First Search (BFS) is one of the most fundamental graph algorithms and serves as a building block for many others. Our new method for distributed parallel BFS can traverse a graph with one trillion vertices within half a second on large supercomputers such as the K-Computer. Using our proposed algorithm, the K-Computer was ranked first in the Graph500 list in June and November 2015 and in June 2016, achieving 38,621.4 GTEPS with all 82,944 available nodes. Starting from the hybrid-BFS algorithm by Beamer, we devise a set of optimizations for scaling to an extreme number of nodes, including a new efficient graph data structure and optimization techniques such as vertex reordering and load balancing. Performance evaluation on the K-Computer shows that our new BFS is 3.19 times faster on 30,720 nodes than a base version that uses the best previously known techniques.
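For readers unfamiliar with the hybrid-BFS idea referred to above, the following is a minimal single-process sketch in Python. It is not the distributed implementation described in this work: it only illustrates switching, per level, between a top-down step (frontier vertices scan their neighbors) and a bottom-up step (unvisited vertices look for a parent in the frontier). The function name `hybrid_bfs`, the threshold parameter `alpha`, and the simplified switching rule are assumptions made for illustration, and the graph is assumed to be undirected with a symmetric adjacency list.

```python
from typing import List, Tuple


def hybrid_bfs(adj: List[List[int]], source: int,
               alpha: int = 14) -> Tuple[List[int], List[int]]:
    """Illustrative direction-switching BFS on an undirected graph.

    adj[v] lists the neighbors of v (symmetric adjacency is assumed).
    Returns (parent, depth); unreached vertices have parent -1, depth -1.
    `alpha` is a hypothetical tuning knob: larger values favor bottom-up.
    """
    n = len(adj)
    parent = [-1] * n
    depth = [-1] * n
    parent[source] = source
    depth[source] = 0
    frontier = {source}
    # Total degree of vertices that have not yet entered any frontier.
    unexplored_edges = sum(len(neighbors) for neighbors in adj)
    level = 0

    while frontier:
        frontier_edges = sum(len(adj[v]) for v in frontier)
        unexplored_edges -= frontier_edges
        next_frontier = set()
        level += 1

        if frontier_edges * alpha > unexplored_edges:
            # Bottom-up step: each unvisited vertex searches its own
            # neighbors for any member of the current frontier.
            for v in range(n):
                if parent[v] == -1:
                    for u in adj[v]:
                        if u in frontier:
                            parent[v] = u
                            depth[v] = level
                            next_frontier.add(v)
                            break  # one parent is enough
        else:
            # Top-down step: each frontier vertex claims its
            # unvisited neighbors.
            for u in frontier:
                for v in adj[u]:
                    if parent[v] == -1:
                        parent[v] = u
                        depth[v] = level
                        next_frontier.add(v)

        frontier = next_frontier

    return parent, depth


if __name__ == "__main__":
    # Small undirected example; vertex 5 is isolated.
    graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3], []]
    parents, depths = hybrid_bfs(graph, 0)
    print(depths)
```

Setting `alpha=0` in this sketch disables the bottom-up step entirely, which gives a conventional top-down BFS to compare against. The depths are identical in both modes, although the chosen parents may differ because any frontier neighbor is a valid parent.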