Shuang Chen, Herbert Freeman
International Journal of Pattern Recognition and Artificial Intelligence
We perform a detailed flop and bandwidth analysis of Jos Stam's Stable Fluids algorithm on the CPU, GPU, and Cell. In all three cases, we find that the algorithm is bandwidth-bound, with the cores sitting idle up to 96% of the time. Knowing this, we propose two modifications to accelerate the algorithm. The first is a Mehrstellen discretization for the pressure solver, which reduces the running time of the solver by a third. The second is a static caching scheme that eliminates roughly 99% of the random lookups in the advection stage; with this scheme, we observe a 2x speedup in the advection stage. Both modifications apply equally well to all three architectures.
Copyright © 2008 by the Association for Computing Machinery, Inc.
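To make the term "Mehrstellen discretization" concrete, the following is a minimal sketch of the classic two-dimensional nine-point compact Laplacian stencil. It is an illustration only, not the authors' implementation: the abstract does not say which variant or dimensionality the pressure solver uses, and the function name `mehrstellen_laplacian` and the list-of-lists grid layout are assumptions made for this example.

```python
def mehrstellen_laplacian(u, i, j, h):
    """Approximate the Laplacian of u at grid point (i, j) with the
    2D nine-point Mehrstellen stencil on a uniform grid of spacing h.

    u is a list of rows (hypothetical layout); (i, j) must be interior.
    Weights: 4 for edge neighbours, 1 for corner neighbours, -20 for
    the centre, all divided by 6 * h**2.
    """
    edges = u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
    corners = (u[i - 1][j - 1] + u[i - 1][j + 1]
               + u[i + 1][j - 1] + u[i + 1][j + 1])
    return (4.0 * edges + corners - 20.0 * u[i][j]) / (6.0 * h * h)


def sample_grid(n, h):
    """Sample u(x, y) = x**2 + y**2 on an n-by-n grid (Laplacian is 4)."""
    return [[(i * h) ** 2 + (j * h) ** 2 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    h = 0.5
    u = sample_grid(3, h)
    # The stencil is exact for this quadratic test function.
    print(mehrstellen_laplacian(u, 1, 1, h))
```

Compared with the usual five-point stencil, this one also reads the four corner neighbours, so it trades a few extra flops per grid point for higher accuracy. That trade is plausible for a bandwidth-bound solver, though how the reported one-third reduction in running time is actually achieved is described in the paper itself, not in this sketch.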
Robert Farrell, Rajarshi Das, et al.
AAAI-SS 2010
Wei Zhang, Timothy Wood, et al.
ICAC 2014
Chen-chia Chang, Wan-hsuan Lin, et al.
ICML 2025