Vectorization for SIMD architectures with alignment constraints
Alexandre E. Eichenberger, Peng Wu, et al.
PLDI 2004
This paper describes an end-to-end system implementation of a transactional memory (TM) programming model on top of the hardware transactional memory (HTM) of the Blue Gene/Q (BG/Q) machine. The TM programming model supports most C/C++ programming constructs by combining a best-effort HTM with a complete software stack comprising the compiler, the kernel, and the TM runtime. An extensive evaluation of the STAMP and RMS-TM benchmark suites on BG/Q, the first of its kind, characterizes the behavior of TM workloads running on real hardware TM. The study yields several insights into the overhead and scalability of BG/Q HTM relative to sequential execution, coarse-grain locking, and software TM.
Dibyendu Das, Peng Wu
IPDPS 2010
Peng Wu, Maged M. Michael, et al.
CCPE
Peng Wu, Hiroshige Hayashizaki, et al.
OOPSLA 2011