Trace-driven simulation is an important aid in the performance analysis of computer systems. Capturing the address traces that drive these simulations, however, is a difficult problem on parallel processor architectures. A technique termed TRAPEDS modifies executable code (at the assembly language level) so that the address trace is collected dynamically as the code executes. TRAPEDS has recently been implemented on both a hypercube multicomputer and a shared-memory multiprocessor. Particular attention is given to strategies for efficiently and accurately collecting traces from both classes of parallel machines. The iPSC/2 hypercube multicomputer implementation traces both user and system code, and performs simulation on the fly to avoid large storage costs. Strategies are detailed for mitigating address trace distortion when collecting operating system traces. The Encore Multimax multiprocessor implementation uses a timer-based approach to reflect the interleaving of the processor traces and stores the traces to disk. Time and space overhead results are presented for both TRAPEDS implementations. Experimental cache simulation results derived from iPSC/2 address traces are presented to illustrate the importance of tracing operating system references. © 1992.
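
The idea of simulating on the fly rather than storing the trace can be illustrated with a minimal sketch. It is not the TRAPEDS simulator, whose details the abstract does not give. It assumes a direct-mapped cache, and every name in it (`DirectMappedCache`, `simulate_on_the_fly`, `synthetic_trace`) is hypothetical. A generator stands in for the instrumented program: each address is consumed as soon as it is produced, so no trace is ever written to storage.

```python
from typing import Iterable, Iterator, List, Optional


class DirectMappedCache:
    """Hypothetical direct-mapped cache model (an assumption, not from the paper)."""

    def __init__(self, num_lines: int, line_size: int) -> None:
        self.num_lines = num_lines
        self.line_size = line_size
        # One stored tag per cache line; None means the line is empty.
        self.tags: List[Optional[int]] = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Simulate one memory reference; return True on a hit."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: the referenced block replaces whatever occupied the line.
        self.tags[index] = tag
        self.misses += 1
        return False


def simulate_on_the_fly(trace: Iterable[int], cache: DirectMappedCache) -> float:
    """Feed each address to the cache as it arrives and return the miss ratio."""
    for address in trace:
        cache.access(address)
    total = cache.hits + cache.misses
    return cache.misses / total if total else 0.0


def synthetic_trace() -> Iterator[int]:
    """Stand-in for instrumented code: two word-by-word sweeps over 64 bytes."""
    for _ in range(2):
        for address in range(0, 64, 4):
            yield address


if __name__ == "__main__":
    cache = DirectMappedCache(num_lines=4, line_size=16)
    ratio = simulate_on_the_fly(synthetic_trace(), cache)
    print(f"hits={cache.hits} misses={cache.misses} miss ratio={ratio}")
```

Because the simulator only needs an iterable of addresses, the same loop could in principle be driven by a stored trace read back from disk, which is the alternative trade-off the Encore Multimax implementation makes.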