In this paper we describe an approach to dynamically improving the progress of streaming applications on SMP multi-core systems. We show that run-time task duplication is an effective method for maximizing application throughput in the face of changes in available computing resources. Such changes cannot be fully handled by static optimizations. We derive a theoretical performance model to identify tasks in need of more computing resources. We propose two on-line algorithms that use indications from the performance model to detect computation bottlenecks. In these algorithms, a task can identify itself as a bottleneck using only its local data. The proposed technique is transparent to end programmers and portable to systems with fair scheduling. Our on-line detection algorithms can also be applied to other dynamic scenarios, for example, those involving run-time variation of workload. Our experiments using the StreamIt benchmarks show that the proposed run-time task duplication achieves considerable speedups over the multi-threaded baseline on a 16-core machine and in scenarios with a dynamically changing number of processing cores. We also show that our algorithms achieve better application throughput than alternative approaches for task duplication. © 2012 ACM.