Publication
DEBS 2010
Conference paper
Placement of replicated tasks for distributed stream processing systems
Abstract
We propose an algorithm for placing the tasks of streaming data flows onto servers within a message-oriented middleware in which certain tasks can be replicated. Our work is centered on the idea that certain transformations are stateless and can therefore be replicated. Replication in this case allows workloads to be partitioned among multiple machines, so that message processing can be parallelized and performance improved. For this purpose we propose a guided replication approach that iteratively computes the optimal placement of replicas, with each iteration of the algorithm taking as input the optimal solutions computed in the previous one. As a result, system performance improves consistently from one iteration to the next and eventually converges, as our simulation results show. We demonstrate, through simulation experiments with both simple and complex task flow graphs and network topologies, that introducing our replication mechanism can lead to improvements in runtime performance. When system resources are scarce, the benefits of applying our replication mechanism are even greater. © 2010 ACM.
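The iterative idea in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper, which the abstract does not spell out. It is a simple greedy stand-in under several assumptions: a task's load is split evenly across its replicas, performance is measured as the load on the busiest server, and network topology is ignored. The names `Task`, `server_loads`, `bottleneck`, and `guided_replication`, as well as the example tasks and servers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    load: float       # hypothetical work units per second
    stateless: bool   # only stateless tasks may be replicated


def server_loads(placement, tasks, servers):
    """Load per server; a task's load is split evenly across its replicas (assumption)."""
    loads = {s: 0.0 for s in servers}
    for name, hosts in placement.items():
        share = tasks[name].load / len(hosts)
        for host in hosts:
            loads[host] += share
    return loads


def bottleneck(placement, tasks, servers):
    """Assumed cost metric: load on the busiest server (lower is better)."""
    return max(server_loads(placement, tasks, servers).values())


def guided_replication(tasks, servers, initial, max_iters=20):
    """Greedy sketch: each iteration starts from the previous placement and
    adds the single replica that lowers the bottleneck the most."""
    placement = {name: list(hosts) for name, hosts in initial.items()}
    history = [bottleneck(placement, tasks, servers)]
    for _ in range(max_iters):
        best, best_cost = None, history[-1]
        for name, hosts in placement.items():
            if not tasks[name].stateless:
                continue  # stateful tasks keep their single instance
            for server in servers:
                if server in hosts:
                    continue
                candidate = dict(placement)
                candidate[name] = hosts + [server]
                cost = bottleneck(candidate, tasks, servers)
                if cost < best_cost - 1e-12:  # accept strict improvements only
                    best, best_cost = candidate, cost
        if best is None:
            break  # no replica helps any more: the procedure has converged
        placement = best
        history.append(best_cost)
    return placement, history


# Hypothetical three-task flow on three servers.
tasks = {
    "parse": Task("parse", 6.0, stateless=True),
    "filter": Task("filter", 4.0, stateless=True),
    "aggregate": Task("aggregate", 3.0, stateless=False),
}
servers = ["s1", "s2", "s3"]
initial = {"parse": ["s1"], "filter": ["s1"], "aggregate": ["s2"]}

placement, history = guided_replication(tasks, servers, initial)
print(placement)
print(history)
```

In this toy setup the bottleneck load never increases from one iteration to the next, and the loop stops once no additional replica lowers it, which mirrors the convergence behaviour the abstract describes. A realistic version would also need to account for link bandwidth and the cost of splitting and merging replicated streams.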