Publication
Big Data 2014
Conference paper
The best of two worlds: Integrating IBM InfoSphere Streams with Apache YARN
Abstract
The seamless confluence of data in motion and data at rest has the potential to redefine the Big Data analytics landscape in a diverse range of domains. To make this happen, existing data-intensive computing frameworks need to be repurposed and integrated at the control, data, and management levels. To this end, we present the system-level integration of IBM InfoSphere Streams with Apache YARN. Our design leverages the key differentiating features of the two frameworks to blend high-throughput batch processing with near-line-rate, low-latency stream processing. In addition, both frameworks can share resources while still offering the interfaces their users are accustomed to. Using two real-world examples, we illustrate how such a system can be used in production.