Publication
Software - Practice and Experience
Paper

Tools and strategies for debugging distributed stream processing applications

View publication

Abstract

Distributed data stream processing applications are often characterized by data flow graphs consisting of a large number of built-in and user-defined operators connected via streams. These flow graphs are typically deployed on a large set of nodes. The data processing is carried out on-the-fly, as tuples arrive at possibly very high rates, with minimum latency. It is well known that developing and debugging distributed, multithreaded, and asynchronous applications, such as stream processing applications, can be challenging. Thus, without domain-specific debugging support, developers struggle when debugging distributed applications. In this paper, we describe tools and language support to support debugging distributed stream processing applications. Our key insight is to view debugging of stream processing applications from four different, but related, perspectives. First, debugging the semantics of the application involves verifying the operatorlevel composition and inspecting the flows at the logical level. Second, debugging the user-defined operators involves traditional source-code debugging, but strongly tied to the stream-level interactions. Third, debugging the deployment details of the application require understanding the runtime physical layout and configuration of the application. Fourth, debugging the performance of the application requires inspecting various performance metrics (such as communication rates, CPU utilization, etc.) associated with streams, operators, and nodes in the system. In light of this characterization, we developed several tools such as a debugger-aware compiler and an associated stream debugger, composition and deployment visualizers, and performance visualizers, as well as language support, such as configuration knobs for logging and tracing, deployment configurations such as operator-to-process and process-to-node mappings, monitoring directives to inspect streams, and special sink adapters to intercept and dump streaming data to files and sockets, to name a few. We describe these tools in the context of Spade-a language for creating distributed stream processing applications, and System S-a distributed stream processing middleware under development at the IBM Watson Research Center. © 2009 by John Wiley & Sons, Ltd.