Scheduling intense applications most 'surprising' first
Abstract
Certain streaming applications must perform sophisticated analytics on arriving streams of data within bounded time. Such applications have the interesting characteristic that the total amount of work that could be performed is unbounded. We show how recent results from algorithmic theory help in scheduling such applications, because they allow synopses of unprocessed data to be created efficiently. These synopses can then be used to schedule the processing of the stream. In particular, we describe a preliminary implementation of a scheduler that optimizes the information rate available to applications by estimating the entropy of arriving streams. We describe the theory underlying such a scheduler and, by outlining a basic but functional implementation in the Java programming language, illustrate how existing programming models can be extended to accommodate it.
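To make the "most surprising first" idea concrete, the following is a minimal sketch of an entropy-ordered queue in Java. It is not the implementation described in this paper: the class and method names (SurpriseFirstScheduler, Batch, submit, next) are hypothetical, and the entropy is computed exactly from symbol counts over a small batch, standing in for the streaming synopses that would estimate it without processing the data in full.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Illustrative sketch only: all names here are hypothetical, not the paper's API.
final class SurpriseFirstScheduler {

    // A pending batch of unprocessed symbols together with its entropy score.
    static final class Batch {
        final String name;
        final int[] symbols;
        final double entropyBits;

        Batch(String name, int[] symbols) {
            this.name = name;
            this.symbols = symbols;
            // Assumption: exact counting replaces a streaming entropy estimate.
            this.entropyBits = empiricalEntropy(symbols);
        }
    }

    // Max-heap on entropy, so the highest-entropy batch is dequeued first.
    private final PriorityQueue<Batch> queue = new PriorityQueue<>(
            Comparator.comparingDouble((Batch b) -> b.entropyBits).reversed());

    // Empirical Shannon entropy in bits: H = -sum_i p_i * log2(p_i),
    // where p_i is the observed relative frequency of symbol i.
    static double empiricalEntropy(int[] symbols) {
        if (symbols.length == 0) {
            return 0.0;
        }
        Map<Integer, Integer> counts = new HashMap<>();
        for (int s : symbols) {
            counts.merge(s, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / symbols.length;
            h -= p * (Math.log(p) / Math.log(2.0));
        }
        return h;
    }

    // Enqueue a batch; its entropy score is computed once, on submission.
    void submit(String name, int[] symbols) {
        queue.add(new Batch(name, symbols));
    }

    // Dequeue the batch with the highest entropy score, or null if none is pending.
    Batch next() {
        return queue.poll();
    }
}
```

With three submitted batches {7,7,7,7}, {0,1,0,1} and {1,2,3,4}, successive calls to next() return the four-symbol uniform batch first (2 bits), then the two-symbol batch (1 bit), then the constant batch (0 bits). A scheduler operating under a time bound would stop dequeuing when its budget expires, so the least informative batches are the ones left unprocessed.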