Maximizing the data utility of a data archiving & querying system through joint coding and scheduling
Abstract
We study a joint scheduling and coding problem for collecting multi-snapshot spatial data in a resource-constrained sensor network. Motivated by a distributed coding scheme for single-snapshot data collection [7], we generalize the scenario to include multiple snapshots and general coding schemes. Associating a utility function with the recovered data, we aim to maximize the expected utility gain through joint coding and scheduling. We first assume non-mixed coding, where coding is allowed only among data of the same snapshot. We study how to schedule (or prioritize) the codewords from multiple snapshots under an archiving model in which data from all snapshots are of interest and utilities are additive. We formulate the scheduling problem as a Multi-Armed Bandit (MAB) problem, derive the optimal solution using Gittins indices, and identify conditions under which a greedy algorithm is optimal. We then consider random mixed coding, where data from different snapshots are randomly coded together. We generalize the growth codes of [7] to random mixed coding based on arbitrary linear codes and show that there exists an optimal degree of coding. Finally, we discuss various practical issues and the impact of buffer size on performance. Copyright 2007 ACM.
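
As a rough illustration of why an optimal coding degree can exist, the sketch below considers a simplified single-snapshot setting that is an assumption of this example rather than the model analyzed in the paper: a codeword is the XOR of d distinct source symbols chosen uniformly at random out of n, and the receiver has already recovered r of them. Such a codeword is immediately decodable only when exactly d - 1 of its symbols are already known, which is a hypergeometric probability. The function names `prob_new_symbol` and `best_degree` are hypothetical.

```python
from math import comb


def prob_new_symbol(n, r, d):
    """Probability that a degree-d codeword yields exactly one new symbol.

    Assumes (for illustration only) that the d symbols are distinct and
    drawn uniformly from n, and that r of the n are already recovered.
    The codeword is immediately decodable when exactly d - 1 of its
    symbols are among the r recovered ones.
    """
    if d < 1 or d > n:
        return 0.0
    return comb(r, d - 1) * (n - r) / comb(n, d)


def best_degree(n, r):
    """Degree in 1..n that maximizes prob_new_symbol (ties go to the smallest)."""
    return max(range(1, n + 1), key=lambda d: prob_new_symbol(n, r, d))


if __name__ == "__main__":
    n = 10
    for r in range(n):
        d = best_degree(n, r)
        print(f"recovered={r:2d}  best degree={d:2d}  "
              f"p={prob_new_symbol(n, r, d):.3f}")
```

Under these assumptions the best degree grows as more symbols are recovered: uncoded symbols are most useful at the start, and higher-degree codewords become preferable later on. This captures only the immediate-decoding intuition and ignores buffering, multiple snapshots, and utility weights.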