Pervasive and Mobile Computing

ClariSense+: An enhanced traffic anomaly explanation service using social network feeds


The explosive growth in social networks that publish real-time content raises the question of whether their feeds can complement traditional sensors to achieve augmented sensing capabilities. One such capability is to explain anomalous sensor readings. In our previous conference paper, we built an automated anomaly clarification service, called ClariSense, that explains sensor anomalies using social network feeds (from Twitter). In this extended work, we present an enhanced anomaly explanation system that augments our base algorithm by considering both (i) the credibility of social feeds and (ii) the spatial locality of detected anomalies. The work is geared specifically toward describing small-footprint anomalies, such as vehicular traffic accidents. The original system used information gain to select the more informative microblog items to explain physical sensor anomalies. In this paper, we show that our ability to explain small-footprint anomalies improves significantly when we account for information credibility and further discriminate among high-information-gain items according to the size of their spatial footprint. Hence, items that lack sufficient corroboration, and items whose spatial footprint in the blogosphere is not specific to the approximate location of the physical anomaly, receive less consideration. We briefly demonstrate the workings of such a system by considering a variety of real-world anomalous events and comparing their causes, as identified by ClariSense+, to ground truth for validation. We then evaluate the system more systematically using vehicular traffic anomalies. Specifically, we consider real-time traffic flow feeds shared by the California traffic system. When flow anomalies are detected, our system automatically diagnoses their root cause by correlating the anomaly with feeds on Twitter. For evaluation purposes, the identified cause is then retroactively compared to official traffic and incident reports that we take as ground truth. Results show a strong correspondence between our automatically selected explanations and ground-truth data.
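
To make the information-gain step concrete, the following is a minimal sketch of how a keyword could be scored by how well it separates microblog items posted during an anomaly from items posted during normal conditions. It is an illustration under stated assumptions, not the ClariSense or ClariSense+ implementation: the function names, the representation of each item as a set of tokens, and the toy data are all hypothetical, and the credibility and spatial-footprint weighting described above is not modeled.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a two-class count pair."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(term, anomaly_docs, normal_docs):
    """Information gain of `term` for the anomaly/normal class split.

    Each document is assumed to be a set of tokens (hypothetical format).
    """
    total = len(anomaly_docs) + len(normal_docs)
    if total == 0:
        return 0.0

    # Count documents that contain the term in each class.
    a_with = sum(term in d for d in anomaly_docs)
    n_with = sum(term in d for d in normal_docs)
    a_without = len(anomaly_docs) - a_with
    n_without = len(normal_docs) - n_with

    # Class entropy before splitting on the term.
    base = entropy(len(anomaly_docs), len(normal_docs))

    # Expected class entropy after splitting on term presence.
    p_with = (a_with + n_with) / total
    conditional = (p_with * entropy(a_with, n_with)
                   + (1 - p_with) * entropy(a_without, n_without))
    return base - conditional


# Toy data (invented for illustration only).
anomaly_docs = [{"crash", "i5"}, {"crash", "lane"}]
normal_docs = [{"coffee", "i5"}, {"sunny", "lane"}]

ranked = sorted({"crash", "i5", "lane"},
                key=lambda t: information_gain(t, anomaly_docs, normal_docs),
                reverse=True)
```

In this toy data, "crash" appears only in the anomaly-window items and so receives the highest score, while "i5" appears equally in both sets and carries no information about the anomaly.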