Scalable analytics to detect DNS misuse for establishing stealthy communication channels

Douglas L. Schales; Jiyong Jang; Ting Wang; Xin Hu; Dhilung Kirat; B. Wuest; Marc Stoecklin

doi:10.1147/JRD.2016.2557639

IBM J. Res. Dev.

Paper

01 Jul 2016

Abstract

The Domain Name System (DNS) protocol is one of the few application protocols that are allowed to cross organizations' network perimeters. However, comprehensive monitoring of DNS traffic has often been overlooked in many organizations' cybersecurity strategies. As such, DNS provides a highly attractive channel for advanced threat actors and botnet operators to establish hard-to-block and stealthy communication channels between infected devices and command-and-control (C&C) infrastructures. Fast-fluxing (FF) and domain name generation algorithms (DGAs) are two well-known public DNS exploitation techniques for building agile C&C infrastructures. The detection of FF and DGA domain names is a big data problem, as it requires analyzing millions of DNS queries and replies over extended time periods. In this paper, we propose two algorithms to perform DNS analytics and effectively detect FF and DGA domain names. More importantly, we describe how the algorithms are implemented using two big data processing models: MapReduce and Feature Collection and Correlation Engine. The proposed algorithms and implementations are iterative and scale over long analysis periods. We describe the implementations and provide an evaluation complemented with case studies on 50 days of real-world DNS data consisting of more than 40 billion events, collected within a large corporate network.
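The abstract does not spell out the detection algorithms, so the following is only a minimal sketch of how a MapReduce-style aggregation over DNS replies might surface fast-flux candidates. The record layout, the function names (`map_record`, `combine`, `map_reduce`, `flag_fast_flux`), and the thresholds (`min_ips`, `max_ttl`) are all assumptions made for illustration and are not taken from the paper. The heuristic used here (many distinct resolved addresses combined with short TTLs) is a commonly cited fast-flux indicator, not necessarily the authors' method.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical DNS reply records: (domain, resolved_ip, ttl_seconds).
# Addresses come from the documentation ranges; the data is illustrative only.
RECORDS = [
    ("cdn.example.com", "192.0.2.10", 3600),
    ("cdn.example.com", "192.0.2.10", 3600),
    ("flux.example.net", "198.51.100.1", 60),
    ("flux.example.net", "198.51.100.2", 60),
    ("flux.example.net", "203.0.113.7", 30),
    ("flux.example.net", "203.0.113.9", 30),
]


def map_record(record):
    """Map step: emit (domain, partial summary) for one DNS reply."""
    domain, ip, ttl = record
    return domain, ({ip}, ttl, 1)  # (distinct IPs, minimum TTL, reply count)


def combine(a, b):
    """Reduce step: merge two partial summaries for the same domain.

    The merge is associative and commutative, so summaries computed for
    separate time windows can be combined later without rescanning raw data.
    """
    return (a[0] | b[0], min(a[1], b[1]), a[2] + b[2])


def map_reduce(records):
    """Group mapped values by domain, then reduce each group to one summary."""
    grouped = defaultdict(list)
    for key, value in map(map_record, records):
        grouped[key].append(value)
    return {key: reduce(combine, values) for key, values in grouped.items()}


def flag_fast_flux(summary, min_ips=3, max_ttl=300):
    """Flag domains with many distinct IPs and a short minimum TTL.

    The thresholds are assumed values chosen for this example.
    """
    return sorted(
        domain
        for domain, (ips, ttl, _count) in summary.items()
        if len(ips) >= min_ips and ttl <= max_ttl
    )


if __name__ == "__main__":
    print(flag_fast_flux(map_reduce(RECORDS)))
```

Because `combine` only merges partial summaries, a per-day summary could in principle be folded into a running one, which is one way to read the abstract's claim that the implementations are iterative over long analysis periods.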
