DeviceMien: Network device behavior modeling for identifying unknown IoT devices
With the explosion of IoT device use, networks are becoming more vulnerable to attack. Network administrators need better tools to verify and discover these devices in order to minimize attack risk. Existing tools provide rule-based assessment capabilities that cannot keep pace with the proliferation of devices. Current techniques demonstrate that, given a rich set of labeled packet traces, one could design a pipeline that identifies all the devices in that trace with over 99% accuracy [30, 32]. However, it has also been observed that such techniques are brittle when no labels are available. More perniciously, they provide false confidence scores about the label they do ascribe to a sample. This paper introduces a probabilistic framework for providing meaningful feedback in device identification, particularly when the device has not been previously observed. In our work, we use stacked autoencoders to automatically learn features from device traffic, learn the classes of traffic observed, and probabilistically model each device as a distribution of traffic classes. Our experiments show that we are able to identify previously seen devices after an average of only 18.9 TCP-flow samples, with 100% accuracy for devices where at least 50 samples are observed. We also show that we can distinguish between two broad classes of devices, IoT and Non-IoT, by examining the average number of flow classes observed over a set of samples. Our experiments show that we can infer the correct class of unseen devices with an average F1 score of over 82% and 70% accuracy.
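To make the idea of modeling a device as a distribution of traffic classes concrete, the following is a minimal Python sketch, not the implementation evaluated in this paper. It assumes that each TCP flow has already been mapped to an integer flow-class label by an upstream step (for example, autoencoder features followed by clustering), which is not shown. The function names, the Laplace smoothing constant `alpha`, the uniform prior over devices, and the toy data are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def class_distribution(flow_classes, num_classes, alpha=1.0):
    """Estimate a smoothed categorical distribution over flow classes.

    Laplace smoothing (alpha) is an assumption of this sketch; it keeps
    unseen classes from having zero probability.
    """
    counts = Counter(flow_classes)
    total = len(flow_classes) + alpha * num_classes
    return [(counts.get(k, 0) + alpha) / total for k in range(num_classes)]


def log_likelihood(flow_classes, dist):
    """Log-probability of the observed flow classes under one device model."""
    return sum(math.log(dist[k]) for k in flow_classes)


def identify(flow_classes, device_models):
    """Return the most likely known device and a posterior over all of them.

    Assumes a uniform prior over devices (hypothetical choice).
    """
    scores = {name: log_likelihood(flow_classes, dist)
              for name, dist in device_models.items()}
    top = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    norm = sum(weights.values())
    posterior = {name: w / norm for name, w in weights.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


def mean_distinct_classes(windows):
    """Average number of distinct flow classes seen per set of samples."""
    return sum(len(set(w)) for w in windows) / len(windows)


# Toy training data: flow-class labels per device (hypothetical values).
NUM_CLASSES = 5
training = {
    "camera": [0, 0, 0, 1, 0, 0, 1, 0],
    "laptop": [0, 1, 2, 3, 4, 2, 3, 1],
}
models = {name: class_distribution(flows, NUM_CLASSES)
          for name, flows in training.items()}

best, posterior = identify([0, 0, 1, 0], models)
camera_spread = mean_distinct_classes([[0, 0, 1], [0, 0, 0]])
laptop_spread = mean_distinct_classes([[0, 1, 2], [3, 4, 2]])
```

In this sketch the posterior gives a calibrated-looking score only relative to the known devices; deciding that a sample belongs to no known device, and choosing the threshold on the average number of distinct flow classes that separates IoT from Non-IoT devices, would both require additional rules that are not specified here.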