Real-time detection of verbal protest (crying induced by sensory overload) in children with autism is a first step towards understanding the precursors of challenging behaviors associated with autism. Detecting verbal protest is useful both to autism researchers exploring just-in-time intervention techniques and to researchers working on audio event detection in routine living environments. In this paper, we examine, adapt, and improve upon two techniques for verbal protest recognition, tailoring them to children with autism spectrum disorder (ASD). The first technique is a Gaussian Mixture Model (GMM) with stacking. The second is a Convolutional Neural Network (CNN) trained on log Mel filter bank (LMFB) features. We then evaluate accuracy, focusing on real-world false positive rates and on minimizing dataset biases through the introduction of noise and input perturbation.