Unsupervised outlier detection is a vital task and has high impact on a wide variety of applications domains, such as image analysis and video surveillance. It also gains long-standing attentions and has been extensively studied in multiple research areas. Detecting and taking action on outliers as quickly as possible are imperative in order to protect network and related stakeholders or to maintain the reliability of critical systems. However, outlier detection is difficult due to the one class nature and challenges in feature construction. Sequential anomaly detection is even harder with more challenges from temporal correlation in data, as well as the presence of noise and high dimensionality. In this paper, we introduce a novel deep structured framework to solve the challenging sequential outlier detection problem. We use autoencoder models to capture the intrinsic difference between outliers and normal instances and integrate the models to recurrent neural networks that allow the learning to make use of previous context as well as make the learners more robust to warp along the time axis. Furthermore, we propose to use a layerwise training procedure, which significantly simplifies the training procedure and hence helps achieve efficient and scalable training. In addition, we investigate a fine-tuning step to update all parameters set by incorporating the temporal correlation in the sequence. We further apply our proposed models to conduct systematic experiments on five real-world benchmark data sets. Experimental results demonstrate the effectiveness of our model, compared with other state-of-the-art approaches.