Early fault detection is crucial for reducing machine downtime and has a high impact on a wide variety of industrial applications. However, it still faces the following challenges: 1) extracting features from incipient fault signals; 2) detecting anomalies while accounting for correlation in sequential data; and 3) enhancing the reliability of fault alarms. In this paper, we introduce a novel deep-structured framework to solve the early fault detection problem. First, system variation is measured by the deviation value generated by a current-feature extraction model based on a deep neural network (DNN) and a distribution estimator based on a long short-term memory (LSTM) network. A DNN can represent the complicated, intrinsic distribution of data, which makes it suitable for handling early fault data masked by heavy noise. An LSTM can discover temporal dependencies in high-dimensional sequential data, which allows the distribution estimator to make use of previous context information and makes it more robust to warping along the time axis. Second, a circular indirect alarm assessment strategy is designed to collect deviation values and confirm the appearance of a fault only when a specified confidence level is reached. Experimental results on typical real-world bearing data sets demonstrate the effectiveness and reliability of our model.
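
To make the alarm-confirmation idea concrete, the following minimal Python sketch collects deviation values in a fixed-size circular buffer and confirms a fault only when the fraction of recent values exceeding a threshold reaches a chosen confidence level. This illustrates the general idea only and is not the strategy proposed in the paper, whose details are not given here. The class name `AlarmAssessor`, the parameters `window`, `deviation_threshold`, and `confidence`, and the fraction-based rule are all assumptions made for illustration.

```python
from collections import deque


class AlarmAssessor:
    """Hypothetical alarm confirmation over a circular buffer of deviations."""

    def __init__(self, window=20, deviation_threshold=3.0, confidence=0.9):
        # deque with maxlen acts as a circular buffer: the oldest entry
        # is discarded automatically once the buffer is full.
        self.buffer = deque(maxlen=window)
        self.deviation_threshold = deviation_threshold  # assumed cut-off
        self.confidence = confidence  # assumed fraction needed to confirm

    def update(self, deviation):
        """Record one deviation value; return True if a fault is confirmed."""
        self.buffer.append(deviation > self.deviation_threshold)
        if len(self.buffer) < self.buffer.maxlen:
            return False  # not enough evidence collected yet
        exceed_fraction = sum(self.buffer) / len(self.buffer)
        return exceed_fraction >= self.confidence


def run(deviations, **kwargs):
    """Feed a sequence of deviation values and return the alarm decisions."""
    assessor = AlarmAssessor(**kwargs)
    return [assessor.update(d) for d in deviations]
```

Under these assumptions, a single isolated spike in the deviation value does not raise an alarm, whereas a sustained run of large deviations does once it fills enough of the buffer.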