In this paper, we propose a novel algorithm that jointly models motion and context information to detect abnormal events in crowded scenes. Context pattern information is extracted by computing volume local binary patterns on three orthogonal planes (LBP-TOP) between each local target area and its surrounding areas, and is explicitly taken into account when localizing abnormality. To capture motion information, we introduce a novel feature descriptor, the Multi-scale Histogram of Frequency Coefficient, obtained by applying the Fourier transform to the extracted dense trajectories. For detection, the sparse reconstruction cost with respect to a learned event dictionary is used to classify local events as normal or abnormal. Experiments on three benchmark datasets show that the proposed algorithm outperforms many related state-of-the-art methods.
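To make the detection step concrete, the following is a minimal sketch of how a sparse reconstruction cost might be computed for a feature vector against a dictionary of normal events. It is an illustration under stated assumptions, not the implementation used in this paper: the function names, the ISTA solver, the regularization weight `lam`, and the toy dictionary are all hypothetical choices, and the dictionary is fixed by hand rather than learned.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_reconstruction_cost(x, D, lam=0.1, n_iter=200):
    """Approximate min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    x   : feature vector of one local event (hypothetical descriptor)
    D   : dictionary whose columns are atoms for normal events
    lam : sparsity weight (assumed value, not taken from the paper)
    Returns the objective value at the final iterate; a high value
    suggests the event is poorly explained by the normal dictionary.
    """
    step_bound = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / step_bound, lam / step_bound)
    residual = x - D @ a
    return 0.5 * residual @ residual + lam * np.abs(a).sum()


# Toy dictionary (assumption): two atoms spanning the first two coordinates.
D = np.eye(4)[:, :2]

normal = np.array([1.0, 0.5, 0.0, 0.0])    # lies in the span of the atoms
abnormal = np.array([0.0, 0.0, 1.0, 0.5])  # orthogonal to every atom

cost_normal = sparse_reconstruction_cost(normal, D)
cost_abnormal = sparse_reconstruction_cost(abnormal, D)
```

In this toy setting the sample that the dictionary can represent receives a lower cost than the sample it cannot, so thresholding the cost separates the two. Any such threshold would have to be chosen on validation data and is not specified here.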