An integrated framework for mining temporal logs from fluctuating events
Mining the time lags of hidden temporal dependencies from sequential data is important in many domains, including system management, stock market analysis, and climate monitoring. These time lags provide useful insight into sequential data and help predict its evolving trends. Traditional methods mainly analyze sequential items within a predefined time window, or employ statistical techniques to identify temporal dependencies from sequential data. However, existing methods struggle to find the time lags of temporal dependencies in real-world data, where time lags are fluctuating, noisy, and interleaved with each other. To identify temporal dependencies with time lags in this setting, this paper proposes an integrated framework from both system and algorithm perspectives. Specifically, a novel parametric model is introduced to model the noisy time lags for discovering temporal dependencies between events. Based on this parametric model, an efficient expectation-maximization approach is proposed to discover time lags by maximum likelihood. Furthermore, this paper contributes an approximation method for learning time lags that improves scalability with respect to the number of events without incurring a significant loss of accuracy.
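
As an illustration of the kind of expectation-maximization procedure described above, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that each event of type B is triggered by exactly one unknown event of type A, that the lag follows a Gaussian distribution with mean `mu` and variance `var`, and that every A event is a priori equally likely to be the trigger. The function name `lag_em`, its parameters, and the synthetic data are hypothetical.

```python
import math
import random


def lag_em(a_times, b_times, mu, var, iters=50):
    """Estimate the mean and variance of the A -> B time lag with EM.

    Assumption (not from the source): lag ~ Normal(mu, var), and each B
    event is caused by one A event chosen uniformly at random.
    """
    n = len(b_times)
    for _ in range(iters):
        # E-step: resp[j][i] is the posterior probability that A event i
        # triggered B event j under the current (mu, var).
        resp = []
        for b in b_times:
            log_w = [-((b - a - mu) ** 2) / (2.0 * var) for a in a_times]
            top = max(log_w)  # subtract the maximum for numerical stability
            w = [math.exp(x - top) for x in log_w]
            total = sum(w)
            resp.append([x / total for x in w])

        # M-step: responsibility-weighted maximum-likelihood updates.
        mu = sum(
            r * (b - a)
            for b, row in zip(b_times, resp)
            for a, r in zip(a_times, row)
        ) / n
        var = sum(
            r * (b - a - mu) ** 2
            for b, row in zip(b_times, resp)
            for a, r in zip(a_times, row)
        ) / n
        var = max(var, 1e-6)  # guard against a degenerate zero variance
    return mu, var


# Synthetic example: A fires every 50 time units, B follows with a noisy lag.
random.seed(0)
a_times = [50.0 * i for i in range(50)]
b_times = [a + random.gauss(5.0, 1.0) for a in a_times]
mu_hat, var_hat = lag_em(a_times, b_times, mu=0.0, var=100.0)
print(mu_hat, var_hat)
```

Each iteration of this sketch visits every (A, B) pair, so its cost grows with the product of the two event counts. That quadratic cost is one plausible reason an approximation becomes attractive as the number of events grows.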