We study two fundamental problems in learning from nonstationary time series: how to regularize time-series models effectively and how to tune forgetting rates adaptively. The effectiveness of L2 regularization depends on the choice of coordinates, so the variables must be appropriately normalized. In a nonstationary environment, however, the appropriate normalization can vary over time. We propose a regularization that is invariant to invertible linear transformations of the coordinates, which eliminates the need for normalization. We also propose an ensemble learning approach to adaptively tuning the forgetting rate and the regularization coefficient. We train multiple models with varying hyperparameters and evaluate their performance using multiple hyper forgetting rates. At each step, we choose the best-performing model on the basis of the best-performing hyper forgetting rate. We demonstrate the effectiveness of the proposed approaches on real-world time series.
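
The ensemble selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `HyperForgettingSelector`, the squared-error loss, and the undiscounted scoring of each hyper forgetting rate are assumptions made for illustration only. Each hyper forgetting rate keeps its own discounted error for every candidate model, each rate is charged with the error of the model it would have picked, and the model picked by the rate with the lowest charge is returned.

```python
from typing import Sequence


class HyperForgettingSelector:
    """Pick one of several candidate models using discounted past errors.

    Hypothetical sketch: the class name, the squared-error loss, and the
    undiscounted scoring of hyper forgetting rates are all assumptions.
    """

    def __init__(self, n_models: int, hyper_rates: Sequence[float]) -> None:
        self.hyper_rates = list(hyper_rates)
        # model_loss[j][k]: error of model k, discounted by hyper rate j.
        self.model_loss = [[0.0] * n_models for _ in self.hyper_rates]
        # rate_loss[j]: total error of the models that hyper rate j selected.
        self.rate_loss = [0.0] * len(self.hyper_rates)

    def _best_model(self, j: int) -> int:
        """Index of the model with the lowest discounted error under rate j."""
        losses = self.model_loss[j]
        return min(range(len(losses)), key=losses.__getitem__)

    def select(self) -> int:
        """Return the index of the model to trust for the next step."""
        best_rate = min(range(len(self.rate_loss)),
                        key=self.rate_loss.__getitem__)
        return self._best_model(best_rate)

    def update(self, predictions: Sequence[float], target: float) -> None:
        """Score every model and every hyper rate against a new observation."""
        errors = [(p - target) ** 2 for p in predictions]
        for j, rate in enumerate(self.hyper_rates):
            # Charge rate j with the error of the model it picked
            # before the target was observed.
            self.rate_loss[j] += errors[self._best_model(j)]
            # Discount old errors by rate j, then add the new ones.
            self.model_loss[j] = [rate * old + err
                                  for old, err in zip(self.model_loss[j], errors)]
```

In this sketch a hyper forgetting rate close to 1 remembers errors for a long time, while a smaller rate reacts quickly after the series changes regime. Calling `update` with each model's prediction and the observed value, then `select`, gives the model index to use at the next step.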