This paper focuses on feature combination approaches for discriminative language models (DLMs). A DLM is a feature-based log-linear language model whose feature parameters are estimated discriminatively, which allows easy integration of various knowledge sources into language modeling. Choosing a proper strategy for combining features from different information sources is important. We investigated three approaches for combining lexical, word class, and acoustic features in DLMs: joint parameter estimation, cascade training, and model score combination. The cascade approach gave the best test set performance, reducing the word error rate by 0.49% absolute (3% relative) on transcription of English Broadcast News. The word class features and state duration features were found to be highly complementary, and their combination provided most of the improvement. Copyright © 2011 ISCA.
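As a rough illustration of the log-linear scoring and the model score combination mentioned above, the following minimal sketch reranks a toy two-hypothesis list. It is not the authors' implementation: the feature names (`baseline`, `w:news`, `w:gnus`, `c:NOUN`), the weights, and the helper functions are all hypothetical, and the weights are set by hand rather than estimated discriminatively.

```python
from typing import Dict, List

FeatureVector = Dict[str, float]


def log_linear_score(features: FeatureVector, weights: Dict[str, float]) -> float:
    """Unnormalized log-linear score: dot product of feature values and weights."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rerank(nbest: List[FeatureVector], weights: Dict[str, float]) -> int:
    """Return the index of the highest-scoring hypothesis in an N-best list."""
    scores = [log_linear_score(f, weights) for f in nbest]
    return max(range(len(nbest)), key=scores.__getitem__)


def combine_scores(model_scores: List[List[float]], mix: List[float]) -> List[float]:
    """Score combination: weighted sum of per-hypothesis scores from separate models."""
    n_hyps = len(model_scores[0])
    return [sum(m * s[i] for m, s in zip(mix, model_scores)) for i in range(n_hyps)]


# Hypothetical toy data: a baseline recognizer score plus lexical ("w:")
# and word class ("c:") features for two competing hypotheses.
nbest = [
    {"baseline": -10.0, "w:news": 1.0, "c:NOUN": 2.0},
    {"baseline": -9.5, "w:gnus": 1.0, "c:NOUN": 1.0},
]
# Hand-picked weights, standing in for discriminatively estimated parameters.
weights = {"baseline": 1.0, "w:news": 0.4, "w:gnus": -0.6, "c:NOUN": 0.1}

baseline_best = rerank(nbest, {"baseline": 1.0})  # baseline score only
joint_best = rerank(nbest, weights)               # all features in one model

# The same decision expressed as a combination of separately computed scores.
combined = combine_scores(
    [[-10.0, -9.5], [0.4, -0.6], [0.2, 0.1]],  # baseline, lexical, class scores
    [1.0, 1.0, 1.0],                           # hypothetical mixing weights
)
combined_best = max(range(len(combined)), key=combined.__getitem__)
```

In this toy setup the baseline score alone prefers the second hypothesis, while adding the lexical and word class features moves the decision to the first. Cascade training, where one feature set is trained on top of an already trained model, is not shown here.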