We present an integrated optimization approach to parameter estimation for discrete choice demand models where data for one or more choice alternatives are censored. We employ a mixed-integer program (MIP) to jointly determine the prediction parameters associated with the customer arrival rate and customers' substitutive choices. This integrated approach enables us to recover provably (near-)optimal parameter values with respect to the chosen loss-minimization (LM) objective function, thereby overcoming a limitation of prior multistart heuristic approaches, which terminate without providing precise information on solution quality. The censored data are imputed endogenously within the MIP by estimating optimal values for the probabilities that the unobserved alternatives are selected. Under mild assumptions, we prove that the approach is asymptotically consistent. For large LM instances, we derive a nonconvex-convex alternating heuristic that can be used to obtain quick, near-optimal solutions. Partial information, user acceptance criteria, model selection, and regularization techniques can be incorporated to enhance practical efficacy. We test the LM model on simulated and real data and present results for a variety of demand-prediction scenarios: single-item, multi-item, time-varying arrival rate, large-scale instances, and a dual-layer extension of the estimation model that learns the unobserved market shares of competitors.