Deep learning models are frequently used to capture input–output relations and to predict operating costs in dynamical systems. Computing optimal control policies from the resulting regression models, however, is challenging because the models are nonlinear and nonconvex. We propose an approach for designing optimal control policies from deep learning models that handles both continuous and discrete action spaces. The key observation is that a recurrent neural network with ReLU activation functions admits an exact representation as a set of mixed-integer linear constraints. The optimal control problem therefore reduces to a mixed-integer linear program (MILP), which can be solved with off-the-shelf MILP solvers. Numerical experiments on standard reinforcement learning benchmarks demonstrate the effectiveness of the proposed approach.
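The MILP encoding of a ReLU unit mentioned above is commonly done with a big-M formulation: `y = max(0, p)` is replaced by linear constraints `y >= p`, `y >= 0`, `y <= p + M(1 - z)`, `y <= M z` with a binary indicator `z` and a sufficiently large constant `M`. The following sketch is illustrative only (the function names, the value of `M`, and the enumeration of `z` are assumptions, not the paper's implementation); a MILP solver would branch on `z` rather than enumerate it.

```python
def satisfies_bigM(p, y, z, M):
    """Check the big-M linear constraints encoding y = max(0, p).

    z is a binary indicator: z = 1 selects the active branch (y = p),
    z = 0 selects the inactive branch (y = 0). M must bound |p|.
    """
    return (y >= p and y >= 0
            and y <= p + M * (1 - z)
            and y <= M * z)


def relu_via_milp(p, M=100.0):
    """Recover the ReLU value by enumerating the binary indicator,
    mimicking the branching a MILP solver performs on z."""
    for z in (0, 1):
        # Each branch forces a unique candidate value for y.
        y = p if z == 1 else 0.0
        if satisfies_bigM(p, y, z, M):
            return y
```

For a positive pre-activation only the `z = 1` branch is feasible, and for a negative one only `z = 0` is, so the constraints reproduce the ReLU exactly as long as `M` upper-bounds the magnitude of the pre-activation.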