Recent years have seen great successes in applying deep learning (DL) to many application domains. Though powerful, DL is not easy to use well. In this invited paper, we study an urban taxi demand forecasting problem using DL, and we present a number of key insights into modeling a domain problem as a suitable DL task. We also conduct a systematic comparison of two recent deep neural networks (DNNs) for taxi demand prediction, i.e., ST-ResNet and FLC-Net, on a New York City taxi record dataset. Our experimental results show that DNNs indeed outperform most traditional machine learning techniques, but such superior results can only be achieved with proper design of the DNN architecture, where domain knowledge plays a key role.