The transport literature is dense on short-term traffic prediction, with horizons of up to one hour, but much sparser on long-term prediction. It is also sparse on city-scale traffic prediction, mainly because of low data availability. In this work, we investigate whether deep learning models are useful for long-term, large-scale traffic prediction, with a focus on the scalability of the models. We study a city-scale traffic dataset comprising 14 weeks of speed observations collected every 15 minutes over 1098 segments in the hypercenter of Los Angeles, California. We evaluate a variety of state-of-the-art machine learning and deep learning predictors for link-based prediction, and investigate how such predictors can scale up to larger areas using clustering and graph convolutional approaches. We find that incorporating temporal and spatial features into deep learning predictors can be helpful for long-term prediction, while simpler predictors that are not based on deep learning achieve very satisfactory performance for link-based and short-term forecasting. We discuss this trade-off not only in terms of prediction accuracy versus prediction horizon, but also in terms of training time and model size.