There is no doubt that making AI mainstream by bringing powerful, yet power hungry deep neural networks (DNNs) to resource-constrained devices would required an efficient co-design of algorithms, hardware and software. The increased popularity of DNN applications deployed on a wide variety of platforms, from tiny microcontrollers to data-centers, have resulted in multiple questions and challenges related to constraints introduced by the hardware. In this survey on hardware-aware neural architecture search (HW-NAS), we present some of the existing answers proposed in the literature for the following questions: "Is it possible to build an efficient DL model that meets the latency and energy constraints of tiny edge devices?", "How can we reduce the trade-off between the accuracy of a DL model and its ability to be deployed in a variety of platforms?". The survey provides a new taxonomy of HW-NAS and assesses the hardware cost estimation strategies. We also highlight the challenges and limitations of existing approaches and potential future directions. We hope that this survey will help to fuel the research towards efficient deep learning.