Data-level Linkage of Multiple Surveys for Improved Understanding of Global Health Challenges
Data-driven approaches can give domain experts richer insights into critical global health challenges, such as newborn and child health, using surveys (e.g., the Demographic and Health Survey). Although multiple surveys cover the topic, data-driven insight extraction and analysis are usually applied to each survey separately, with limited effort to exploit them jointly, which results in poor prediction performance for critical events such as neonatal death. Existing machine learning approaches for utilising multiple data sources are not directly applicable to surveys that are disjoint in collection time and location. In this paper, we present what is, to the best of our knowledge, the first detailed work that automatically links multiple surveys to improve the predictive performance for newborn and child mortality and to enable cross-study impact analysis of covariates.