E-health systems comprise intelligent devices, medical institutions, edge nodes, and cloud servers, and aim to improve the quality and efficiency of healthcare services. In e-health systems, a patient's data are collected jointly by the patient's wearable devices and the hospitals the patient has visited, i.e., the data are vertically distributed. The data on wearable devices share the same feature set but differ in sample space, i.e., they are horizontally partitioned. Meanwhile, hospitals serve different user groups, which results in high data diversity, i.e., non-identically distributed data. Because of these three characteristics, existing federated learning frameworks cannot train models on medical data efficiently. Furthermore, model training in e-health is time-sensitive: some diseases mutate quickly and spread easily, so machine learning algorithms must converge fast. In this paper, we address the problem of how to train global models on e-health data efficiently and rapidly. Specifically, we propose a multilayer federated learning framework that copes with data that are vertically, horizontally, and non-identically distributed. Moreover, we develop a Multi-Layer Stochastic Gradient Descent (MLSGD) algorithm for the proposed framework to learn the optimal global model. To improve training efficiency, partial models learned by devices are aggregated on edge nodes before intermediate results are exchanged with hospitals. During global aggregation, each local model is weighted in proportion to its local data size, which balances the impact of the local models on the global model. We also prove the convergence of the MLSGD algorithm theoretically. Experimental results on the real-world MIMIC-III dataset validate that the proposed algorithm converges fast and achieves the desired accuracy.
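
The size-proportional weighting used in global aggregation can be illustrated with a minimal sketch. It is not the MLSGD algorithm itself. It covers only the aggregation step, under the assumptions that each local model is a flat parameter vector of equal length and that the weight of model k is n_k / sum(n_i), where n_k is its local data size. The function name `aggregate_by_data_size` and its parameters are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def aggregate_by_data_size(
    local_models: Sequence[Sequence[float]],
    local_sizes: Sequence[int],
) -> List[float]:
    """Average local parameter vectors, weighting each by its share of the data.

    Assumption: model k receives weight n_k / sum(n_i), where n_k is the
    number of local samples behind that model.
    """
    if not local_models or len(local_models) != len(local_sizes):
        raise ValueError("need one data size per local model")
    total = sum(local_sizes)
    if total <= 0:
        raise ValueError("total data size must be positive")
    dim = len(local_models[0])
    if any(len(model) != dim for model in local_models):
        raise ValueError("all local models must have the same length")

    global_model = [0.0] * dim
    for model, size in zip(local_models, local_sizes):
        weight = size / total  # share of all samples held by this participant
        for j, value in enumerate(model):
            global_model[j] += weight * value
    return global_model


# Two participants: the second holds three times as much data,
# so its parameters dominate the aggregate.
print(aggregate_by_data_size([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

With equal data sizes the function reduces to a plain average, so the weighting only matters when participants hold different amounts of data.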