Adverse drug reaction (ADR) is a major burden for patients and healthcare industry. It usually causes preventable hospitalizations and deaths, while associated with a huge amount of cost. Traditional preclinical in vitro safety profiling and clinical safety trials are restricted in terms of small scale, long duration, huge financial costs and limited statistical significance. The availability of large amounts of drug and ADR data potentially allows ADR predictions during the drugs' early preclinical stage with data analytics methods to inform more targeted clinical safety tests. Despite their initial success, existing methods have trade-offs among interpretability, predictive power and efficiency. This urges us to explore methods that could have all these strengths and provide practical solutions for real world ADR predictions. We cast the ADR-drug relation structure into a three-layer hierarchical Bayesian model. We interpret each ADR as a symbolic word and apply latent Dirichlet allocation (LDA) to learn topics that may represent certain biochemical mechanism that relates ADRs with drug structures. Based on LDA, we designed an equivalent regularization term to incorporate the hierarchical ADR domain knowledge. Finally, we developed a mixed input model leveraging a fast collapsed Gibbs sampling method that the complexity of each iteration of Gibbs sampling proportional only to the number of positive ADRs. Experiments on real world data show our models achieved higher prediction accuracy and shorter running time than the state-of-the-art alternatives.