A graph Laplacian prior for Bayesian variable selection and grouping
Variable selection, or subset selection, plays a fundamental role in modern statistical modeling. In many applications, the selected variables interact with one another, and modeling this dependence structure is of great importance. This paper focuses on cases in which some correlated predictors have similar effects on the response and can therefore be grouped into predictive clusters. A graph Laplacian prior (GL-prior) is introduced within the Bayesian framework, under which the maximum a posteriori (MAP) estimate simultaneously allows for variable selection, coefficient estimation, and predictive group identification. Connections between the GL-prior and existing regularized regression methods are established. For computation, an EM-based algorithm is proposed, in which an efficient augmented Lagrangian approach is used for the maximization step. The performance of the proposed approach is examined through simulation studies, followed by a microarray data analysis concerning the plant Arabidopsis thaliana.
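For intuition about why a graph Laplacian encourages grouping, the following minimal Python sketch evaluates the generic Laplacian quadratic form beta' L beta, which equals the weighted sum of squared differences between coefficients of linked predictors. It is an illustration under stated assumptions, not the paper's GL-prior: the exact parameterization of the prior is not given in the abstract, and the function names `graph_laplacian` and `laplacian_penalty`, as well as the three-predictor toy graph, are hypothetical.

```python
import numpy as np


def graph_laplacian(W):
    """Return the combinatorial Laplacian L = D - W of a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def laplacian_penalty(beta, L):
    """Quadratic form beta' L beta = sum over edges of w_ij * (beta_i - beta_j)^2."""
    beta = np.asarray(beta, dtype=float)
    return float(beta @ L @ beta)


# Hypothetical toy graph (an assumption for illustration):
# predictors 0 and 1 are linked, predictor 2 is isolated.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
L = graph_laplacian(W)

# Linked predictors with equal coefficients incur no penalty ...
penalty_similar = laplacian_penalty([2.0, 2.0, 5.0], L)
# ... while linked predictors with different coefficients are penalized.
penalty_different = laplacian_penalty([2.0, -1.0, 5.0], L)
```

In this sketch the penalty depends only on differences between linked coefficients, so the isolated predictor contributes nothing. A log-prior proportional to the negative of such a form would favor similar effects within a cluster of connected predictors.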