An increasing number of people use mobile devices to monitor their behavior, such as exercise, and to record their health status, such as psychological stress. However, these devices rarely provide ongoing support to help users understand how their behavior contributes to changes in their health status. To address this challenge, we aim to develop an interpretable policy for physical activity recommendations that reduce a user's perceived psychological stress over a given time horizon. We formulate this task as a sequential decision-making problem and solve it using a new method that we refer to as threshold Q-learning (TQL). The advantages of the TQL method over traditional Q-learning are that it is 'doubly robust' and interpretable. This interpretability is achieved by making model assumptions and incorporating threshold selection into the learning process. Our simulation results indicate that the TQL method performs better than the Q-learning method under model misspecification. Our analyses are performed on data collected from 79 healthy adults over a 7-week period, where the data comprise physical activity patterns collected from mobile devices and self-assessed stress levels of the users. This work serves as a first step toward a computational health coaching solution for mobile device users.