Publication
IBM J. Res. Dev
Paper
Data quality challenges for person-generated health and wellness data
Abstract
Person-generated health data (PGHD) from wearable devices and smartphone applications are growing rapidly. There is increasing effort to employ advanced analytical methods to generate insights from these data in order to help people change their lifestyle and improve their health. PGHD, such as step counts, exercise logs, nutritional diaries, and sleep records, are often incomplete, inaccurate, and collected over too short a duration. Insufficient user engagement with wearable and mobile technologies, together with a lack of sensor validation, standardization of data collection, transparency of data processing assumptions, and access to relevant data from consumer-grade sensors, also negatively affects data quality. The literature on data quality for PGHD is sparse and fragmented, providing little guidance to data analysts on how to assess and prioritize data quality concerns. In this paper, we summarize our experiences as data analysts working with PGHD, outline some of the challenges in using PGHD for insight generation, and discuss some established methods for addressing these challenges. We review the literature on data quality for PGHD, identify the major stakeholders in the PGHD ecosystem, and apply an established data quality framework to present the most relevant data quality challenges for each stakeholder.