Precise Segmentation of U.S. Adults from 24-Hour Wearable-based Physical Activity Profiles Using Machine Learning Clustering

Image credit: Jinjoo Shim

Abstract

Growing evidence shows that wearable-based physical activity data (e.g., from accelerometers) could be used as digital biomarkers to identify individuals at higher risk for chronic diseases and to monitor the effectiveness of interventions. Physical activity classification based on pre-established metrics misclassifies 20-40% of the time, underscoring the need for a more precise and accurate classification method. To this end, we investigated different machine learning algorithms to improve the clustering of 24-hour wearable-based physical activity profiles of the U.S. population in the 2011-2014 NHANES. Hierarchical clustering identified 46% of U.S. adults with a “High Activity/Robust Pattern” (Cluster 1) and 54% with a “Low Activity/Weak Pattern” (Cluster 2). Cluster 1 shows elevated amplitude, high stability, and good circadian rhythm alignment. In contrast, Cluster 2 exhibits dampened amplitude, a fragmented rhythm, and substantially higher nocturnal activity. Cluster 2 was also associated with older age, male sex, and longer sedentary hours (all p<0.05). Data-driven clustering can improve population segmentation and digital phenotyping of the general population.
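
As a rough illustration of the kind of analysis described above, the following minimal sketch applies agglomerative hierarchical clustering to hourly activity profiles and cuts the tree into two clusters. It is not the study's pipeline: the data are synthetic (not NHANES), and the helper names `simulate_profiles` and `cluster_profiles`, the cosine-shaped rhythm, the noise level, and the choice of Ward linkage on Euclidean distance are all assumptions made for this example, since the abstract does not specify features, distance, or linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def simulate_profiles(n_high, n_low, seed=0):
    """Build synthetic 24-hour activity profiles (one row per person).

    Hypothetical data: a cosine-shaped daily rhythm peaking at 15:00,
    with a large amplitude for one group and a dampened one for the other.
    """
    rng = np.random.default_rng(seed)
    hours = np.arange(24)
    rhythm = 0.5 * (1 + np.cos(2 * np.pi * (hours - 15) / 24))  # values in [0, 1]
    high = 100 + 400 * rhythm + rng.normal(0, 20, size=(n_high, 24))
    low = 100 + 100 * rhythm + rng.normal(0, 20, size=(n_low, 24))
    return np.vstack([high, low])


def cluster_profiles(profiles, n_clusters=2):
    """Cut a Ward-linkage dendrogram into n_clusters flat clusters."""
    tree = linkage(profiles, method="ward")  # Euclidean distance (assumed choice)
    return fcluster(tree, t=n_clusters, criterion="maxclust")


profiles = simulate_profiles(n_high=46, n_low=54)
labels = cluster_profiles(profiles)

# Summarize each cluster by size and mean peak-to-trough amplitude.
for k in np.unique(labels):
    members = profiles[labels == k]
    amplitude = (members.max(axis=1) - members.min(axis=1)).mean()
    print(f"cluster {k}: n={len(members)}, mean amplitude={amplitude:.0f}")
```

The cluster numbers returned by `fcluster` are arbitrary, so which label corresponds to the high-amplitude group has to be read off the summary rather than assumed.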

Publication
In 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI)
Jinjoo Shim
Digital Health Data Scientist

My research interest is to advance digital healthcare through AI/ML and data science.