%0 Journal Article
%T A Lightweight Self-Supervised Representation Learning Framework for Depression Risk Profiling from Synthetic Daily Behavioural Trajectories
%A Rocco de Filippis
%A Abdullah Al Foysal
%J Open Access Library Journal
%V 13
%N 3
%P 1-20
%@ 2333-9721
%D 2026
%I Open Access Library
%R 10.4236/oalib.1114918
%X Detecting behavioural signatures of depression from everyday digital traces is a central challenge in computational psychiatry. Real-world datasets from smartphones and wearables often suffer from sparse labels, heterogeneous sampling, and highly imbalanced case-control ratios, limiting the development of robust models. To explore these challenges under controlled conditions, we construct a clinically inspired synthetic dataset of daily behavioural trajectories for 200 virtual subjects monitored over 30 days. For each subject-day, we simulate multivariate digital phenotyping features including sleep duration, physical activity, social interactions, and diurnal phone usage. Subject-level depression labels are defined via PHQ-9 score distributions aligned with standard clinical thresholds. We then evaluate whether a lightweight self-supervised learning (SSL) encoder can derive latent representations that differentiate depressed from healthy subjects more effectively than naïve raw features. The SSL model is trained using a contrastive NT-Xent objective combined with a reconstruction term and operates on full 30-day sequences. The resulting embeddings are fed into multiple downstream classifiers (Random Forest, XGBoost, SVM, and Logistic Regression). Across all models, SSL features consistently outperform raw handcrafted aggregates in AUC, with clear improvements in discriminability and calibration. Behavioural distributions, temporal trajectories and correlations, classifier performance and ROC curves, weekly rhythms, cluster-level archetypes, and UMAP projections of the latent space jointly show that depression is expressed not as simple magnitude shifts in single features but as distributed, temporally structured deviations.
This work contributes a mathematically explicit synthetic benchmark and demonstrates that compact SSL encoders can learn clinically meaningful representations of mental-health–related behaviour even in noisy, imbalanced settings, providing a foundation for future real-world digital phenotyping pipelines.
%K Digital Phenotyping
%K Self-Supervised Learning
%K Depression Detection
%K Synthetic Behavioural Data
%K Temporal Representation Learning
%K Lightweight Deep Models
%U http://www.oalib.com/paper/6888123