Deep Reinforcement Learning for Personalized Antidepressant Decision Support in Bipolar Spectrum Disorders: Simulated Randomized Trial Framework

doi:10.4236/oalib.1115137

OALib Journal
ISSN: 2333-9721

Open Access Library Journal 13 2026

DOI: 10.4236/oalib.1115137, PP. 1-19

Rocco de Filippis, Abdullah Al Foysal

Subject Areas: Artificial Intelligence, Psychiatry & Psychology

Keywords: Bipolar Disorder, Reinforcement Learning, Precision Psychiatry, Treatment Optimization, Causal Inference, Machine Learning, Antidepressant, Mood Destabilization, Deep Learning, Personalized Medicine

Abstract

Treatment selection for bipolar depression remains largely trial-and-error, with substantial non-response to first-line strategies and clinically meaningful risk of mood destabilization. We developed a deep reinforcement learning (RL) framework to optimize treatment selection while explicitly penalizing destabilization events. We implemented RL-CADENCE, a simulated multi-centre experimental framework designed to emulate a parallel-group randomized trial across 12 virtual psychiatric centres. Using a combination of publicly available online data sources and clinically informed synthetic generation, we constructed a cohort of 2500 virtual participants representing bipolar spectrum disorders. Virtual participants were algorithmically allocated (3:3:3:1) to four treatment strategies: 1) lithium + SSRI, 2) quetiapine + lamotrigine, 3) lurasidone + mood stabilizer (lithium or valproate), or 4) RL-personalized treatment selection. The primary endpoint was the simulated change in Montgomery-Åsberg Depression Rating Scale (MADRS) score over 12 months. Secondary outcomes included response, mood destabilization events, and quality-adjusted life years (QALYs). A causal machine learning pipeline estimated conditional average treatment effects (CATE) to characterize heterogeneity across subgroups within the synthetic cohort. In simulation, the RL-personalized strategy achieved greater MADRS improvement than pooled standard protocols (mean difference: −5.6 points; 95% CI: −7.6 to −3.6; Cohen’s d = 0.78). Simulated response rates (≥50% MADRS reduction) were 95.7% versus 58.9%, and mood destabilization occurred in 4.8% versus 10.8% of synthetic patient-months. The RL policy network achieved an AUC-ROC of 0.89 for predicting the optimal treatment strategy under the simulated counterfactual evaluation. Heterogeneous effects were largest in mixed features (CATE = 15.2; 95% CI: 8.9 - 21.5) and bipolar I subtype (CATE = 12.3; 95% CI: 7.1 - 17.5).
Within a simulated, synthetic-data evaluation, deep RL showed strong potential to personalize antidepressant-related treatment selection in bipolar spectrum disorders, improving depressive symptom outcomes while reducing destabilization risk. These findings provide proof-of-concept for RL-based precision psychiatry and motivate prospective validation in real-world clinical cohorts.
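
As an illustration of two quantities named in the abstract, the following minimal Python sketch shows one way a 3:3:3:1 allocation of 2500 virtual participants and the ≥50% MADRS response criterion could be expressed. The abstract does not describe the allocation algorithm, so the permuted-block scheme, the block size of 10, the seed, the arm labels, and the function names are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter

# Hypothetical arm labels; the abstract names the strategies but not identifiers.
ARMS = [
    "lithium_ssri",
    "quetiapine_lamotrigine",
    "lurasidone_mood_stabilizer",
    "rl_personalized",
]
RATIO = [3, 3, 3, 1]  # allocation ratio stated in the abstract


def allocate(n_participants, seed=0):
    """Permuted-block allocation in a 3:3:3:1 ratio (assumed block size: 10)."""
    rng = random.Random(seed)
    # One block holds each arm repeated according to its share of the ratio.
    block = [arm for arm, k in zip(ARMS, RATIO) for _ in range(k)]
    assignments = []
    while len(assignments) < n_participants:
        shuffled = block[:]
        rng.shuffle(shuffled)
        assignments.extend(shuffled)
    return assignments[:n_participants]


def is_responder(baseline_madrs, followup_madrs):
    """Response as defined in the abstract: at least 50% MADRS reduction."""
    if baseline_madrs <= 0:
        raise ValueError("baseline MADRS must be positive")
    return (baseline_madrs - followup_madrs) / baseline_madrs >= 0.5


counts = Counter(allocate(2500))
print(counts)
```

Because 2500 is an exact multiple of the block size, this scheme yields 750 participants in each standard arm and 250 in the RL-personalized arm, whatever the seed.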

Cite this paper

de Filippis, R. and Foysal, A. A. (2026). Deep Reinforcement Learning for Personalized Antidepressant Decision Support in Bipolar Spectrum Disorders: Simulated Randomized Trial Framework. Open Access Library Journal, 13, e15137. doi: http://dx.doi.org/10.4236/oalib.1115137.

