%0 Journal Article
%T Enhanced Multimodal Transformer for Treatment-Resistant Depression Prediction Using Synthetic fMRI, Genomic, and Clinical Data
%A Rocco de Filippis
%A Abdullah Al Foysal
%J Open Access Library Journal
%V 13
%N 1
%P 1-16
%@ 2333-9721
%D 2026
%I Open Access Library
%R 10.4236/oalib.1114447
%X Treatment-Resistant Depression (TRD) remains one of the most challenging subtypes of major depressive disorder, affecting approximately one-third of patients and leading to significant morbidity, healthcare costs, and reduced quality of life. Predicting TRD onset and progression is complex, as it requires integrating heterogeneous biomarkers spanning neuroimaging, genomics, and clinical history. This study presents an Enhanced Multimodal Transformer (EMT) designed to fuse functional magnetic resonance imaging (fMRI), single-nucleotide polymorphism (SNP) profiles, and structured clinical variables into a unified predictive framework. The architecture employs modality-specific encoders (patch-based embeddings with positional encodings for fMRI, attention-weighted embeddings for SNP data, and normalized dense projections for clinical features), followed by the introduction of modality tokens to enable cross-modal information exchange within a Transformer encoder. To validate the architecture in a controlled setting, we generated a balanced, clinically inspired synthetic dataset with distinct activation patterns in brain regions (prefrontal cortex, amygdala, anterior cingulate), SNP distributions with predictive loci, and clinically relevant severity profiles. Model training achieved rapid convergence with early stopping at eight epochs. Evaluation demonstrated a perfect Area Under the ROC Curve (AUC = 1.00) and average precision of 1.00, indicating complete separation in probability space. However, accuracy at a fixed 0.5 decision threshold was limited (50%), reflecting probability compression into distinct but narrow ranges (≈0.32 for non-TRD, ≈0.41 for TRD).
Feature analysis revealed that clinical severity, specific SNP clusters, and region-specific fMRI activations dominated predictive importance. These results provide a proof-of-concept that Transformer-based multimodal fusion can capture complex, cross-domain patterns in TRD, supporting its potential for precision psychiatry. Future work will extend to real-world datasets and incorporate probability calibration to improve threshold-based classification performance in clinical settings.
%K Treatment-Resistant Depression
%K Multimodal Fusion
%K Transformer Architecture
%K fMRI Analysis
%K SNP Genomics
%K Clinical Features
%K Probability Calibration
%K Computational Psychiatry
%K Synthetic Data Generation
%K Machine Learning in Mental Health
%U http://www.oalib.com/paper/6877566