Dynamic Conditional Feature Screening: A High-Dimensional Feature Selection Method Based on Mutual Information and Regression Error

DOI: 10.4236/ojs.2025.152011, PP. 199-242

Keywords: High-Dimensional Feature Screening, Conditional Mutual Information, Regression Error Difference, Dynamic Weighting, Dynamic Thresholding, Macroeconomic Forecasting, FRED-MD Dataset


Abstract:

Current high-dimensional feature screening methods still face significant challenges in handling mixed linear and nonlinear relationships, controlling redundant information, and improving model robustness. In this study, we propose a Dynamic Conditional Feature Screening (DCFS) method tailored for high-dimensional economic forecasting tasks. Our goal is to accurately identify key variables, enhance predictive performance, and provide both theoretical foundations and practical tools for macroeconomic modeling. The DCFS method constructs a comprehensive test statistic by integrating conditional mutual information with conditional regression error differences. By introducing a dynamic weighting mechanism, DCFS adaptively balances the linear and nonlinear contributions of features during the screening process. In addition, a dynamic thresholding mechanism is designed to effectively control the false discovery rate (FDR), thereby improving the stability and reliability of the screening results. On the theoretical front, we rigorously prove that the proposed method satisfies the sure screening property and rank consistency, ensuring accurate identification of the truly important feature set in high-dimensional settings. Simulation results demonstrate that under purely linear, purely nonlinear, and mixed dependency structures, DCFS consistently outperforms classical screening methods such as SIS, CSIS, and IG-SIS in terms of true positive rate (TPR), false discovery rate (FDR), and rank correlation. These results highlight the superior accuracy, robustness, and stability of our method. Furthermore, an empirical analysis based on the U.S. FRED-MD macroeconomic dataset confirms the practical value of DCFS in real-world forecasting tasks. The experimental results show that DCFS achieves lower prediction errors (RMSE and MAE) and higher R² values in forecasting GDP growth. The selected key variables—including the Industrial Production Index (IP), Federal Funds Rate, Consumer Price Index (CPI), and Money Supply (M2)—possess clear economic interpretability, offering reliable support for economic forecasting and policy formulation.
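
The abstract outlines two computational ingredients of DCFS: a per-feature statistic that fuses (conditional) mutual information with a conditional regression-error difference under a dynamic weight, and a dynamic threshold intended to control the FDR. The Python sketch below is only a minimal illustration of that workflow under stated assumptions; the full paper specifies the exact statistic, weighting scheme, and threshold. In particular, the unconditional mutual_info_regression stand-in, the cross-validated linear error measure, the weight w, the permutation-based cutoff, and the function names dcfs_scores / dynamic_threshold are illustrative assumptions, not the authors' method.

# Minimal sketch of a DCFS-style screen; NOT the paper's implementation.
#  - plain (unconditional) mutual information stands in for conditional MI,
#  - a cross-validated linear fit stands in for the conditional regression error,
#  - the weight w and the permutation cutoff are illustrative choices only.
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict


def regression_error(X, y):
    """Cross-validated mean squared error of a linear fit of y on X."""
    pred = cross_val_predict(LinearRegression(), X, y, cv=5)
    return float(np.mean((y - pred) ** 2))


def dcfs_scores(X, y, Z):
    """Per-feature scores: weighted mix of an MI signal and the error
    reduction obtained by adding each feature to the conditioning set Z."""
    n, p = X.shape
    base_err = regression_error(Z, y)            # error without the candidate
    mi = mutual_info_regression(X, y)            # nonlinear dependence signal
    gain = np.array([base_err - regression_error(np.column_stack([Z, X[:, j]]), y)
                     for j in range(p)])
    gain = np.clip(gain, 0.0, None)              # negative gains carry no signal

    # Normalise both components to comparable scales, then let the weight lean
    # toward whichever signal is stronger for that feature (an illustrative
    # stand-in for the paper's dynamic weighting mechanism).
    mi_n = mi / (mi.max() + 1e-12)
    gain_n = gain / (gain.max() + 1e-12)
    w = mi_n / (mi_n + gain_n + 1e-12)
    return w * mi_n + (1.0 - w) * gain_n


def dynamic_threshold(X, y, Z, alpha=0.1, n_perm=10, seed=0):
    """Data-driven cutoff: high quantile of scores computed on permuted
    responses, a simple surrogate for an FDR-controlling threshold."""
    rng = np.random.default_rng(seed)
    null = np.concatenate([dcfs_scores(X, rng.permutation(y), Z)
                           for _ in range(n_perm)])
    return float(np.quantile(null, 1.0 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 200, 20
    Z = rng.normal(size=(n, 2))                  # conditioning variables
    X = rng.normal(size=(n, p))
    y = 2 * X[:, 0] + np.sin(3 * X[:, 1]) + Z[:, 0] + rng.normal(scale=0.5, size=n)
    scores = dcfs_scores(X, y, Z)
    keep = np.where(scores > dynamic_threshold(X, y, Z))[0]
    print("screened features:", keep)            # ideally contains 0 and 1

The design point the sketch tries to convey is that the error-difference term captures what a conditional linear fit gains from each feature, the mutual-information term captures dependence the linear fit misses, and the threshold is calibrated from the data rather than fixed in advance.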

References

[1]  Fan, J. and Lv, J. (2008) Sure Independence Screening for Ultrahigh Dimensional Feature Space. Journal of the Royal Statistical Society Series B: Statistical Methodology, 70, 849-911.
https://doi.org/10.1111/j.1467-9868.2008.00674.x
[2]  Li, R., Zhong, W. and Zhu, L. (2012) Feature Screening via Distance Correlation Learning. Journal of the American Statistical Association, 107, 1129-1139.
https://doi.org/10.1080/01621459.2012.695654
[3]  Shao, X. and Zhang, J. (2014) Martingale Difference Correlation and Its Use in High-Dimensional Variable Screening. Journal of the American Statistical Association, 109, 1302-1318.
https://doi.org/10.1080/01621459.2014.887012
[4]  Mai, Q. and Zou, H. (2015) The Fused Kolmogorov Filter: A Nonparametric Model-Free Screening Method. The Annals of Statistics, 43, 1471-1497.
https://doi.org/10.1214/14-aos1303
[5]  Ni, L. and Fang, F. (2016) Entropy-based Model-Free Feature Screening for Ultrahigh-Dimensional Multiclass Classification. Journal of Nonparametric Statistics, 28, 515-530.
https://doi.org/10.1080/10485252.2016.1167206
[6]  Zhu, Y.D., Chen, X.R. and Li, Q.P. (2021) Selection of Ultra High Dimensional Variables Based on Information Gain Rate. Statistics and Decision Making, 37, 18-21.
[7]  Fan, J., Li, R., Zhang, C.H. and Zou, H. (2020) Statistical Foundations of Data Science. CRC Press.
[8]  Zeng, J. and Zhou, J.J. (2017) A Review of High-Dimensional Data Variable Selection Methods. Mathematical Statistics and Management, 36, 678-692.
[9]  Barut, E., Fan, J. and Verhasselt, A. (2016) Conditional Sure Independence Screening. Journal of the American Statistical Association, 111, 1266-1277.
https://doi.org/10.1080/01621459.2015.1092974
[10]  Lu, J. and Lin, L. (2017) Model-Free Conditional Screening via Conditional Distance Correlation. Statistical Papers, 61, 225-244.
https://doi.org/10.1007/s00362-017-0931-7
[11]  Zhou, Y., Liu, J., Hao, Z., et al. (2018) Model-Free Conditional Feature Screening with Exposure Variables. arXiv: 1804.03637.
[12]  Xiong, W., Pan, H., Wang, J. and Tian, M. (2023) An Efficient Model-Free Approach to Interaction Screening for High Dimensional Data. Statistics in Medicine, 42, 1583-1605.
https://doi.org/10.1002/sim.9688
[13]  Wang, P. and Lin, L. (2022) Conditional Characteristic Feature Screening for Massive Imbalanced Data. Statistical Papers, 64, 807-834.
https://doi.org/10.1007/s00362-022-01342-8
[14]  Yuan, Z. and Dong, D.M. (2022) Near-Infrared Spectroscopy Measurement of Contrastive Variational Autoencoder and Its Application in the Detection of Liquid Sample. Spectroscopy and Spectral Analysis, 42, 3637-3641.
[15]  Pan, S., Li, Y., Wu, Z., et al. (2024) Establishment of a Predictive Nomogram for Clinical Pregnancy Rate in Patients with Endometriosis Undergoing Fresh Embryo Transfer. Journal of Southern Medical University, 44, 1407-1415.
[16]  Guo, X., Ren, H., Zou, C. and Li, R. (2022) Threshold Selection in Feature Screening for Error Rate Control. Journal of the American Statistical Association, 118, 1773-1785.
https://doi.org/10.1080/01621459.2021.2011735
[17]  Zhang, C., Bengio, S., Hardt, M., Recht, B. and Vinyals, O. (2017) Understanding Deep Learning Requires Rethinking Generalization. arXiv: 1611.03530.
[18]  Kingma, D.P. and Welling, M. (2014) Auto-Encoding Variational Bayes. arXiv: 1312.6114.
[19]  Ji, P. and Jin, J. (2012) UPS Delivers Optimal Phase Diagram in High-Dimensional Variable Selection. The Annals of Statistics, 40, 73-103.
https://doi.org/10.1214/11-aos947
[20]  Zhou, S., Wang, T. and Huang, Y. (2022) Feature Screening via Mutual Information Learning Based on Nonparametric Density Estimation. Journal of Mathematics, 2022, Article ID: 7584374.
https://doi.org/10.1155/2022/7584374
[21]  Ellingsen, J., Larsen, V.H. and Thorsrud, L.A. (2021) News Media versus FRED-MD for Macroeconomic Forecasting. Journal of Applied Econometrics, 37, 63-81.
https://doi.org/10.1002/jae.2859
