Journal of Data Analysis and Information Processing 2024

Non-Linear Matrix Completion

DOI: 10.4236/jdaip.2024.121007, PP. 115-137

Fengrui Zhang, Randy C. Paffenroth, David Worth

Keywords: Matrix Completion, Data Pipeline, Machine Learning

Abstract:

Current methods for predicting missing values in datasets often rely on simplistic approaches, such as taking the median value of an attribute, which limits their applicability. Real-world observations can be diverse. Stock prices, for example, range from prices just after an IPO to values just before a company’s collapse, and some data points are missing because trading in a stock was suspended. In this paper, we propose a novel approach using Nonlinear Matrix Completion (NIMC) and Deep Matrix Completion (DIMC) to predict associations, and we conduct experiments on financial data relating dates and stocks. Our method leverages various types of stock observations to capture the latent factors that explain the observed date-stock associations. Notably, our approach is nonlinear, which makes it suitable for datasets with nonlinear structure, such as the Russell 3000. Unlike traditional methods, which may suffer from information loss, NIMC and DIMC retain nearly complete information, especially when the parameters are high-dimensional. We compare three methods: Inductive Matrix Completion, a state-of-the-art linear method, Nonlinear Inductive Matrix Completion, and Deep Inductive Matrix Completion. Our findings show that nonlinear matrix completion is particularly effective for data with nonlinear structure, as exemplified by the Russell 3000. Additionally, we evaluate the information loss of the three methods across different dimensionalities.
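To make the contrast between median filling and matrix completion concrete, the following is a minimal sketch on synthetic data. It is not the NIMC or DIMC method of the paper: it swaps in a plain linear technique, iterative truncated-SVD imputation, as the simplest stand-in for matrix completion. The function names (`median_impute`, `low_rank_impute`, `rmse_on_missing`), the matrix sizes, the rank, and the 20% missing rate are all assumptions chosen for illustration, and the random low-rank matrix only plays the role of a dates-by-stocks table.

```python
import numpy as np


def median_impute(M, mask):
    """Baseline: fill each column's missing entries with that column's observed median."""
    filled = M.copy()
    for j in range(M.shape[1]):
        observed = M[mask[:, j], j]
        filled[~mask[:, j], j] = np.median(observed)
    return filled


def low_rank_impute(M, mask, rank, n_iters=200):
    """Linear stand-in for matrix completion (assumed technique, not the paper's).

    Repeatedly project onto rank-`rank` matrices with a truncated SVD,
    then restore the observed entries.
    """
    filled = median_impute(M, mask)  # starting point for the iteration
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, M, approx)  # keep observed values fixed
    return filled


def rmse_on_missing(estimate, truth, mask):
    """Root-mean-square error measured only on the entries that were hidden."""
    diff = (estimate - truth)[~mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic "dates x stocks" matrix with exact rank 3 (hypothetical data).
rng = np.random.default_rng(0)
n_dates, n_stocks, rank = 60, 40, 3
truth = rng.normal(size=(n_dates, rank)) @ rng.normal(size=(rank, n_stocks))

# Hide roughly 20% of the entries; mask is True where a value is observed.
mask = rng.random(truth.shape) > 0.2
observed = np.where(mask, truth, np.nan)

median_filled = median_impute(observed, mask)
completed = low_rank_impute(observed, mask, rank=rank)

median_rmse = rmse_on_missing(median_filled, truth, mask)
low_rank_rmse = rmse_on_missing(completed, truth, mask)
print(f"median imputation RMSE: {median_rmse:.4f}")
print(f"low-rank completion RMSE: {low_rank_rmse:.4f}")
```

Because the synthetic matrix is exactly low-rank, the low-rank fill should recover the hidden entries far better than the column median. Data with nonlinear structure would not satisfy this assumption, which is the setting the nonlinear methods in the abstract are aimed at.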
