%0 Journal Article
%T End-to-End Deep Symbolic Regression Method Based on Transformer (基于Transformer的端到端深度符号回归方法)
%A 吴云浩
%A 田益民
%J Computer Science and Application
%P 69-82
%@ 2161-881X
%D 2025
%I Hans Publishing
%R 10.12677/csa.2025.156158
%X
This study proposes an end-to-end Transformer-based deep symbolic regression (TFSR) model that solves symbolic regression problems by automatically generating mathematical expressions. Symbolic regression is a regression method that requires no predefined model form: it aims to derive a plausible mathematical expression from the given data and is therefore highly interpretable. With advances in deep learning, deep symbolic regression (DSR) uses the data-processing power of neural networks to extract latent mathematical regularities from high-dimensional data. The proposed TFSR model combines evolutionary algorithms with the Transformer architecture, uses self-attention to handle long-range dependencies in the data, simplifies the symbolic regression process, and improves efficiency and accuracy. The study also introduces prefix notation and a binary tree structure to represent and process mathematical expressions efficiently, and adopts a partitioned sampling method to generate uniformly distributed sample data. Experimental results show that TFSR learns and generalizes well on datasets of different sizes, particularly when handling complex mathematical expressions. In benchmark comparisons with other symbolic regression methods, the model performs better on complex formula modeling tasks, with a marked advantage on problems in the difficult group. The study provides a new end-to-end deep learning method for symbolic regression that is particularly suited to the automatic construction of complex scientific models and has broad application potential.
%K Deep Symbolic Regression
%K Symbolic Regression
%K Transformer Model
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=117424