%0 Journal Article
%T Distributed Feature Screening via Inverse Regression
%A 张妍
%A 张俊英
%J Advances in Applied Mathematics
%P 344-351
%@ 2324-8009
%D 2025
%I Hans Publishing
%R 10.12677/aam.2025.141034
%X In this paper, we propose a distributed screening framework for big data settings based on an inverse regression estimator. In the spirit of divide-and-conquer, the proposed framework expresses the dependence relationship through an inverse regression model whose inverse conditional variance can be estimated distributively. By aggregating the component estimates, we obtain a final inverse conditional variance estimator that can be readily used for screening features. This framework enables distributed storage and parallel computing and is therefore computationally attractive. Because the component parameters are estimated distributively without bias, the final aggregated estimate achieves high accuracy that is insensitive to the number of data segments m. Under mild conditions, we show that the aggregated estimator is as efficient as the centralized estimator in terms of the probability convergence bound and the mean squared error rate; the corresponding screening procedure enjoys the sure screening property for a wide range of correlation measures.
%K Ultrahigh Dimension
%K Gini Correlation Coefficient
%K Variable Screening
%K Feature Ranking
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=106426