%0 Journal Article
%T Improved Conditional Probability Distribution Distance Measurement Based on Feature Selection and Weighting
%A 杨沛融
%A 胡桂开
%J Advances in Applied Mathematics
%P 798-809
%@ 2324-8009
%D 2025
%I Hans Publishing
%R 10.12677/aam.2025.144207
%X To identify differences between instances with nominal attributes more accurately and thereby improve classification accuracy, this paper proposes an improved conditional probability distribution distance measure based on feature selection and weighting that takes full account of the dependencies among attributes. First, a feature selection mechanism is built using symmetric uncertainty. Second, on this basis, the information gain ratio between each attribute and the class is computed to obtain a weight for each attribute, and the weighted distance is calculated. Finally, simulation experiments are run on 19 datasets using the K-nearest neighbors algorithm. The results show that the proposed distance measure effectively improves the performance of the classification algorithm.
%K Feature Selection
%K Conditional Probability Distribution
%K Nominal Attribute
%K Information Gain Ratio
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=113016