Control Theory & Applications (控制理论与应用), 2017
Improving Chemical Process Fault Diagnosis Performance Based on EasyEnsemble
Abstract:
Imbalanced sample distributions are widespread in chemical process fault diagnosis. When an imbalanced dataset is used as the training set for fault diagnosis classifiers, the recognition rate tends to be biased toward the majority class. As a result, the normal state is easy to identify, while the fault states, which are of greater concern, are difficult to diagnose. To address this problem, this paper proposes an EasyEnsemble-based principal component analysis-support vector machine (EEPS) fault diagnosis algorithm. Multiple balanced datasets are first constructed by under-sampling subsets of the majority class. Principal component analysis (PCA) is then used for feature extraction, and a support vector machine (SVM) sub-classifier is trained on each balanced dataset. Finally, the sub-classifiers are integrated with the AdaBoost algorithm into a single classifier that can be used for fault diagnosis and prognosis. The method is applied to the Tennessee Eastman (TE) chemical process, and the experimental results show that EEPS effectively improves diagnosis and prognosis performance on imbalanced datasets.
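The balanced-subset construction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The function names (`balanced_subsets`, `train_nearest_centroid`, `majority_vote`) and the synthetic two-dimensional data are hypothetical. A nearest-centroid rule stands in for the PCA + SVM stage, and an unweighted vote stands in for the AdaBoost integration step, so only the under-sampling idea is shown faithfully.

```python
import random
from collections import Counter


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Pair every minority sample with an equally sized random draw
    from the majority class, repeated n_subsets times."""
    rng = random.Random(seed)
    k = len(minority)
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, k)  # without replacement inside one subset
        subsets.append((drawn, list(minority)))
    return subsets


def train_nearest_centroid(normal, fault):
    """Hypothetical placeholder for the PCA + SVM sub-classifier:
    label a point by the closer class centroid."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_normal, c_fault = centroid(normal), centroid(fault)

    def predict(x):
        d_normal = sum((a - b) ** 2 for a, b in zip(x, c_normal))
        d_fault = sum((a - b) ** 2 for a, b in zip(x, c_fault))
        return "fault" if d_fault < d_normal else "normal"

    return predict


def majority_vote(classifiers, x):
    """Unweighted vote over sub-classifiers (assumption: stands in for
    the AdaBoost-weighted combination named in the abstract)."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Synthetic imbalanced data: 200 normal samples versus 20 fault samples.
rng = random.Random(42)
normal_samples = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
fault_samples = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(20)]

subsets = balanced_subsets(normal_samples, fault_samples, n_subsets=5)
ensemble = [train_nearest_centroid(n, f) for n, f in subsets]

print(majority_vote(ensemble, [3.0, 3.0]))
print(majority_vote(ensemble, [0.0, 0.0]))
```

Each sub-classifier sees all 20 fault samples but a different set of 20 normal samples, so no single classifier is trained on a 10:1 class ratio, while the ensemble as a whole still draws on much of the majority class. An odd number of subsets avoids tied votes in the two-class case.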