The extension neural network (ENN) is a new type of neural network that combines extension theory with the artificial neural network (ANN). The learning algorithm of ENN is based on supervised learning. One of the important issues in classification and recognition with ENN is how to achieve the best possible classifier with a small amount of labeled training data. Training data selection is an effective approach to this issue. In this work, in order to improve the supervised learning performance of ENN and expand its range of engineering applications, we use a novel data selection method based on shadowed sets to refine the training data set of ENN. First, we use a clustering algorithm to label the data and induce shadowed sets. Then, within the framework of shadowed sets, the samples located around each cluster center (core data) and near the borders between clusters (boundary data) are selected as training data. Finally, we use the selected data to train the ENN. Compared with the traditional ENN, the proposed improved ENN (IENN) performs better. Moreover, IENN is independent of the supervised learning algorithm and of the initial labeled data. Experimental results verify the effectiveness and applicability of the proposed work.

1. Introduction

In the real world, there are recognition and classification problems whose features are defined over intervals of values. For example, boys can be defined as the class of males aged 1 to 12, and the permitted operating voltage of a specified motor may be between 100 and 120 V. For such problems, it is difficult to implement an appropriate classification method using current artificial neural networks (ANNs). Therefore, a new neural network topology, the extension neural network (ENN), was proposed by Wang and Hung [1] to solve these problems. This new neural network is a combination of extension theory [2] and the ANN. The ENN uses a modified extension distance to measure the similarity between objects and class centers. It can quickly and stably learn to categorize input patterns and permits adaptive processes to access significant new information. Moreover, the ENN has a shorter learning time and a simpler structure than traditional ANNs. There have been successful applications of ENN in the fields of pattern recognition, fault diagnosis, classification, cluster analysis, and so on [1, 3–8]. As with any other ANN based on supervised learning, the training data set is the most important and critical factor in ensuring the supervised learning performance of ENN. In other words, the supervised learning performance of ENN depends heavily on the selection of its training data.
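To illustrate the interval-based similarity measure, the minimal sketch below assumes the commonly cited form of the modified extension distance from [1]; the function name, the toy classes, and the midpoint centers are our own illustrative choices, not the authors' implementation. A pattern is assigned to the class with the smallest distance, and a value below 1 indicates that the pattern falls inside all of the class's feature intervals.

```python
import numpy as np

def extension_distance(x, center, lower, upper):
    """Modified extension distance between pattern x and one class whose
    j-th feature is the interval [lower[j], upper[j]] with learned center
    center[j]. Follows the commonly cited form in Wang and Hung [1];
    names are illustrative. Assumes non-degenerate intervals."""
    half = (upper - lower) / 2.0  # half-width of each feature interval
    return np.sum((np.abs(x - center) - half) / np.abs(half) + 1.0)

# Toy example: two classes with a single interval-valued feature (age).
classes = {
    "boy":   (np.array([6.5]),  np.array([1.0]),  np.array([12.0])),
    "adult": (np.array([40.0]), np.array([18.0]), np.array([62.0])),
}
x = np.array([9.0])
label = min(classes, key=lambda k: extension_distance(x, *classes[k]))
# label == "boy": ED ~ 0.45 (< 1, inside the interval) versus ~ 1.41 for "adult"
```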
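To make the three selection steps concrete, here is a minimal sketch of inducing shadowed sets from fuzzy memberships and picking core and boundary samples. It assumes a membership matrix U produced by, for example, fuzzy c-means; the grid search over the threshold α (balancing elevated, reduced, and shadow areas in the sense of Pedrycz [15, 16]) and all names are illustrative, not the authors' implementation. The union of core and boundary samples forms the refined training set for the ENN.

```python
import numpy as np

def shadowed_threshold(u, n_grid=100):
    """Find the shadowed-set threshold alpha for one cluster's membership
    column u by balancing the elevated, reduced, and shadow areas."""
    best_alpha, best_obj = 0.25, np.inf
    for alpha in np.linspace(0.01, 0.49, n_grid):
        elevated = np.sum(1.0 - u[u >= 1.0 - alpha])      # raised to 1
        reduced = np.sum(u[u <= alpha])                    # dropped to 0
        shadow = np.sum((u > alpha) & (u < 1.0 - alpha))   # undecided count
        obj = abs(elevated + reduced - shadow)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    return best_alpha

def select_training_data(U):
    """U: (n_samples, n_clusters) fuzzy membership matrix (e.g., from FCM).
    Returns indices of core samples (near some cluster center) and
    boundary samples (in some cluster's shadow region)."""
    core, boundary = set(), set()
    for k in range(U.shape[1]):
        u = U[:, k]
        alpha = shadowed_threshold(u)
        core.update(np.where(u >= 1.0 - alpha)[0])
        boundary.update(np.where((u > alpha) & (u < 1.0 - alpha))[0])
    return sorted(core), sorted(boundary - core)
```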
References
[1] M.-H. Wang and C.-P. Hung, “Extension neural network and its applications,” Neural Networks, vol. 16, no. 5-6, pp. 779–784, 2003.
[2] W. Cai, “Extension theory and its application,” Chinese Science Bulletin, vol. 44, no. 17, pp. 1538–1548, 1999.
[3] M.-H. Wang, “Partial discharge pattern recognition of current transformers using an ENN,” IEEE Transactions on Power Delivery, vol. 20, no. 3, pp. 1984–1990, 2005.
[4] H.-C. Chen, F.-C. Gu, and M.-H. Wang, “A novel extension neural network based partial discharge pattern recognition method for high-voltage power apparatus,” Expert Systems with Applications, vol. 39, no. 3, pp. 3423–3431, 2012.
[5] M.-H. Wang, K.-H. Chao, W.-T. Sung, and G.-J. Huang, “Using ENN-1 for fault recognition of automotive engine,” Expert Systems with Applications, vol. 37, no. 4, pp. 2943–2947, 2010.
[6] Y. Zhou, W. Pedrycz, and X. Qian, “Application of extension neural network to safety status pattern recognition of coal mines,” Journal of Central South University of Technology, vol. 18, no. 3, pp. 633–641, 2011.
[7] Y.-H. Lai and H.-C. Che, “Modeling patent legal value by extension neural network,” Expert Systems with Applications, vol. 36, no. 7, pp. 10520–10528, 2009.
[8] M.-H. Wang, “Application of extension neural network type-1 to fault diagnosis of electronic circuits,” Mathematical Problems in Engineering, vol. 2012, Article ID 352749, 12 pages, 2012.
[9] X. J. Zhu, “Semi-supervised learning literature survey,” Tech. Rep. 1530, Computer Science, University of Wisconsin-Madison, Madison, Wis, USA, 2008.
[10] I. Czarnowski, “Cluster-based instance selection for machine classification,” Knowledge and Information Systems, vol. 30, no. 1, pp. 113–133, 2012.
[11] J. R. Cano, F. Herrera, and M. Lozano, “On the combination of evolutionary algorithms and stratified strategies for training set selection in data mining,” Applied Soft Computing Journal, vol. 6, no. 3, pp. 323–332, 2006.
[12] D. R. Wilson and T. R. Martinez, “Reduction techniques for instance-based learning algorithms,” Machine Learning, vol. 38, no. 3, pp. 257–286, 2000.
[13] K. Yu, X. Xu, M. Ester, and H.-P. Kriegel, “Feature weighting and instance selection for collaborative filtering: an information-theoretic approach,” Knowledge and Information Systems, vol. 5, no. 2, pp. 201–224, 2004.
[14] S. Oh, M. S. Lee, and B.-T. Zhang, “Ensemble learning with active example selection for imbalanced biomedical data classification,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 2, pp. 316–325, 2011.
[15] W. Pedrycz, “Interpretation of clusters in the framework of shadowed sets,” Pattern Recognition Letters, vol. 26, no. 15, pp. 2439–2449, 2005.
[16] W. Pedrycz, “From fuzzy sets to shadowed sets: interpretation and computing,” International Journal of Intelligent Systems, vol. 24, no. 1, pp. 48–61, 2009.
[17] N. S. Philip, “Optimal selection of training data for the difference boosting neural networks,” in Proceedings of the iAstro, pp. 1–9, Nice, France, October 2003.
[18] M. Plutowski and H. White, “Selecting concise training sets from clean data,” IEEE Transactions on Neural Networks, vol. 4, no. 2, pp. 305–318, 1993.
[19] C. E. Pedreira, “Learning vector quantization with training data selection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 1, pp. 157–162, 2006.
[20] D. Guan, W. Yuan, Y.-K. Lee, A. Gavrilov, and S. Lee, “Improving supervised learning performance by using fuzzy clustering method to select training data,” Journal of Intelligent and Fuzzy Systems, vol. 19, no. 4-5, pp. 321–334, 2008.
[21] K. Hara and K. Nakayama, “Training data selection method for generalization by multilayer neural networks,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E81-A, no. 3, pp. 374–381, 1998.
[22] B. Bolat and T. Yildirim, “A data selection method for probabilistic neural network,” Journal of Electrical and Electronics Engineering, vol. 4, no. 2, pp. 1137–1140, 2004.
[23] A. Lyhyaoui, M. Martínez, I. Mora, M. Vázquez, J.-L. Sancho, and A. R. Figueiras-Vidal, “Sample selection via clustering to construct support vector-like classifiers,” IEEE Transactions on Neural Networks, vol. 10, no. 6, pp. 1474–1481, 1999.
[24] G. Schohn and D. Cohn, “Less is more: active learning with support vector machines,” in Proceedings of the 17th International Conference on Machine Learning, pp. 839–846, 2000.
[25] C. H. Li, C. M. Liu, and G. Cai, “Approach to eliminating morbid samples in forward neural networks,” Journal of Jilin University (Information Science Edition), vol. 27, no. 5, pp. 514–519, 2009.
[26] J. J. Ai, C. G. Zhou, and C. C. Gong, “Algorithm of voting to eliminate morbid samples in forward feed neural networks,” Mini-Micro System, vol. 11, no. 11, pp. 1371–1374, 2002.
[27] F. Muhlenbach, S. Lallich, and D. A. Zighed, “Identifying and handling mislabeled instances,” Journal of Intelligent Information Systems, vol. 22, no. 1, pp. 89–109, 2004.
[28] Y. Zhou, H. Su, and H. Zhang, “A novel data selection method for improving supervised learning performance of neural network,” Information, vol. 15, no. 11A, pp. 4513–4518, 2012.
[29] D. D. Lewis and W. A. Gale, “A sequential algorithm for training text classifiers,” in Proceedings of the 17th ACM International Conference on Research and Development in Information Retrieval, pp. 3–12, 1994.
[30] H. T. Nguyen and A. Smeulders, “Active learning using pre-clustering,” in Proceedings of the 21st International Conference on Machine Learning (ICML '04), pp. 623–630, July 2004.
[31] Z. Xu, K. Yu, V. Tresp, and J. Wang, “Representative sampling for text classification using support vector machines,” in Proceedings of the 25th European Conference on Information Retrieval Research, pp. 393–407, 2003.
[32] M. Li and I. K. Sethi, “Confidence-based active learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 8, pp. 1251–1261, 2006.
[33] Q.-K. Cao, X.-Y. Ren, and K.-D. Liu, “Research on unascertained clusters on the gas emission of the working face,” Journal of the China Coal Society, vol. 31, no. 3, pp. 337–341, 2006.