A Minimal Subset of Features Using Feature Selection for Handwritten Digit Recognition

DOI: 10.4236/jilsa.2017.94006, pp. 55-68

Keywords: Digit Recognition, Real Time, Feature Selection, Machine Learning, Classification, MNIST



Many handwritten digit recognition systems are built using the complete set of features in order to enhance accuracy. However, these systems lag in terms of time and memory, two issues that are critical for real-time applications. Therefore, applying Feature Selection (FS) with a suitable machine learning technique helps address the time and memory constraints by minimizing the number of features used to train the model. This paper examines various FS methods with several classification techniques on the MNIST dataset. In addition, models from different algorithm families (linear, non-linear, ensemble, and deep learning) are implemented and compared in order to study their suitability for digit recognition. The objective of this study is to identify a subset of relevant features that provides at least the same accuracy as the complete feature set while reducing the required training time, computational complexity, and storage. The experimental results show that using 60% of the complete feature set reduces training time to roughly one third of that required with all features. Moreover, classifiers trained on the proposed subset achieve the same accuracy as classifiers trained on the complete feature set.
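The approach described above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the paper's exact pipeline: it uses scikit-learn's built-in 8x8 digits dataset rather than full MNIST, a univariate ANOVA F-score filter as the FS method, and logistic regression as the (linear) classifier, keeping the top 60% of pixel features as in the paper's reported subset size.

```python
# Sketch: filter-based feature selection keeping 60% of pixel features,
# then comparing classifier accuracy on the full vs. reduced feature set.
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # 8x8 digit images flattened to 64 features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Baseline: train on the complete feature set.
clf_full = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc_full = accuracy_score(y_te, clf_full.predict(X_te))

# Filter FS: rank features by ANOVA F-score and keep the top 60%.
k = int(0.6 * X.shape[1])
selector = SelectKBest(f_classif, k=k).fit(X_tr, y_tr)
clf_fs = LogisticRegression(max_iter=1000).fit(selector.transform(X_tr), y_tr)
acc_fs = accuracy_score(y_te, clf_fs.predict(selector.transform(X_te)))

print(f"full ({X.shape[1]} features): {acc_full:.3f}  "
      f"reduced ({k} features): {acc_fs:.3f}")
```

Training on the reduced set is proportionally cheaper because model cost scales with the number of input features; swapping in a wrapper FS method or a different classifier family (tree ensemble, neural network) follows the same pattern.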



