Many handwritten digit recognition systems are built on the complete set of
features in order to maximize accuracy. However, these systems are costly in
terms of time and memory, two issues that are critical for real-time
applications. Using Feature Selection (FS) with a suitable machine learning
technique for digit recognition helps address both issues by minimizing the
number of features used to train the model. This paper examines various FS
methods with several classification techniques on the MNIST dataset. In
addition, models based on different types of algorithms (i.e., linear,
non-linear, ensemble, and deep learning) are implemented and compared in order
to study their suitability for digit recognition.
The objective of this study is to identify a subset of relevant features that
provides at least the same accuracy as the complete set of features while
reducing the time, computational complexity, and storage required for digit
recognition. The experimental results show that using 60% of the complete set
of features reduces the training time to as little as one third of the time
required with the complete set. Moreover, the classifiers trained on the
proposed subset achieve the same accuracy as the classifiers trained on the
complete set of features.
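
The idea of training on a reduced feature subset can be illustrated with a minimal sketch. It is not the method evaluated in this paper: the variance-based filter, the nearest-centroid classifier, and the synthetic two-class data (where 8 of 20 features are constant, loosely analogous to always-blank border pixels) are all assumptions chosen so the example runs with the Python standard library alone. Because the dropped features carry no signal by construction, the subset matches the full set here; on MNIST the outcome depends on the FS method and the classifier.

```python
import random
import statistics


def make_data(n_per_class, n_features=20, n_informative=12, seed=0):
    """Synthetic stand-in for image data (hypothetical, not MNIST).

    Only the first n_informative features depend on the class label;
    the remaining ones are constant zeros.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(2.0 * label, 1.0) if j < n_informative else 0.0
                   for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y


def select_by_variance(X, keep_fraction):
    """Filter-style FS: keep the top fraction of features ranked by variance."""
    n_features = len(X[0])
    k = max(1, round(keep_fraction * n_features))
    variances = [statistics.pvariance([row[j] for row in X])
                 for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(X, columns):
    """Keep only the selected feature columns."""
    return [[row[j] for j in columns] for row in X]


def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    hits = sum(predict(centroids, row) == lab for row, lab in zip(X, y))
    return hits / len(y)


X_train, y_train = make_data(100, seed=0)
X_test, y_test = make_data(50, seed=1)

# Keep 60% of the features, mirroring the proportion reported above.
selected = select_by_variance(X_train, keep_fraction=0.6)

acc_full = accuracy(fit_centroids(X_train, y_train), X_test, y_test)
acc_subset = accuracy(
    fit_centroids(project(X_train, selected), y_train),
    project(X_test, selected), y_test)

print(f"kept {len(selected)} of {len(X_train[0])} features")
print(f"accuracy, full set: {acc_full:.3f}")
print(f"accuracy, subset:   {acc_subset:.3f}")
```

The classifier trained on the subset stores and compares fewer values per sample, which is where the time and memory savings come from; measuring those savings on real data would require timing each model on the actual dataset.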