Machine learning is now used in nearly every area of modern life. One of its applications is image classification, which has been adopted in fields such as business, finance, and medicine to improve products and efficiency. The demand for more accurate, fine-grained classification drives the need for modifications, adaptations, and innovations in deep learning algorithms. This article used Convolutional Neural Networks (CNNs) to classify images in the CIFAR-10 database and to detect emotions in the KDEF database. The proposed method converted the data to the wavelet domain to attain greater accuracy than spatial-domain processing at comparable efficiency. Dividing the image data into subbands allowed important features to be learned separately over frequencies ranging from low to high. Combining the learned low- and high-frequency features and processing the fused feature maps improved detection accuracy. Compared with a spatial-domain CNN and a Stacked Denoising Autoencoder (SDA), the proposed methods achieved a substantial increase in accuracy in the experiments.
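The subband decomposition described above can be illustrated with a minimal sketch. It assumes a single-level 2D Haar transform written in plain NumPy; the abstract does not state which wavelet, how many decomposition levels, or which fusion scheme was used, so the Haar filter, the function names haar_dwt2 and split_low_high, and the subband labels are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D Haar transform (illustrative choice of wavelet).

    Returns four half-resolution subbands (LL, LH, HL, HH).
    Labelling of LH/HL differs between texts; here the first letter
    refers to the filter applied along the width, the second along the height.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    s = np.sqrt(2.0)
    # Low-pass (sum) and high-pass (difference) over adjacent column pairs.
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Repeat the same filtering over adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh


def split_low_high(image):
    """Group subbands into a low-frequency and a high-frequency input.

    Hypothetical helper: each group could feed its own CNN branch before
    the learned features are fused.
    """
    ll, lh, hl, hh = haar_dwt2(image)
    low = ll[np.newaxis, :, :]            # shape (1, H/2, W/2)
    high = np.stack([lh, hl, hh], axis=0)  # shape (3, H/2, W/2)
    return low, high


if __name__ == "__main__":
    # A random 32x32 array stands in for one grayscale CIFAR-10-sized image.
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    low, high = split_low_high(img)
    print(low.shape, high.shape)  # (1, 16, 16) (3, 16, 16)
```

With this normalization the transform is orthonormal, so the total energy of the four subbands equals that of the input image; the low-frequency subband carries the coarse content, while the three high-frequency subbands carry edge detail along different orientations.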