In this paper, we explore the classification of vibration modes generated by handwriting on an optical desk using deep learning architectures. Three deep learning models were evaluated to determine the most effective method for analyzing time-series patterns generated by a Michelson interferometer: a Long Short-Term Memory (LSTM) network with an attention mechanism, the Video Vision Transformer (ViViT), and the Long-term Recurrent Convolutional Network (LRCN). The interferometer was used to detect the vibration modes created by handwriting, and time-series data were captured from the resulting diffraction patterns. Among these models, the LSTM-Attention network achieved the highest validation accuracy, up to 92%, outperforming both ViViT and LRCN. These findings highlight the potential of deep learning in materials science for detecting and classifying vibration patterns. The strong performance of the LSTM-Attention model suggests that it could be applied to similar classification tasks in related fields.
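As an illustration of what an attention mechanism on top of an LSTM typically does, the following minimal sketch pools a sequence of per-time-step hidden states into a single context vector using softmax weights. It is not the model evaluated in this paper: the abstract does not specify the attention formulation, so the dot-product scoring used here, the function name `attention_pool`, and the `score_vector` parameter are all assumptions made for illustration, and the hidden states are toy values rather than real LSTM outputs.

```python
import math

def attention_pool(hidden_states, score_vector):
    """Collapse a sequence of hidden states into one context vector.

    hidden_states: list of T vectors (each a list of D floats), standing in
                   for per-time-step LSTM outputs (assumed, not from the paper).
    score_vector:  list of D floats, a hypothetical learned scoring vector.
    Returns (weights, context).
    """
    # Unnormalized relevance score for each time step (dot product).
    scores = [sum(h_d * v_d for h_d, v_d in zip(h, score_vector))
              for h in hidden_states]
    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Toy usage: three time steps, two hidden units.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform_weights, uniform_context = attention_pool(states, [0.0, 0.0])
peaked_weights, peaked_context = attention_pool(states, [10.0, 0.0])
```

With an all-zero scoring vector every time step receives the same weight, so the context vector is simply the mean of the hidden states. A scoring vector that favors the first hidden unit shifts the weight toward the time steps where that unit is active. In a classifier of this kind, the context vector would then be passed to a dense output layer.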