Computer Science and Application 2020

Music Classification and Recommendation Method Combining LSTM and AM

DOI: 10.12677/CSA.2020.1012240, PP. 2280-2290

Pengyu Feng (冯鹏宇), Pinghua Chen (陈平华), Jianfang Shen (申建芳)

Abstract:

Music catalogues are now so large that listeners struggle to find music they like: existing music recommendation methods classify music with low accuracy, recognize user emotion only vaguely, and analyze target data with little focus. To address this, the paper proposes a music classification and recommendation method that combines a Long Short-Term Memory (LSTM) neural network with an attention mechanism (Attention Model, AM). The method consists of two parts, a music classification model and a music recommendation model. First, the acoustic features of the audio data are captured and assembled into a sequence of multidimensional features, and the LSTM network together with the attention mechanism classifies the music by emotion. Next, the user's listening history is collected, the ten most recent songs are selected and converted to spectrograms, and a convolutional neural network (CNN) identifies the user's current emotion, which makes recommendation more efficient. In the experiments, the proposed model is compared with other traditional music classification models in several groups of tests. The results show that, compared with models from recent years, the proposed model clearly improves the accuracy of emotion judgment and of user emotion recognition, and that the accuracy of music recommendation improves to some extent.
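The abstract does not give the model's equations, so the following is only a minimal sketch of the two steps it names: pooling a sequence of per-frame vectors with attention, and selecting the ten most recent songs from a listening history. It assumes dot-product scoring against a single query vector, which is one common form of attention and not necessarily the one the authors use. The names `attention_pool`, `most_recent`, and `query` are hypothetical, and the input vectors stand in for LSTM outputs rather than coming from a trained network.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse T vectors of size D into one context vector of size D.

    hidden_states: stand-ins for per-frame LSTM outputs (assumption).
    query: hypothetical attention parameter; in a real model it would be learned.
    Returns (context_vector, attention_weights).
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    # Weighted sum of the hidden states, one output dimension at a time.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(len(query))
    ]
    return context, weights


def most_recent(history: Sequence, k: int = 10) -> list:
    """Return the last k items, assuming history is ordered oldest to newest."""
    return list(history[-k:]) if k > 0 else []


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    context, weights = attention_pool(frames, query=[1.0, 1.0])
    print("attention weights:", weights)
    print("context vector:", context)
    print("recent songs:", most_recent([f"song_{i}" for i in range(25)]))
```

In a full pipeline the context vector would feed a classifier layer that outputs the emotion label, and the selected songs would be turned into spectrograms for the CNN; both of those stages are omitted here.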
