Coherent Music Composition with Efficient Deep Learning Architectures and Processes

DOI: 10.4236/adr.2023.113015, pp. 189-206

Keywords: Transformer, Attention, Long-Term Structure, Architecture

Abstract:

In recent years, significant advances in music-generating deep learning models have transformed the process of composing harmonious-sounding music. One notable innovation is the Music Transformer, a neural network that generates context and tracks relationships across sequential input. By leveraging transformer-based frameworks designed for sequential tasks and long-range dependencies, the Music Transformer captures self-reference through attention and excels at finding continuations of musical themes during training. This attention-based model offers the advantage of being easily trainable and capable of generating musical performances with long-term structure, as demonstrated by Google Brain's implementation. In this study, I explore various instances and applications of the Music Transformer, highlighting its ability to efficiently generate symbolic musical structures. I also examine another state-of-the-art model, TonicNet, which features a layered architecture combining GRU and self-attention mechanisms. TonicNet is particularly strong at generating music with long-term structure, as evidenced by its superior performance in both objective metrics and subjective evaluations. To further improve TonicNet, I evaluate its performance with the same metrics and propose modifications to its hyperparameters, architecture, and dataset.
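The layered GRU-plus-self-attention design described in the abstract can be illustrated with a short PyTorch sketch. The module below is an assumption-laden illustration of that general pattern, not TonicNet's published configuration: the vocabulary size, hidden width, number of attention heads, dropout rate, and causal masking are all illustrative choices.

import torch
import torch.nn as nn

class GRUSelfAttentionSketch(nn.Module):
    """Illustrative GRU + self-attention stack for symbolic music tokens.
    All hyperparameters here are assumptions for demonstration, not the
    published TonicNet configuration."""

    def __init__(self, vocab_size=128, hidden=256, heads=4, dropout=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads,
                                          dropout=dropout, batch_first=True)
        self.drop = nn.Dropout(dropout)            # regularization as in Srivastava et al. (2014)
        self.out = nn.Linear(hidden, vocab_size)   # next-token logits

    def forward(self, tokens):
        x = self.embed(tokens)                     # (batch, seq, hidden)
        x, _ = self.gru(x)                         # recurrent pass over the sequence
        seq = tokens.size(1)
        # causal mask: each step may attend only to itself and earlier steps
        mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool,
                                     device=tokens.device), diagonal=1)
        attended, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.drop(x + attended)                # residual connection
        return self.out(x)                         # logits over the token vocabulary

# Toy usage: two random token sequences of length 32.
model = GRUSelfAttentionSketch()
logits = model(torch.randint(0, 128, (2, 32)))
print(logits.shape)  # torch.Size([2, 32, 128])

In this sketch the attention sublayer reads the GRU's hidden states, the causal mask keeps generation autoregressive, and dropout follows the regularization approach of Srivastava et al. (2014).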

References

[1]  Bhagat, D., Bhatt, N., & Kosta, Y. (2012). Adaptive Multi-Rate Wideband Speech Codec Based on CELP Algorithm: Architectural Study, Implementation & Performance Analysis. In 2012 International Conference on Communication Systems and Network Technologies (pp. 547-551). IEEE.
https://doi.org/10.1109/CSNT.2012.124
[2]  Briot, J.-P., Hadjeres, G., & Pachet, F.-D. (2019). Deep Learning Techniques for Music Generation—A Survey.
https://doi.org/10.48550/arXiv.1709.01620
[3]  Chu, H., Kim, J., Kim, S., Lim, H., Lee, H., Jin, S., Lee, J., Kim, T., & Ko, S. (2022). An Empirical Study on How People Perceive AI-Generated Music. In Proceedings of the 31st ACM International Conference on Information & Knowledge (pp. 304-314). Association for Computing Machinery.
https://doi.org/10.1145/3511808.3557235
[4]  Dua, M., Yadav, R., Mamgai, D., & Brodiya, S. (2020). An Improved RNN-LSTM Based Novel Approach for Sheet Music Generation. Procedia Computer Science, 171, 465-474.
https://doi.org/10.1016/j.procs.2020.04.049
[5]  Hsu, J.-L., & Chang, S.-J. (2021). Generating Music Transition by Using a Transformer-Based Model. Electronics, 10, Article 2276.
https://doi.org/10.3390/electronics10182276
[6]  Huang, C.-Z. A., Vaswani, A., Uszkoreit, J., Shazeer, N., Hawthorne, C., Dai, A., Hoffman, M. D., & Eck, D. (2019). Music Transformer: Generating Music with Long-Term Structure.
https://magenta.tensorflow.org/music-transformer
[7]  Jagannathan, A., Chandrasekaran, B., Dutta, S., Patil, U. R., & Eirinaki, M. (2022). Original Music Generation Using Recurrent Neural Networks with Self-Attention. In 2022 IEEE International Conference On Artificial Intelligence Testing (AITest) (pp. 56-63). IEEE.
https://doi.org/10.1109/AITest55621.2022.00017
[8]  Ji, S., Luo, J., & Yang, X. (2020). A Comprehensive Survey on Deep Music Generation: Multi-Level Representations, Algorithms, Evaluations, and Future Directions.
https://doi.org/10.48550/arXiv.2011.06801
[9]  Peracha, O. (2019). Improving Polyphonic Music Models with Feature-Rich Encoding.
https://doi.org/10.48550/arXiv.1911.11775
[10]  Shaw, P., Uszkoreit, J., & Vaswani, A. (2018). Self-Attention with Relative Position Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers) (pp. 464-468). Association for Computational Linguistics.
https://doi.org/10.18653/v1/N18-2074
[11]  Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 15, 1929-1958.
[12]  Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need.
https://doi.org/10.48550/arXiv.1706.03762
