Music Generation Based on Convolution-LSTM

Yongjie Huang, Xiaofeng Huang, Qiakai Cai


In this paper, we propose a model that combines Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) for music generation. We first convert MIDI-format music file into a musical score matrix, and then establish convolution layers to extract feature of the musical score matrix. Finally, the output of the convolution layers is split in the direction of the time axis and input into the LSTM, so as to achieve the purpose of music generation. The result of the model was verified by comparison of accuracy, time-domain analysis, frequency-domain analysis and human-auditory evaluation. The results show that Convolution-LSTM performs better in music genertaion than LSTM, with more pronounced undulations and clearer melody.

Full Text:



Copyright (c) 2018 Yongjie Huang

License URL:

Computer and Information Science   ISSN 1913-8989 (Print)   ISSN 1913-8997 (Online)  Email:

Copyright © Canadian Center of Science and Education

To make sure that you can receive messages from us, please add the '' domain to your e-mail 'safe list'. If you do not receive e-mail in your 'inbox', check your 'bulk mail' or 'junk mail' folders.