JISE


Journal of Information Science and Engineering, Vol. 38 No. 3, pp. 517-529

VAE+NN: Interpolation Composition by Direct Estimation of Encoded Vectors Against Linear Sampling of Latent Space

PABLO LÓPEZ DIÉGUEZ AND VON-WUN SOO
Department of Computer Science
National Tsing Hua University
Hsinchu, 300 Taiwan
E-mail: pablo@gapp.nthu.edu.tw; soo@cs.nthu.edu.tw

In this paper, we introduce a machine learning technique that estimates the vector encoded by a Variational Autoencoder (VAE) model without explicitly sampling that vector from the VAE's latent space. We evaluate the feasibility of the approach in the field of music interpolation composition, using the Hsinchu Interpolation MIDI Dataset created for this work. We propose a novel dual architecture, a VAE plus an additional neural network (VAE+NN), that generates a polyphonic harmonic bridge between two given songs, smoothly changing the pitches and dynamics across the interpolation. In terms of reconstruction MSE loss, the interpolations generated by the VAE+NN model outperform a random-data baseline, a bidirectional LSTM model, and the state-of-the-art interpolation approach in automatic music composition (a VAE model with linear sampling of the latent space). A subjective evaluation was also conducted to validate the metric-based results.

Keywords: VAE, variational autoencoders, interpolation, composition, polyphonic music, latent space, encoded vector
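
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of linear sampling between two latent vectors, together with the mean squared error used as the comparison metric. It is an illustration under stated assumptions, not the authors' implementation: the function names linear_latent_path and mse are hypothetical, the latent vectors are plain lists of floats, and the VAE encoder and decoder that would produce and consume these vectors are omitted. The VAE+NN approach described above replaces this linear step with a vector estimated directly by a neural network, which is not shown here.

```python
from typing import List

Vector = List[float]


def linear_latent_path(z_start: Vector, z_end: Vector, steps: int) -> List[Vector]:
    """Return `steps` evenly spaced points on the straight line from
    z_start to z_end, both endpoints included (hypothetical helper)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight, from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


def mse(predicted: Vector, target: Vector) -> float:
    """Mean squared error between two equal-length, non-empty vectors."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(predicted)
```

In a full pipeline, each point on the path would be passed through the VAE decoder to obtain one segment of the musical bridge, and the reconstruction error of each segment would be measured against a reference interpolation.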
