Unsupervised Disentanglement of Pitch and Timbre for Isolated Musical Instrument Sounds.

Timbre; Categorical variable; Leverage (statistics); Autoencoder; Chord (peer-to-peer)
DOI: 10.5281/zenodo.4245532 Publication Date: 2020-10-11
ABSTRACT
Disentangling factors of variation aims to uncover latent variables that underlie the process of data generation. In this paper, we propose a framework that achieves unsupervised pitch and timbre disentanglement for isolated musical instrument sounds without relying on data annotations or pre-trained neural networks. Our framework, based on variational auto-encoders, takes as input a spectral frame, and encodes pitch and timbre as categorical and continuous variables, respectively. The input is then reconstructed by combining those variables. Under an unsupervised training setting, a major challenge is that the encoders are tasked to capture the factors of interest with distinct representations, without access to the corresponding ground-truth labels. We therefore introduce auxiliary tasks and objectives which leverage a pitch shifting strategy to create surrogate labels, thereby encouraging the disentanglement of pitch and timbre. Through an ablation study we analyze the impact of the proposed objectives. The evaluation shows the efficacy of our framework in learning disentangled representations and verifies its applicability to unsupervised classification and conditional synthesis.
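The pitch shifting strategy mentioned in the abstract can be illustrated with a toy sketch. The abstract does not specify the spectral representation or how shifts are applied, so everything below is an assumption made for illustration: the frame is treated as a log-frequency magnitude vector, a pitch shift is approximated by moving its content by a whole number of bins, and the hypothetical helpers `shift_frame` and `make_surrogate_pair` are not taken from the paper. The point is only that the shift amount is known for free, so it can act as a surrogate label for the relative pitch change without any annotation.

```python
import math
import random

def shift_frame(frame, k):
    """Shift a log-frequency magnitude frame by k bins (zero-padded, not circular).

    Assumption: on a log-frequency axis, a pitch shift is roughly a translation.
    """
    n = len(frame)
    out = [0.0] * n
    for i, v in enumerate(frame):
        j = i + k
        if 0 <= j < n:
            out[j] = v
    return out

def make_surrogate_pair(frame, max_shift, rng):
    """Return (shifted_frame, k), where k is the known relative shift in bins.

    k serves as a surrogate label: no ground-truth pitch is needed to know
    how far the shifted frame moved relative to the original.
    """
    k = rng.randint(-max_shift, max_shift)
    return shift_frame(frame, k), k

def argmax(xs):
    """Index of the largest element."""
    return max(range(len(xs)), key=xs.__getitem__)

rng = random.Random(0)
# Toy frame: a single spectral peak centred on bin 10 of a 32-bin frame.
frame = [math.exp(-0.5 * (i - 10) ** 2) for i in range(32)]
shifted, k = make_surrogate_pair(frame, max_shift=5, rng=rng)

# The peak moves by exactly the surrogate label k.
peak_moved_by = argmax(shifted) - argmax(frame)
```

In a training loop of the kind the abstract describes, a pair like `(frame, shifted)` together with `k` could supervise an auxiliary objective on the categorical pitch variable, while the timbre variable is encouraged to stay the same across the pair; how the paper actually formulates these objectives is not stated in the abstract.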