This is a small companion page for a bachelor thesis on kick drum generation (full text).
Here you can listen to some samples from each of the models.
We tried to pick samples representative of the quality that can be expected when running these models on random input data,
but the quality of course varies from sample to sample.
We describe our data augmentation method in section 3.2 of the report.
Our baseline model is the WaveGAN model, trained on the original dataset:
We then trained the same model, but on the augmented dataset to see if 'more' data would improve its output:
In section 4.1.3 we introduce our progressively growing versions of WaveGAN.
The first two below are from the $\alpha$-fading version, and the next two are from the $\eta$-fading version.
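The $\alpha$-fading idea follows the standard progressive-GAN fade-in: when a new, higher-resolution block is added, its output is linearly crossfaded with the upsampled output of the previous block while $\alpha$ ramps from 0 to 1. A minimal sketch of that blend (function names are ours for illustration, not from the thesis; see section 4.1.3 of the report for the actual scheme):

```python
import numpy as np

def fade_blend(old_path, new_path, alpha):
    """Crossfade between the (upsampled) output of the previous
    resolution block and the freshly added block's output.
    alpha=0 uses only the old path, alpha=1 only the new one."""
    return (1.0 - alpha) * old_path + alpha * new_path

def alpha_schedule(step, fade_steps):
    """Linear ramp of alpha from 0 to 1 over `fade_steps` training steps,
    then held at 1 once the transition is complete."""
    return min(1.0, step / fade_steps)

# Illustrative usage: halfway through the fade, both paths contribute equally.
old = np.zeros(4)   # stand-in for the upsampled old block's output
new = np.ones(4)    # stand-in for the new block's output
mix = fade_blend(old, new, alpha_schedule(500, 1000))
```

The linear ramp is only one choice; the $\eta$-fading variant in the report uses a different transition scheme, which we do not reproduce here.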
We compared our Progressive WaveGAN to a more standard PGAN architecture. The first two samples below use the 'mag-if' representation, and the next two use the 'complex' representation.
Now for some failed experiments: the two samples below are from WaveRNN (left) and WaveNet (right).