Papers Read on AI

October 16, 2022  

GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models

We propose AudioStyleGAN (ASGAN), a new generative adversarial network (GAN) for unconditional speech synthesis. As in the StyleGAN family of image synthesis models, ASGAN maps sampled noise to a disentangled latent vector which is then mapped to a sequence of audio features so that signal aliasing is suppressed at every layer. To successfully train ASGAN, we introduce a number of new techniques, including a modification to adaptive discriminator augmentation to probabilistically skip discriminator updates. ASGAN achieves state-of-the-art results in unconditional speech synthesis on the Google Speech Commands dataset.
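The abstract mentions modifying adaptive discriminator augmentation (ADA) so that discriminator updates are probabilistically skipped. The paper's exact rule isn't given here, but the general idea can be sketched as follows: an overfitting signal in [0, 1] (such as the ADA heuristic that tracks how often the discriminator is "too confident" on real data) sets the probability of skipping the discriminator's update step. The function name and parameters below are illustrative, not the authors' implementation.

```python
import random

def should_update_discriminator(overfit_signal, p_skip_max=0.9, rng=random):
    """Hedged sketch of probabilistic discriminator-update skipping.

    overfit_signal: value in [0, 1]; higher means the discriminator
        appears to be overfitting (e.g. an ADA-style heuristic).
    p_skip_max: cap on the skip probability, so the discriminator is
        never starved of updates entirely (illustrative parameter).
    Returns True when the discriminator should be updated this step.
    """
    # Clamp the signal, then scale it into a skip probability.
    p_skip = min(max(overfit_signal, 0.0), 1.0) * p_skip_max
    # Draw once per training step: skip the update with probability p_skip.
    return rng.random() >= p_skip
```

In a training loop this check would wrap the discriminator's backward pass and optimizer step, while the generator is still updated every iteration; with no overfitting signal the discriminator trains normally, and as the signal rises its updates become increasingly rare.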

2022: Matthew Baas, H. Kamper